
password protected files


In my application, a file packer (think WinZip), I'm trying to add encryption. The encryption will require a key (a password) so that each archive's encryption is unique. While I can think of 1000 different ways to encrypt the data, I can't think of any way to hide the password in the packed file without making it easy to recover for people looking to decrypt the data. I need to be able to tell whether the user entered a correct or incorrect password during unpacking, because the data will also be decompressed, and it must be in its unencrypted state to decompress correctly.

To determine whether the password entered by the user is correct, I've thought of the following:

Placing pieces of the password throughout the packed file, for instance one character of the password every X bytes. During unpacking, the user's password would be compared against these bytes.

As an alternative to hiding the password, having a segment of bytes that all share the same value at the beginning of the file. During packing these bytes would be encrypted, and if they don't all turn out to be the same after decryption, the password is invalid.

A third method is to combine the above two.

I know that it is impossible to store the password in the file without it being recoverable by someone else, but the methods I mentioned seem too naive to me. Does anyone have any suggestions or alternatives for hiding a password in the packed file so that I can immediately tell if it matches the one the user enters?

Easy, don't save the password. Use an encryption method that can undo itself by applying the same encryption method to the data, i.e. original_data=Decrypt(encrypted_data,password)=Encrypt(Encrypt(original_data,password),password). Just store a hash of the original data with the file, and use it to verify the decrypted data.

A simple way to do this is to use a hash of the password to seed a customized PRNG, and then simply XOR the output of the PRNG with the data. The result will be encrypted data, and if you then perform the exact same process on the encrypted data, the original data will be returned.
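To make the round trip concrete, here is a minimal sketch of the scheme described above. It is an illustration only: Python's `random.Random` stands in for the "customized PRNG" and is not cryptographically secure, and the function name and sample password are made up for the example.

```python
import hashlib
import random


def xor_with_prng_stream(data: bytes, password: str) -> bytes:
    """Toy cipher: XOR the data with a PRNG stream seeded from the password hash."""
    # Hash the password and use the digest as the PRNG seed.
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    seed = int.from_bytes(digest, "big")
    prng = random.Random(seed)  # Mersenne Twister: NOT suitable for real encryption
    # One keystream byte per data byte, XORed together.
    return bytes(b ^ prng.getrandbits(8) for b in data)


original = b"packed file contents"
encrypted = xor_with_prng_stream(original, "hunter2")
# Applying the very same function again undoes the encryption.
decrypted = xor_with_prng_stream(encrypted, "hunter2")
```

Because XOR is its own inverse, there is no separate decrypt function, and the output is always the same length as the input.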

[Edited by - Mastaba on June 25, 2006 11:14:39 PM]

Mastaba: I semi-understand what you're saying. From what I found out (see the paragraph below), hashing is a good method for encryption, but it seems unreasonable to store a hash for each file in the packed file when I can just store one hash for the password.

Undergamer: Yes, that's quite a long list! I was able to come across this, however, and the article recommends storing a hash of the password in the file, and claims that good hashes are virtually unbreakable by modern-day computers.

Since hashing seems to be what's recommended, I'll research it some more. Thanks to you both!

Storing the hash of the password is as good as storing the password itself... if the user can attach a debugger to your application he can simply replace the hash value of the password he entered with the hash value he got from the file.

EDIT: It seems I misread your original question, so storing a hash might work in your case, because you're using the password for decrypting data. It wouldn't work where you rely on the hash for authentication.

Guest Anonymous Poster
Quote:
Original post by Mastaba
A simple way to do this is to use a hash of the password to seed a customized PRNG, and then simply XOR the output of the PRNG with the data. The result will be encrypted data, and if you then perform the exact same process on the encrypted data, the original data will be returned.


Never mind the fact that an encryption method like that could probably be cracked with a pencil and paper; it's a pretty bad encryption method, especially considering there are tried and true encryption methods you can use for free.

As for the original question: one way of doing it is to store a hash (preferably salted) of the password inside the encrypted block. If you want a fast method, store it as the first item inside. If you don't mind a slower, more secure method, don't store anything at all, except possibly a hash of the unencrypted data (inside the encrypted block itself, of course). If the hash doesn't match the unencrypted data block after attempting to decrypt, you know the password was wrong.
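The "hash of the unencrypted data inside the encrypted block" idea can be sketched roughly like this. The cipher is passed in as a function so that a real algorithm from a vetted library can be plugged in; `placeholder_cipher` below is only a stand-in so the example runs with the standard library, and all names here are hypothetical.

```python
import hashlib
from typing import Callable, Optional

Cipher = Callable[[bytes, bytes], bytes]  # (key, data) -> transformed data


def seal(plaintext: bytes, key: bytes, encrypt: Cipher) -> bytes:
    """Prepend a SHA-256 digest of the plaintext, then encrypt digest + plaintext."""
    digest = hashlib.sha256(plaintext).digest()
    return encrypt(key, digest + plaintext)


def unseal(blob: bytes, key: bytes, decrypt: Cipher) -> Optional[bytes]:
    """Decrypt, then recompute the digest; return None if the key was wrong."""
    data = decrypt(key, blob)
    digest, plaintext = data[:32], data[32:]
    if hashlib.sha256(plaintext).digest() != digest:
        return None
    return plaintext


def placeholder_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for a real cipher (assumption for this demo, not a recommendation)."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


blob = seal(b"file contents", b"right key", placeholder_cipher)
good = unseal(blob, b"right key", placeholder_cipher)
bad = unseal(blob, b"wrong key", placeholder_cipher)
```

Nothing about the password is stored outside the encrypted block; the only cost is that the whole block has to be decrypted before a wrong password is detected.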

Guest Anonymous Poster
[quoteNevermind the fact that an encryption method like that could probably be cracked with a pencil and paper, it's a pretty bad encryption method, especially considering there are tried and true encryption methods you can use for free.[/quote]

Really? With pencil and paper? So this method, which is very similar to RC4, could be cracked by someone without a machine? In that person's lifetime? Would you dare to take a crack at decoding a string of bytes for me with pencil and paper? I'm challenging you, Anonymous Poster. I'll post a string of encrypted bytes using this type of method that 'can be cracked with pencil and paper'. I will gladly even give you a generous 3 month deadline to crack it. So what's it going to be, Anonymous Poster? Are you man enough?

Hehe, I forgot to put my password. So now I can't even edit the quote tag.

Guest Anonymous Poster
Quote:
Original post by Anonymous Poster
Really? With pencil and paper? So this method, which is very similar to RC4, could be cracked by someone without a machine? In that person's lifetime? Would you dare to take a crack at decoding a string of bytes for me with pencil and paper? I'm challenging you, Anonymous Poster. I'll post a string of encrypted bytes using this type of method that 'can be cracked with pencil and paper'. I will gladly even give you a generous 3 month deadline to crack it. So what's it going to be, Anonymous Poster? Are you man enough?


Tell you what, to demonstrate that your method is insecure, I propose the following:

1) Disclose the method you used exactly.
2) Take a jpeg file, encrypt with your method.

After that, I will attempt to crack it. Of course, if I fail, you have to reveal the password to prove you didn't post garbage.

All of the commonly used freely available encryption algorithms (AES, Blowfish, etc.) would easily pass that test, but the method you described would probably not. If the method is like you described, I promise I'll do the crack without using the computer to perform any computations to do it (that is, a "pencil and paper" crack).

How about it?

I don't understand why all this fuss about encryption modes or hashing.

OP: use an irreversible hash function (sha1/md5/other) on the password and store that hash in an unencrypted section of the file. Then use another hash function to create a key, and use that key to encrypt the data with any private-key algorithm you wish. When the user attempts decryption, hash the password and check it against the hash stored in the unencrypted section. If it matches, attempt to use the other hash as a key. If it does not match, or if the other hash is not a valid key, tell the user he's wrong.
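A rough sketch of that two-hash layout, using only the Python standard library. Note the substitution: instead of a bare sha1/md5, this uses salted PBKDF2-HMAC-SHA256, since a slow, salted hash is harder to brute-force. The iteration count, the `b"verify"`/`b"encrypt"` labels, and the function names are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your hardware


def _stretch(password: str, salt: bytes, purpose: bytes) -> bytes:
    # Different purpose labels give unrelated outputs from the same password.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                               salt + purpose, ITERATIONS)


def make_header(password: str) -> dict:
    """Build the unencrypted header: a random salt plus a password verifier."""
    salt = os.urandom(16)
    return {"salt": salt, "verifier": _stretch(password, salt, b"verify")}


def check_password(password: str, header: dict) -> bool:
    """Compare the entered password against the stored verifier."""
    candidate = _stretch(password, header["salt"], b"verify")
    return hmac.compare_digest(candidate, header["verifier"])


def derive_key(password: str, header: dict) -> bytes:
    """Derive the encryption key; it is never written to the file."""
    return _stretch(password, header["salt"], b"encrypt")


header = make_header("correct horse")
if check_password("correct horse", header):
    key = derive_key("correct horse", header)  # hand this to the real cipher
```

The verifier stored in the header and the key used for encryption come from the same password but are different values, so reading the header does not hand over the key.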

Mastaba: using a hash as a seed for a PRNG and generating a number is the same as using a hash, only longer.

The first AP possibly misinterpreted it as using a password-seeded PRNG to generate a random number and then XORing every block with that number, as opposed to using the PRNG to generate an approximation to a one-time pad (i.e. a pseudo-random stream of numbers, of the same length as the message).

But the latter still doesn't seem especially good - if you encrypt two messages with the same password (hashed to get the PRNG seed), they'll have the same pseudo-random stream (so it's a two-time pad), so you just XOR the ciphertexts together to get the XOR of the plaintexts, and then it's relatively easy to recover some of the original text even though you don't know the password. And if you know one plaintext, you can XOR it with the ciphertext to get the output of the PRNG, and thus recover the plaintext of any other message you have that uses the same password.
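Both attacks in the paragraph above are easy to demonstrate. This is a small sketch with a toy password-seeded keystream (Python's `random` module again standing in for the PRNG); the messages and names are invented for the demo.

```python
import hashlib
import random


def keystream(password: str, length: int) -> bytes:
    """Toy keystream: PRNG seeded from the password hash (same stream every time)."""
    seed = int.from_bytes(hashlib.sha256(password.encode("utf-8")).digest(), "big")
    prng = random.Random(seed)
    return bytes(prng.getrandbits(8) for _ in range(length))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"
c1 = xor(p1, keystream("same password", len(p1)))
c2 = xor(p2, keystream("same password", len(p2)))

# Attack 1: the keystream cancels out, leaving the XOR of the two plaintexts.
leaked = xor(c1, c2)

# Attack 2: one known plaintext reveals the keystream, which decrypts the other file.
recovered_stream = xor(c1, p1)
recovered_p2 = xor(c2, recovered_stream)
```

Neither attack needs the password, only two files encrypted with it (plus one known plaintext for the second attack).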

(Those are (I think) easy to protect against if you start by generating a random number, storing that in the output file, and XORing that with the hashed password to get the PRNG seed, so that no two messages will use the same stream of random numbers for the encryption. But that still doesn't prevent somebody who knows a plaintext/ciphertext pair from creating their own message using the same password even if they don't know the password, since they're free to reuse the known message's seed value. And it relies on having a good PRNG, with a long enough period that it'll never get stuck in a cycle (which I'd guess requires a period of at least the square of however much data you're encrypting at once) and presumably other properties that cryptographers are happy with, and then you might as well just use an entire encryption algorithm and high-quality library implementation that cryptographers are already happy with [smile]. (And then store the salted hash of the password so you can tell users when they've typed it in wrong, using a hash function that's sufficiently hard to brute-force.))

Guest Anonymous Poster
Quote:
Original post by Excors
The first AP possibly misinterpreted it as using a password-seeded PRNG to generate a random number and then XORing every block with that number, as opposed to using the PRNG to generate an approximation to a one-time pad (i.e. a pseudo-random stream of numbers, of the same length as the message).


Nope.

Guest Anonymous Poster
This is like the blind leading the blind.

If you're not experienced in cryptography, you have no business trying to invent an encryption algorithm that you expect anybody (including yourself) to actually use.

There are plenty of algorithms (and implementations) available that have been examined by experts and found to work well. Use one of those, which is infinitely more secure than anything you'll think up without experience in the field (even though they require significant assumptions in order to be secure).
One such algorithm is AES (Rijndael).

Also, be aware that simply applying an encryption function to every block of data is NOT a secure way to encrypt your data. Look into block cipher modes of operation for the proper way to apply encryption algorithms.

Here is an implementation of the simple method I described above, along with a jpeg file that has been encrypted using the included executable. If you really think this method is breakable with pencil and paper, then be my guest, have a go at decrypting the image. I expect this method to be crackable within a couple days given significant cpu cycles, much like RC4-40 was (since this has some similarity to it).

Guest Anonymous Poster
Quote:
Original post by Mastaba
Here is an implementation of the simple method I described above, along with a jpeg file that has been encrypted using the included executable. If you really think this method is breakable with pencil and paper, then be my guest, have a go at decrypting the image. I expect this method to be crackable within a couple days given significant cpu cycles, much like RC4-40 was (since this has some similarity to it).


Not good enough. The first part of my offer was "Disclose the method you used exactly" - this means source code. Source code is not included in your release.

The AP's test is a good idea. And I agree your method will be breakable in two seconds (though maybe not necessarily with pencil and paper).

The reason you need to release the source to make this a valid test is because anyone with a copy of your program and olly will be able to reverse engineer it in two seconds. Also, I recall reading a very interesting paper a while back about the 'philosophy' of encryption. Basically, because your source is readily available to anyone with a debugger, it should be considered 'public information'. Your algorithm could be trivially broken by someone with this information, so in reality the source is part of the 'key'(private information) and should be treated as such.

Hope that made some sense.

Quote:
Original post by Mastaba
Here is an implementation of the simple method I described above, along with a jpeg file that has been encrypted using the included executable. If you really think this method is breakable with pencil and paper, then be my guest, have a go at decrypting the image. I expect this method to be crackable within a couple days given significant cpu cycles, much like RC4-40 was (since this has some similarity to it).


You have a bug in your code - the function that calculates the CRC32 of the password appears to take the string length as a parameter, but the length passed is the number of characters (each 2 bytes, since you use Unicode) while the function seems to use it as if it were the number of bytes. Thus, the password hash covers only the first half of the string (with every character followed by a NUL character).
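The effect of that length mix-up can be reproduced in a few lines. This is a reconstruction of the symptom, not the original code: it assumes the password is held as UTF-16LE (2 bytes per character, as Windows wide strings are) and uses `zlib.crc32` in place of whatever CRC32 routine the program uses.

```python
import zlib

password = "correcthorse"
raw = password.encode("utf-16-le")  # 2 bytes per character for this ASCII password

char_count = len(password)  # 12: what the buggy call passes as the length
byte_count = len(raw)       # 24: what the CRC routine actually needs

buggy_hash = zlib.crc32(raw[:char_count])  # covers only the first 6 characters
fixed_hash = zlib.crc32(raw[:byte_count])  # covers the whole password

# Any password sharing the first half produces the same buggy hash.
other = "correcXXXXXX".encode("utf-16-le")
same_key = zlib.crc32(other[:char_count]) == buggy_hash
```

In other words, the second half of the password contributes nothing to the key until the length is multiplied by the character width.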

[Edited by - Extrarius on June 27, 2006 12:20:02 PM]

Quote:
Original post by Extrarius
You have a bug in your code - the function that calculates the CRC32 of the password appears to take the string length as a parameter, but the length passed is the number of characters (each 2 bytes since you use unicode).


Good catch! I forgot to multiply the character count by the width of a character. Fortunately, I used a nice long password on the sample document.

Though it also highlights some other things I would do now, if I actually took the time to make it more secure. One thing is that I would convert the password to UTF-8 and then calculate the hash. Another is that, instead of using a 32-bit CRC, I would use a 160-bit hash, which would allow me to seed the RNG without doing those dodgy bit manipulations on the 32-bit hash to create the 160-bit seed. I would also, of course, store in the encrypted document a sub-document, a small consistent key (not necessarily constant!) that can quickly be decrypted to see whether the rest of the document should be decrypted or a password error returned. I would also use a RNG that was more obscure than the one I chose to use. I would also obviously use methods to infuriate those with debuggers. [wink]

This executable as it is now does not add anything to the document, not even a hash of the password. So the unencrypted document is exactly the same size as the encrypted document. As such, this executable does not do any checking on the password you give to see if it is correct. Heck, this executable does not know the difference between encrypting and decrypting. It is a purely symmetric algorithm. It is up to the person at the keyboard to determine if the unencrypted document truly is unencrypted.

[Edited by - Mastaba on June 27, 2006 4:09:53 PM]

dsfe_decrypt will decrypt any file encrypted with Mastaba's "dsfe" program as long as you can guess the first few bytes (at least 4) of the file. The more bytes you provide, the fewer results you have to look through, but the longer it takes to find the key. The key isn't the password, but is enough to successfully decrypt the file, and a password that would hash to a given key could be computed in under 4 hours according to a few random google results. I did make a function to turn a hash into a password, but I removed the code after it ran for several minutes without success - the key is enough to decrypt files anyways.

To demonstrate the functionality, I've provided a 'template' file (the bytes you provide, whose length must be a multiple of 4) for the jpeg image format, along with both the encrypted and decrypted jpeg.

It took ~57 seconds on my Athlon64 X2 3800+ to find the key for the jpeg and decrypt the file using it. This was based on 12 bytes of guessed data - the first 12 bytes of the jpeg header, which are pretty much constant. The program only uses a single thread, and it could be sped up tremendously by making it multithreaded, since each iteration of the main loop is entirely independent of all others (aside from storing possible keys in a list, which happens fairly rarely even with only 4 bytes of provided data).

Note that this is a brute force attack of the "known plaintext" variety. I'm sure there are more complicated attacks that would be far, far faster, but I'm not an experienced cryptographer and it's not worth becoming one solely to demonstrate that custom algorithms are, as a rule, broken =-)
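For anyone curious what a known-plaintext brute force looks like in outline, here is a scaled-down sketch. It is not dsfe_decrypt and not the dsfe algorithm: the toy cipher, the 16-bit key space (chosen so the loop finishes quickly; the real search covered the 32-bit CRC space), and all names are assumptions for the demo. The 12 known bytes are the usual start of a JFIF-style jpeg file.

```python
import random

# Typical first 12 bytes of a JFIF jpeg: SOI, APP0 marker, length, "JFIF\0", version byte.
JPEG_PREFIX = bytes.fromhex("ffd8ffe000104a4649460001")


def toy_encrypt(data: bytes, key: int) -> bytes:
    """Toy XOR cipher keyed by a small integer (stand-in for a hash-seeded PRNG)."""
    prng = random.Random(key)
    return bytes(b ^ prng.getrandbits(8) for b in data)


def find_keys(ciphertext: bytes, known_prefix: bytes, key_bits: int = 16) -> list:
    """Try every key; keep those that decrypt the start of the file to the known bytes."""
    n = len(known_prefix)
    head = ciphertext[:n]
    return [k for k in range(1 << key_bits) if toy_encrypt(head, k) == known_prefix]


secret_key = 0xBEEF
plaintext = JPEG_PREFIX + b"...rest of the image data..."
ciphertext = toy_encrypt(plaintext, secret_key)

candidates = find_keys(ciphertext, JPEG_PREFIX)
```

Each key is tested independently of the others, which is why the search splits cleanly across threads, and a longer known prefix weeds out more false candidates.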

Some of the changes you mention would defeat this attack, but I'm fairly certain there would be other holes either in the algorithm or the implementation even then (unless you're quite knowledgeable about cryptography).

Guest Anonymous Poster
Quote:
Original post by Extrarius
dsfe_decrypt will decrypt any file encrypted with Mastaba's "dsfe" program as long as you can guess the first few bytes (at least 4) of the file. The more bytes you provide, the fewer results you have to look through, but the longer it takes to find the key. The key isn't the password, but is enough to successfully decrypt the file, and a password that would hash to a given key could be computed in under 4 hours according to a few random google results. I did make a function to turn a hash into a password, but I removed the code after it ran for several minutes without success - the key is enough to decrypt files anyways.

To demonstrate the functionality, I've provided a 'template' file (the bytes you provide, whose length must be a multiple of 4) for the jpeg image format, along with both the encrypted and decrypted jpeg.

It took ~57 seconds on my Athlon64 X2 3800+ to find the key for the jpeg and decrypt the file using it. This was based on 12 bytes of guessed data - the first 12 bytes of the jpeg header, which are pretty much constant. The program only uses a single thread, and it could be sped up tremendously by making it multithreaded, since each iteration of the main loop is entirely independent of all others (aside from storing possible keys in a list, which happens fairly rarely even with only 4 bytes of provided data).

Note that this is a brute force attack of the "known plaintext" variety. I'm sure there are more complicated attacks that would be far, far faster, but I'm not an experienced cryptographer and it's not worth becoming one solely to demonstrate that custom algorithms are, as a rule, broken =-)

Some of the changes you mention would defeat this attack, but I'm fairly certain there would be other holes either in the algorithm or the implementation even then (unless you're quite knowledgeable about cryptography).


You broke that pretty fast. I hope he sent you the source so you didn't have to waste your time reverse engineering the binary.

Guest Anonymous Poster
Quote:
Original post by Mastaba
I would also use a RNG that was more obscure than the one I chose to use. I would also obviously use methods to infuriate those with debuggers. [wink]


The "to infuriate those with debuggers" shows that you still don't get it. A good encryption algorithm works whether the attacker has the source code or not. If it depends on the algorithm being secret, it's a garbage algorithm.

Send the source code? Lord no, the people that broke RC4 didn't have the source code handed to them. Why should I hand that over? Moreover Mr. Anonymous Poster, he didn't do it by pencil and paper. And I already highlighted above some of the things I would do if I was serious about making it stronger. This attack could easily be thwarted by appending an encrypted variable length key to the start of the document, so that the start couldn't be as easily found.

I hope you enjoyed the cookie Extrarius. [grin]

Quote:
Original post by Anonymous Poster
The "to infuriate those with debuggers" shows that you still don't get it. A good encryption algorithm works whether the attacker has the source code or not. If it depends on the algorithm being secret, it's a garbage algorithm.


No, I do get it. Should I make the hacker's job as easy as possible? Also, that was more of a joke. You do see the wink by it, yes?

Guest Anonymous Poster
Quote:
Send the source code? Lord no, the people that broke RC4 didn't have the source code handed to them.


What? Of course they did! The algorithm is publicly documented and there's even a reference implementation. Sheesh.

-- John
