Using SHA-1 for small codes

Imagine I have a small 5-byte code which I want to hash, so that Mr. X wouldn't be able to find out what the code actually was, even if X had the hash. Recovering a 5-byte code by brute force (hashing every candidate and comparing the results) is simply too hard, so I'm not worried about that. However, is the SHA-1 function still safe for such small codes? I know that it's quite a detailed question, but maybe someone around here knows :D I've been pleasantly surprised before at Gamedev :) Bas

You have 5 bytes. The resulting hash is going to be 20 bytes. In practice there won't be any information loss: every possible 5-byte value will almost certainly have a unique 20-byte hash, because for values to be forced to share a hash, the hash would need to be smaller than the original data. Getting the information back, however, will require brute force.

Brute-forcing the original 5-byte value will require roughly 500 billion attempts on average (about half of the 2^40, or about 1.1 trillion, possible values), which isn't very much.
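
To make the arithmetic concrete, here is a minimal sketch of that brute-force search in Python. The function name brute_force_sha1 and the 2-byte demo code are assumptions chosen for illustration (a 2-byte space is searched instantly, whereas a 5-byte space has 2^40 candidates); nothing here comes from the original poster's system.

```python
import hashlib
from itertools import product
from typing import Optional


def brute_force_sha1(target_digest: bytes, length: int) -> Optional[bytes]:
    """Try every byte string of the given length until one hashes to target_digest."""
    for candidate in product(range(256), repeat=length):
        data = bytes(candidate)
        if hashlib.sha1(data).digest() == target_digest:
            return data
    return None


# A 2-byte code keeps the demo fast: only 256**2 = 65,536 candidates.
secret = b"\x12\xab"
recovered = brute_force_sha1(hashlib.sha1(secret).digest(), len(secret))

# Search-space sizes for the code lengths discussed in this thread.
space_5_bytes = 256 ** 5    # 2**40 candidates
space_10_bytes = 256 ** 10  # 2**80 candidates
average_attempts_5_bytes = space_5_bytes // 2  # expected work to find one code
```

The same loop works for any length; only the running time changes, which is why the size of the code matters far more than the choice of hash function.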

Guest Anonymous Poster
Quote:
Original post by smart_idiot
You have 5 bytes. The resulting hash is going to be 20 bytes. In practice there won't be any information loss: every possible 5-byte value will almost certainly have a unique 20-byte hash, because for values to be forced to share a hash, the hash would need to be smaller than the original data. Getting the information back, however, will require brute force.

Brute-forcing the original 5-byte value will require roughly 500 billion attempts on average (about half of the 2^40, or about 1.1 trillion, possible values), which isn't very much.


Which is why you would want to pad the value with noise.

Thanks smart_idiot.

After your remark about the 500 billion attempts I have decided to increase the number of bytes in the code to 10.

Quote:
which is why you would want to pad the value with noise.


But then it would be impossible to verify the hash code.

I think he's referring to a salt, as is done when hashing passwords. I guess it depends on what is being hashed and why we're hashing it. It probably wouldn't work, because the salt will probably be kept with the hash, and so we would know what the noise was.

Are you talking about using a hash of registration information to form some kind of registration key? Knowing the source of the X bytes and the intended use for both those bytes and the hashed value would help us know whether it's good enough or not. If not, we would then be able to suggest alternatives that fit the situation.

Quote:
Original post by smart_idiot
I think he's referring to a salt, as is done when hashing passwords. I guess it depends on what is being hashed and why we're hashing it. It probably wouldn't work, because the salt will probably be kept with the hash, and so we would know what the noise was.
I'm sorry, I don't get it. Wouldn't adding some random data after the 5 bytes do exactly what he wants? He wouldn't store the random data as well as the hash, surely?
He could add some other redundancy check data to the data to be hashed as well, if required.

It's still a bit crazy to generate a hash that is bigger than the data it covers, though.

iMalc: If you append random data to the X bytes, you'll change the hash value. Unless you also store/transmit the random data, you'll have no way to verify that the hash is correct. If you do store/transmit the random data, the purpose of adding the random data has been eliminated.

One possibility would be to use some kind of signature algorithm instead of hashing: the X bytes of data are used as an encryption key to encrypt data with a specific property (such as a pseudorandom sequence that has a particular weak hash value). If the server knows the X bytes of data, it could decrypt the pseudorandom sequence and ensure that the X bytes of data have that particular property.
The above is basically an adaptation of the HashCash idea.

Another possibility that would be suitable for cases where only one end has the 'magic bytes' would be to use some kind of public/private key encryption. Using only 128 bits (16 bytes) would allow a relatively strong key, and the public key could be given to anybody. That person could verify that another person has the private key by encrypting random data, having the other person decrypt it and send back a hash. Since the data is pseudorandom, only the person with the correct private key could calculate the proper hash of the decrypted data.

Guest Anonymous Poster
Quote:
Original post by Extrarius
iMalc: If you append random data to the X bytes, you'll change the hash value. Unless you also store/transmit the random data, you'll have no way to verify that the hash is correct. If you do store/transmit the random data, the purpose of adding the random data has been eliminated.


No, the purpose of salting is to prevent dictionary attacks. Someone might, for example, have generated hashes for all words in the dictionary and stored them in a database for future use. Even adding the salt "HELLO THIS IS MY SALT SO SUCK IT HACKER" to every piece of data before hashing (and of course when verifying) would completely break the dictionary attack - who would have a pre-generated hash dictionary salted with exactly that string?

Now, let's have a real world example of where using a salt, any salt, would have helped.

Step 1:

$ echo -n trustno1 | md5sum
5fcfd41e547a12215b173ff47fdd3739

Step 2:
http://www.google.com/search?client=opera&rls=en&q=5fcfd41e547a12215b173ff47fdd3739&sourceid=opera&ie=utf-8&oe=utf-8

Are we convinced yet?
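
A rough Python sketch of the same point: a precomputed table of unsalted hashes finds the password by lookup, while the salted hash is simply absent from that table. The word list, the SALT value and the helper md5_hex are made-up illustrations, not anything taken from a real attack database.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


# Hypothetical word list standing in for an attacker's precomputed dictionary.
WORDS = [b"password", b"letmein", b"trustno1", b"qwerty"]
precomputed = {md5_hex(word): word for word in WORDS}

# Illustrative fixed salt, prepended before hashing and again when verifying.
SALT = b"example fixed salt"

unsalted_hash = md5_hex(b"trustno1")
salted_hash = md5_hex(SALT + b"trustno1")

# The unsalted hash is found by a plain lookup; the salted one is not in the table.
found_unsalted = precomputed.get(unsalted_hash)
found_salted = precomputed.get(salted_hash)
```

An attacker who learns the salt can still rebuild the table with the salt included, so this only defeats tables that were generated in advance without it.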

You guys are speaking slightly different languages here.

Quote:
Original post by Anonymous Poster
Even adding the salt "HELLO THIS IS MY SALT SO SUCK IT HACKER" to every piece of data before hashing (and of course when verifying) would completely break the dictionary attack


(emphasis mine) This is not the same thing as random noise. You have described adding a fixed salt. This would help defeat snooping if the ends knew the fixed salt and the snooper did not. However, if the ends did not both know the salt, they would end up hashing differently. It's this latter issue that other people are talking about.

Quote:
Original post by Anonymous Poster
[...]completely break the dictionary attack - who would have a pre-generated hash dictionary salted with exactly that string?[...]
Who would need to? My trusty disassembler will show me the exact string you use, and I can even place a breakpoint and let your code calculate the value for me.
Your system is somewhat useful if both ends are trusted, but if anybody else ever gets the executable (or data file with the fixed salt), it's broken. If one end isn't trusted, as is generally the case with client-server systems, it doesn't work at all.

Guest Anonymous Poster
The best option is of course using a random salt and storing/transmitting that along with the hash, if that's an option. The worst option is using no salt at all, because you can then use a generic hash dictionary to attack the system.

Your "trusty disassembler" will show the fixed salt, yes. You still have to generate a specific dictionary for that app before you can initiate a dictionary attack. This is of course not optimal, but it's better than no salt at all.
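
A minimal sketch of the "random salt stored with the hash" option, in Python. The names make_record and verify and the 16-byte salt length are assumptions for illustration. Note that this only stops reuse of a generic precomputed dictionary; once the salt is known, a small code space can still be brute-forced.

```python
import hashlib
import hmac
import os
from typing import Tuple


def make_record(code: bytes) -> Tuple[bytes, bytes]:
    """Pick a fresh random salt and return (salt, SHA-1 of salt + code)."""
    salt = os.urandom(16)  # 16 bytes is an arbitrary illustrative length
    return salt, hashlib.sha1(salt + code).digest()


def verify(code: bytes, salt: bytes, digest: bytes) -> bool:
    """Recompute the salted hash and compare it with the stored digest."""
    candidate = hashlib.sha1(salt + code).digest()
    return hmac.compare_digest(candidate, digest)


# The salt travels with the hash, so either end can verify a candidate code.
salt, digest = make_record(b"hello")
ok = verify(b"hello", salt, digest)
bad = verify(b"hellp", salt, digest)
```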

If you're storing hard-coded data into an executable AND storing data on the server (the hash to verify against), why not just store a unique identifier inside the client and have a database that maps from the identifier to the value? In order for a client to get the information, he/she would have to gain access to the server beyond what is intentionally allowed. This technique is as secure as it gets for associating static data with a client without requiring the user to do special things to use your program (such as giving you a hardware hash to get an 'activation code' that works only for that exact hardware).
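
As a rough Python sketch of that idea: the client holds only an opaque random identifier, and the value lives in a server-side table. The class SecretStore and its methods are hypothetical names, and an in-memory dict stands in for whatever database the server would really use.

```python
import secrets
from typing import Dict, Optional


class SecretStore:
    """Server-side table mapping opaque client identifiers to secret values."""

    def __init__(self) -> None:
        self._values: Dict[str, bytes] = {}

    def register(self, value: bytes) -> str:
        """Store value and return a random identifier to embed in the client."""
        client_id = secrets.token_hex(16)  # 32 hex characters, unrelated to value
        self._values[client_id] = value
        return client_id

    def lookup(self, client_id: str) -> Optional[bytes]:
        """Return the stored value, or None for an unknown identifier."""
        return self._values.get(client_id)


store = SecretStore()
issued_id = store.register(b"secret code")
```

Because the identifier is random rather than derived from the value, there is nothing for the client to brute-force; the value is only reachable through the server.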

You could also use such a technique for dynamic data, and it is the way that 'sessions' work in PHP by default. In a stateless protocol such as HTTP, you need to take certain precautions such as storing a timestamp and connection information (ip, etc) to ensure identities are not 'stolen'.

Of course, we still don't have any idea what this would be used for, so we can't know what the proper solution for this situation is. I was operating under the idea that the data would be dynamic and not hard-coded, so the client would have to know how to deal with the information.

This is becoming an interesting discussion! Thanks for lots of helpful information.

I'm in the process of developing a solution to the problem of trackers exchanging copyrighted data in the BitTorrent system. The tracker has been replaced with a single 'Central Node', which connects peers to each other in a certain way. Peers then exchange the data that a tracker would normally exchange.

Torrent IDs are 10-byte strings: the two halves of the .torrent 'info-hash' XOR-ed with each other. Now, peers need to ask the Central Node for a 'TR' (Tracker Request) message, in which they say 'Hey other peers, I'm downloading this torrent'. They can then bring this TR message into the network. The message needs permission from the Central Node to prevent irrational peers from flooding the network with bollox data. On the other hand, the Central Node should be completely unaware of which torrent the peer is requesting information for; hence it shouldn't be able to get the torrent ID from the information that has been sent.

What I have now is the following:
Peer requests Central Node for TR message.

1. Peer chooses a random 10-bit XOR_BLOCK
2. Peer calculates residue code:
   R[i] = f[i](XOR_BLOCK)
   residue = HASH(R)[0..79] XOR HASH(R)[80..159]
3. Peer sends residue to Central Node
4. If Central Node gives permission, it adds some information and signs the message
5. Peer receives message from CN
6. Peer brings it into the network along with the XOR-Block and Torrent ID

Peers can validate the {XOR-Block, Torrent ID, Residue} by simply recalculating the Residue and comparing it to the one that is signed in the message.
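
The folding and validation steps can be sketched in Python as below. This is only an illustration under loud assumptions: HASH is taken to be SHA-1, the XOR block is taken to be 10 bytes (the steps above say 10-bit), and f is a PLACEHOLDER that just concatenates the torrent ID and the XOR block, because the real f is not given in this post. All function names are hypothetical.

```python
import hashlib


def fold_160_to_80(digest: bytes) -> bytes:
    """XOR the first 80 bits of a 160-bit digest with its last 80 bits."""
    if len(digest) != 20:
        raise ValueError("expected a 20-byte (160-bit) digest")
    return bytes(a ^ b for a, b in zip(digest[:10], digest[10:]))


def torrent_id(info_hash: bytes) -> bytes:
    """10-byte torrent ID: the two halves of the 20-byte info-hash XOR-ed together."""
    return fold_160_to_80(info_hash)


def f(tid: bytes, xor_block: bytes) -> bytes:
    """PLACEHOLDER for the undisclosed function f: plain concatenation."""
    return tid + xor_block


def residue(tid: bytes, xor_block: bytes) -> bytes:
    """residue = HASH(R)[0..79] XOR HASH(R)[80..159], with HASH assumed to be SHA-1."""
    return fold_160_to_80(hashlib.sha1(f(tid, xor_block)).digest())


def validate(tid: bytes, xor_block: bytes, signed_residue: bytes) -> bool:
    """What a receiving peer does: recompute the residue and compare."""
    return residue(tid, xor_block) == signed_residue


# Example with made-up inputs; a real peer would pick xor_block at random.
example_tid = torrent_id(hashlib.sha1(b"example info dictionary").digest())
example_block = bytes(range(10))
example_residue = residue(example_tid, example_block)
```

The Central Node would sign example_residue without seeing example_tid or example_block; how hard it is to recover the torrent ID from the residue depends entirely on the real f, which this sketch does not model.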

Peers may change the XOR-Block and the Torrent ID, but it can be proven that the function f ensures that the chance of another {Torrent ID, XOR-Block} pair matching the residue signed by the Central Node is incredibly small.

The Central Node has a set of possible Torrent IDs, but it only has the Residue. Without function f, someone could simply try to hash all torrent IDs and see if one of them matches the residue. But solving function f from a hashed residue is presumed to be too hard. Unfortunately, though, I haven't been able to prove it.

I don't understand what you're saying at all.

Do you mean 10-byte "XOR_BLOCK"?
Does "chooses a random <length> VARIABLE" mean to generate a random-as-possible bitstring of the specified length?
What does the syntax X[y] indicate?
What does the syntax X[y](z) indicate?
How is the residue used if it's essentially just random data?
If it isn't, then in what way is it not?
Couldn't the server just pretend to be a peer (in addition to serving) and get all the information you're trying to keep from it?

Finally, if my deciphering is correct:
You want to be able to create an ID from a file that clients can use to uniquely identify a file. You want to be able to sign the ID without the signer knowing the ID and without those that know the ID being able to create the signature. Those that have the ID should be able to verify that a message was signed correctly and has not been altered since being signed.
Is that correct?
