Sign in to follow this  

hashing function

This topic is 3591 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

is this a good function for hashing 32 bit numbers - is it reversable? :>

int salt = 123123512;

int hash(int i){
    return wrapShift(i, i&31) ^ wrapShift(salt, i&31) ^ i;
}
	
int wrapShift(int i, int amount){
    return (i<<amount) | i>>>(32-amount);
}

[Edited by - gareth0 on February 16, 2008 4:01:44 AM]

Share this post


Link to post
Share on other sites
You can XOR before shifting, since the two things that you XOR are shifted the same direction/amount:


wrapShift((i ^ salt), i & 31) ^ i;


Is "reversible" one of your criteria for "good"? Normally it's a criteria for "not good", for security reasons. But there are other things that matter quite a bit, depending on what you are doing with the hashed values. So, what are you doing with them?

And why do you feel the need to write your own hash function, anyway?

Share this post


Link to post
Share on other sites
Quote:
Original post by gareth0
is this a good function for hashing 32 bit numbers - is it reversable?

:>

int salt = 123123512;

function hash(int val){
int h = wrapShift(i, i&31) ^ wrapShift(salt, i&31) ^ i;
}

int wrapShift(int i, int amount){
return (i<<amount) | i>>(32-amount);
}
You're passing in val and using an undeclared variable i. I assmue you mean them to be the same thing. Actually it looks like a cross between C and pascal to me.

No it is not reversible. Hashes are not supposed to be reversible anyway as in hashing you generally end up with fewer bits than you started with.
It looks like you're intending to hash a 32-bit number to a 32-bit number. So it's not really hashing you're wanting anyway. You seem to be wanting to implement encryption. This explains why you were wondering if it was reversible.
In any case, it is a relatively poor hash, and a broken encryption algorithm seeing as how it isn't reversible.

Share this post


Link to post
Share on other sites
um some stupid typos there, sorry.

the idea is to hash ip address (32bit ones) for storing visitors to a website - so its not meant to be reversable ;)

and i felt the need to write it myself because i enjoy such crazyness.

thanks for the input :S

Share this post


Link to post
Share on other sites
Okay thanks for explaining. I'm guessing you are using the hash as part of an unordered_map or hash_map? Then you just want to make sure that addresses like 172.16.0.123 don't hash to the same value as other addresses ending in .123 for when the mask-value / number-of-buckets is less than 256, for example. It could become even more important if the bytes order is the other way around!
So, you really are hashing to a smaller number of bits in this case because the hash table will typically take only some subset of bits starting from the low-end, thanks to its masking.

Then what you've done makes sense, and what Zahlman simplified your code to should be fine. Not necessarily ideal, but probably good enough.

It helps greatly to explain what I've just guessed, up front.

Share this post


Link to post
Share on other sites
One property of a good hash function is that it more or less evenly distributes the input values over the output range. The more "uneven" the distribution, the more collisions (which is what you don't want).
Without running any of the more complicated statistical tests, you can rule out a bad hash rather quickly if you calculate an "infinite" (large) number of hashes, count the incidence of every hash value, and make a plot.

I've done this for the lower 16 bits of your hash function (lacking the RAM to hold a 2^32 array of doublewords in memory, and also OpenOffice won't do more than 65535 values anyway) using all numbers from 0 to INT_MAX as input.
Having used every possible input value, every hash value should ideally occur more or less the same (65536 times) without any big "spikes", "clusters", "waves", or other clearly visible patterns.

The graph looks like this:

(the full graph is much larger of course, but it has the same repeating pattern over and over again)

Share this post


Link to post
Share on other sites

This topic is 3591 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this