Jump to content
  • Advertisement
Sign in to follow this  
irreversible

Vertex hashing

This topic is 1010 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I've been doing some digging on this and the following piece of code seems to be the baseline off of which everyone is building their hash:

const unsigned int * h = (const unsigned int *)(&v);
unsigned int f = (h[0] + h[1]*11 - (h[2]*17))&0x7fffffff;     // avoid problems with +-0
return (f>>22)^(f>>12)^(f);
In short, it doesn't work and frankly I can't quite understand how it's supposed to work.

I mean, it is almost 5 AM, so I might be missing something grossly obvious, but even if the vertices are normalized, f can overflow (which would be fine, but has nothing to do with the sign bit). Which is to say that I don't even know how anding the sign bit of the hash would somehow filter out only +/- zero.

Here's my naive test that explicitly handles +/- zero. I've removed the bit mangling on the last line and changed one of the primes.
__int64 vtxhash(IN const math::vec3& v)
{
    const unsigned int* h = (const unsigned int*)(&v);
    __int64 h0 = ((h[0] & 0x7fffffff) == 0) ? 0 : h[0];
    __int64 h1 = ((h[1] & 0x7fffffff) == 0) ? 0 : h[1];
    __int64 h2 = ((h[2] & 0x7fffffff) == 0) ? 0 : h[2];
    return f = (h0 + h1 * 23 - (h2 * 17));
}
Anyone dealt with this before?

The reason I'm asking is mostly for peer review and because I'd like to make heads and tails of the reference snippet.

For reference - I'm not really dealing with millions of vertices, although no collisions would be a nice thing to have.

Share this post


Link to post
Share on other sites
Advertisement

I mean, it is almost 5 AM, so I might be missing something grossly obvious, but even if the vertices are normalized, f can overflow (which would be fine, but has nothing to do with the sign bit). Which is to say that I don't even know how anding the sign bit of the hash would somehow filter out only +/- zero.

I don't think you are missing anything. It doesn't seem to work...
#include <iostream>

int hash(float v[3]) {
    const unsigned int * h = (const unsigned int *)(&v);
    unsigned int f = (h[0] + h[1]*11 - (h[2]*17))&0x7fffffff;     // avoid problems with +-0
    return (f>>22)^(f>>12)^(f);
}

int main() {
    float v1[] = {-0, 0, -0};
    float v2[] = {0, -0, -0};
    std::cout << hash(v1) << std::endl;
    std::cout << hash(v2) << std::endl;
}
Outputs 2004002401 and 2004002389, kind of disproving the +/- 0 ignoring thing.

Share this post


Link to post
Share on other sites

It seems to work fine for me when I pass in a pointer to an array of 3 floats.

 

It doesnt work if I make the function take an array (it just produces undeterministic garbage), but I dont at the moment see why (Is it taking the address of a pointer? Never really use arrays this way ph34r.png )

 

edit: yup, the "&" needs to go...

Edited by Waterlimon

Share this post


Link to post
Share on other sites

edit: yup, the "&" needs to go...

Oh goddamn it. Yes, don't know how I missed that.

So, thinking about this more objectively, neither multiplying by a constant or adding is going to affect the bit pattern of -0.0 at all (unsigned integer wrapping will cause the same bit to be set after multiplication). This should mean that its contribution can only be to the value of that exact same bit, and the bit ensures that it has no effect on the final value.

Share this post


Link to post
Share on other sites

This should mean that its contribution can only be to the value of that exact same bit, and the bit ensures that it has no effect on the final value.

and it does, I think.

 

-0.0 == 0x80000000 as bit pattern. Also, sizeof(unsigned int) == 4 and sizeof(float) == 4. Then

unsigned int f = (h[0] + h[1]*11 - (h[2]*17)) & 0x7fffffff; // avoid problems with +-0

is equivalent to

unsigned int f = (h[0] + h[1] + h[1]*2 + h[1]*8 - h[2] - h[2]*16) & 0x7fffffff;

As you can see all 3 values get added/subtracted into f without multiplication factor one time. (With multiplication factor, bit 31 gets shifted out the 32 bit integer, and has no influence, as swiftcoder already said.) The highest bits of h[0], h[1], and h[2] thus end up in the unmasked term (eg inputs (-0.0, +0.0, +0.0), (+0.0, -0.0, +0.0), or (+0.0, +0.0, -0.0)). Masking the high bit solves that. However, it looks to me that you are effectively mapping a large (or the entire?) part of the negative space onto the positive space with this masking. Hashing results might be better if you only explicitly take out the -0.0 values instead.

Share this post


Link to post
Share on other sites


... it looks to me that you are effectively mapping a large (or the entire?) part of the negative space onto the positive space with this masking

 

Indeed, this is what I'm getting from it. That effectively makes the algorithm moot unless you temporarily translate your entire data set into the positive range. Which might not be the worst idea if you're not limited by precision. I still feel like my take on it is more robust and hardly any more expensive.

 

The bit mangling on the last line flies past me as well - I'm not really sure how it enhances the stability of the hash. I feel more confident using larger primes and mapping the hash to a 64 bit range instead.

Share this post


Link to post
Share on other sites
It all depends on what you need out of a hash function. Even with positive and negative numbers colliding, it's most likely good enough™ for most purposes.


Also, a 64 bit hash for a 96 bit vector seems a bit extreme for most uses.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!