Sign in to follow this  
Vanderry

[C++] Pair as hash_map key

Recommended Posts

Hey guys !

I have this problem where I want to sort the edges in a 3D mesh according to their vertex pair or bordering face indices, if that makes any sense. The point is that I iterate through the mesh one face at a time and establish their 3 edges while doing so. This does of course mean that many edges would have at least one duplicate, unless I'm able to quickly look up if that edge already has been established.

For a moment I thought I could use a stdext::hash_map with a std::pair as the key value, but it seems that was trickier than I anticipated. I got an error:

Error 8 error C2440: 'type cast' : cannot convert from 'const std::pair<_Ty1,_Ty2>' to 'size_t'

And I solved it temporarily by implementing the following hash function:

struct hash_dword_pair : public hash_compare< pair< DWORD, DWORD > >
{
const size_t operator() ( const pair< DWORD, DWORD > &p ) const
{
return p.first;
}
bool operator() ( const pair< DWORD, DWORD > &a, const pair< DWORD, DWORD > &b) const
{
return a.first < b.first && a.second < b.second;
}
};


I'm not sure I did it right, and it seems like some edges aren't properly stored or retrieved from the hash_map. So I guess my real question is if there is any other container better suited for my problem or if I'm just doing it wrong.

Ps: I'm using pretty low-res meshes, and only sorting faces that point upward, so it's no big performance issue.

Many thanks for any help !

- Dave

Share this post


Link to post
Share on other sites
If you want to reduce hash collisions you should probably use both values in the hash value. Also, your comparison function probably doesn't do what you want. If a.first and b.first are equal return a.second < b.second, but if a.first and b.first aren't equal return a.first < b.first.

Share this post


Link to post
Share on other sites
Quote:
Original post by no such user
If you want to reduce hash collisions you should probably use both values in the hash value.


The hash map is structured like this:

stdext::hash_map< pair< DWORD, DWORD >, CEdge*, hash_dword_pair >

If that's what you mean by using both values as hash value. Or do you mean that I should modify the hash_dword_pair struct somehow? I'm not 100% sure how to write my own hash functions like this so could you give me a little more specific hint please? :)

Quote:
Original post by no such user
Also, your comparison function probably doesn't do what you want. If a.first and b.first are equal return a.second < b.second, but if a.first and b.first aren't equal return a.first < b.first.


Thank you, that clearly made a positive difference in the map.

Share this post


Link to post
Share on other sites
The operator() function of your hash_compare class is what returns the hash value of your pair. Currently you only return the value of the first value. This means that (1,1) hashes to the same value as (1,2) and (1,3) and so on. This will probably lead to a lot of collisions unless you know that the first value of your pairs are unique. Since that's generally not the case you should write a function that combines the two values in your pair. How to best combine these two values to a single hash depends on what distribution you expect these numbers to have. If you expect them to always be small positive integers then (first << 16 + second) could be a good hash. If you expect the first to be usually in the range -100 to 100 and the second in the range 6000 to 7000 then just adding them together would probably work out.

Share this post


Link to post
Share on other sites
Okay, I suspected that it was a bit to easy to just expect the hash_map class to figure out how to hash pairs. I have been wrapping my mind about how to form unique key values from pairs. Just to keep the options open, I expect to need the full capacity of a DWORD. Is it possible to combine them into a greater type of integer or do I have to stay within the DWORD range to achieve this? Thanks so much for taking the time to explain :)

Share this post


Link to post
Share on other sites
As you may have noticed, the return value for the hash_compare function is a size_t. This is 32 bits for a 32 bit application and 64 bits for a 64 bit application. Since DWORD is 32 bits, for a 64 bit application you can just combine the two into a single value with no loss of information. For a 32 bit application you need to combine them somehow. If you genuinely have independent evenly distributed numbers across the entire DWORD range then you can just xor the two values together. However, situations where that is actually true are pretty rare in practice. For the case when the two numbers are somewhat correlated with similar distributions (first ^ _rotl(second, 16)) is a fairly good relatively cheap hash function.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this