Sign in to follow this  

Hashtable vs. AVL Tree?

This topic is 1119 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I have an example of a program that is outperforming mine that does a very similar task, and it seems it's doing that because of using an "AVL Tree" (he calls it Hash Tree) and that seems like it is beating my text book hash table implementation.

 

https://en.wikipedia.org/wiki/AVL_tree

 

Does any one have experience as to why it would be happening? Aren't hash tables the fastest way to retrieve items based on a hash?

 

In fact, this HashTree also uses the hash code to lookup.

Share this post


Link to post
Share on other sites
A true AVL tree does not use hashing. I have no idea what this hybrid data structure is doing, so it's hard to say why it's faster. My first guess would be that your custom hastable implementation has some non-obvious inefficiencies.

Share this post


Link to post
Share on other sites

Does any one have experience as to why it would be happening? Aren't hash tables the fastest way to retrieve items based on a hash?


You say you have a "text book hash table" but not which one. There are several ways to implement a hash table. Many of the "text book" varieties are non-optimal.

There's also the quality of the hash function to consider. A bad hash function can essentially turn a hash table into a linked list, no matter how good the hash table itself is. A bad hash function can also be prohibitively expensive to the point that the tree lookup times are faster than computing the hash. The hash table implementation may also be unnecessarily requiring the hash to be recomputed.

There's then also the question of the use of the data structure. You didn't specify what your task or this program actually do. A hash table may simply be the wrong choice of data structure for the task. In some cases a sorted array or heap can be significantly faster, since they can change your algorithm from needing multiple lookups into the table for each item into an algorithm using just one single linear scan of the data. For most general uses a tree will perform worse, but for particular access patterns they might perform better (assuming it's allocated in a contiguous chunk of memory in iteration order).

Share this post


Link to post
Share on other sites

You need to profile both programs to work out what's going on. For example maybe one is getting lots more cache misses than the other.

 

If you want help with working out what's going on, it's very difficult to guess without seeing any code. Ideally a compilable benchmark.

 

Have you tried comparing your hash table's performance to the standard implementation that comes with whatever language you're using? For C++ that would be std::unordered_map

Share this post


Link to post
Share on other sites

This topic is 1119 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this