Jump to content
  • Advertisement
Sign in to follow this  
CDProp

Implementing a hash table, the right way (C++ templates)

This topic is 3666 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

What do you think would be the best way to do this, out of the following choices? Ok, so assume your class starts out somewhat like this:
template <typename T>
class HashEntry
{
public:
	bool		m_bEmpty;	// Is this node empty?
	char		m_cKey[8];	// String key, for comparison.
	T		*m_pData;	// Pointer to actual data.
};

template <typename T, int SIZE>
class CHash
{
private:
	HashEntry<T>	*m_pData;	// = new HashEntry<T>[SIZE]
	int		m_iTableSize,	// Size of the whole table.
			m_iCount;	// Current count.
public:
	CHash();
	~CHash();
};


Ok, so I'm thinking that the constructor would handle the allocation of the HashEntries, and then assign initial values to them all (m_bEmpty = true, etc.). I'm wondering what you guys would do concerning the adding of data. I figure that the Add() method can: 1. Accept a T*, and set the HashEntry<T>::m_pData pointer directly to that address. The user of this hash table will have to make sure that the object being referenced by the pointer passed into Add() persists until they explicitly remove the reference from the table using the Remove() method, otherwise the pointer in the table will become invalid. For this reason, I don't like this option. 2. Accept a T&. It's still passed by reference, which is nice. Add() will allocate a new T object on the heap, and use the address of that object for the HashEntry<T>::m_pData pointer. The Remove() method will delete the T object. I like this, because it ensures that the pointers in the hash table are always valid, but it means that the user of the class has to create a T object from the calling code, and pass it into Add() by reference, only to have Add() copy it into a new object. Overall, I think I like #2 a lot more, but is there a better way? Anything I overlooked?

Share this post


Link to post
Share on other sites
Advertisement
Quote:
Original post by CDProp
1. Accept a T*, and set the HashEntry<T>::m_pData pointer directly to that address. The user of this hash table will have to make sure that the object being referenced by the pointer passed into Add() persists until they explicitly remove the reference from the table using the Remove() method, otherwise the pointer in the table will become invalid. For this reason, I don't like this option.

2. Accept a T&. It's still passed by reference, which is nice. Add() will allocate a new T object on the heap, and use the address of that object for the HashEntry<T>::m_pData pointer. The Remove() method will delete the T object. I like this, because it ensures that the pointers in the hash table are always valid, but it means that the user of the class has to create a T object from the calling code, and pass it into Add() by reference, only to have Add() copy it into a new object.

Do you wonder why the designers of the standard library chose to implement copy semantics for their containers?

Here are some other strategies to consider.

o Implement copy semantics. That way (a) you don't have to worry about ownership, and (2) you are free to imeplement CHash<C*, 100> as well as CHash<C, 100> or CHash<smart_ptr<C>, 100> as appropriate.

o Implement smart pointer semantics. You won't have to worry about pointer ownership. You will be limited to storing smart pointers in your container, but right now you're limited to storing dumb pointers and paying for all the problems that entails.

Share this post


Link to post
Share on other sites
I'm not 100% sure what you mean by implementing copy semantics. Are you referring to a copy constructor?

Share this post


Link to post
Share on other sites
Copy semantics mean that when you "put something into" the hash, you actually store a copy of the thing in the hash. The object (instance of T) is responsible for being able to provide a copy (i.e. defining a copy constructor if needed). But the Add()ed instance is not the same one as the one passed in; it's merely equal to it.

BTW, use std::string to represent strings (such as the key), and consider whether a tree-map - such as already provided by the standard library - is sufficient for your purposes. It could well be that you can replace your entire source box so far, and all the planned implementation work as well, by just using a std::map<std::string, T>.

Share this post


Link to post
Share on other sites
STL tr1's new unordered containers are a great choice for this. Reinventing the wheel serves no practical purpose, unless you're just trying to learn. Making your own hash map class is silly for actual production code.

Share this post


Link to post
Share on other sites
This is an off the cuff idea on how I would layout a hash table.


template<typename T,typename hashFunction>
class hash
{
std::vector< std::vector<T> > Data;
};

Share this post


Link to post
Share on other sites
Quote:
Original post by Rydinare
STL tr1's new unordered containers are a great choice for this. Reinventing the wheel serves no practical purpose, unless you're just trying to learn. Making your own hash map class is silly for actual production code.


Just trying to figure it out on my own. :)

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
×

Important Information

By using GameDev.net, you agree to our community Guidelines, Terms of Use, and Privacy Policy.

We are the game development community.

Whether you are an indie, hobbyist, AAA developer, or just trying to learn, GameDev.net is the place for you to learn, share, and connect with the games industry. Learn more About Us or sign up!

Sign me up!