Jump to content

  • Log In with Google      Sign In   
  • Create Account


#Actuale‍dd

Posted 09 December 2012 - 08:45 AM

Keys are only ever numbers and strings, right?

Those are the most common types, but they're not the only ones. Could be anything, really. You might want to use a 2D coordinate as a key, for example.

Could you clarify what you mean by a "traits mechanism"? I'm unfamiliar with the concept. Perhaps a simple example would clear things up. I do indeed need an overloaded operator==(), only to compare the key held inside the data object though.

Sure. You have a template class like this:

template<typename Key>
struct HashTraits; // only declared, no body for the primary template.

It has no definition, but you can specialize it for types that need to work with the hash table:

template<>
struct HashTraits<std::string>
{
    static std::size_t Hash(const std::string &amp;s) { /* ... */ }
    static bool Equal(const std::string &amp;s1, const std::string &amp;s2) { /* ... */ }
};

template<>
struct HashTraits<int>
{
    static std::size_t Hash(int i) { /* ... */ }
    static bool Equal(int i1, int i2) { /* ... */ }
};

// and so on...
EDIT: changed use of uintptr_t to std::size_t.

Then, if the hash table needs to hash an object it does so by calling "HashTraits<Key>::Hash(key)". A similar thing is done for equality comparisons when searching buckets for matching keys. If there's no HashTraits specialization for the Key type available you get a compiler error, indicating that you need to write one. There are some other options in the design space, here. You might have two separate traits templates, one for the hash and one for equality. Or you might forgo the equality part altogether and just rely on the existence of an operator==.

Now, rather than using traits, you could also simply have the requirement that the user must ensure that there's a Hash() function that takes a Key and returns a hash value, rather like you have the operator< requirement for your SortedList. However, some care must be taken as e.g. Hash(const std::string &amp;) might get called even if the Key type is "const char *", due to C++ overload resolution. This may not be the desired behaviour. Using the traits stuff avoids this problem because there has to be a template specialization for the exact type.

So there's a tradeoff between simplicity and correctness.


Do you want to expose the nodes? Or just the values they contain? By returning Node objects, the user can "accidentally" modify any next/prev pointers.

The only reason I can think of to provide my own node type would be for "intrusive links", but your SortedList doesn't seem to allow that (which is fine). Is the Node template parameter necessary?

I can't seem to decide, because I don't have enough experience to know if I would ever need to expose the node or not. When would exposing the node's data object be an issue in game programming? I definitely don't want the 'next' pointer to be handled by anything but the data structure it is serving (i.e. list).

At times, it can be useful to expose the node type (for example, when you've got an intrusive list, as I mentioned).

If you don't have an explicit use-case in mind for the Node, I would hide it for the time being, if possible. As with the protected/private stuff, it's easier to expose stuff at a later date than it is to hide it, given that people may have started to depend on it.


int Size() const;
Is int the best type to use here?

Should I be using something less expensive than that? Short int? Or something with higher resolution?

No, int should be plenty fast. But why not unsigned int? Or std::size_t?

Since hash tables apparently can get pretty crazy in size, I thought int would an okay data type for size, but please explain why you think otherwise. My brain is a sponge right now.

Indeed, any container can get large, depending on what you put in it :) So you want to choose a type that can index the largest possible array you can fit in memory. In other words, you really want an integral type that's the same size as a pointer. On pretty much any modern system in the world, std::size_t will fit the bill.


/** Getter function.
* Returns a ptr to the object stored in the node.
*/
T* Data() const { return &amp;m_data; }
IMHO, this should return a reference. In situations where it's possible for there not to be a value, I can understand the choice to return a pointer. But that's not the case here.

Wouldn't I have to feed the function an external variable then? I can't think of any other way to avoid making the function more complicated. Is the extra parameter just a necessary evil?

I Think we're getting our wires crossed. All I'm really suggesting is that the function be changed to "const T &amp;Data() const { return m_data; }" or "T &amp;Data() { return m_data; }". Actually, now that I look at it again, I'm surprised that the original function even compiles (as there should be a 'const' before the 'T *'). Is this Data() function used anywhere, at present?

Is making ListNode uncopyable just a 'correctness' thing like const, or is there some problem that can arise from it being copyable?

Yes :)
What does it mean to copy a node? If the answer is "that's nonsense", then it would be best to statically disallow it. It's better to be notified of attempted nonsense at compile time than at run time.

by explicit constructors you mean things like the copy and assignment constructors?

No, there's a C++ keyword, "explicit".

At present, I can do weird stuff like this:
void foo(const ChainedHashTable<int, int> &amp;tab); // suppose this function exists

// ...

foo(42); // compiler allows this, as ChainedHashTable's constructor is not explicit!

This is the same mechanism that allows you to pass a "const char *" to a function that takes an std::string. For your constructor, the automatic conversion is probably not wanted!

11 was some arbitrary prime number I chose to start the hash table with.

I see. Presumably the number of buckets is going to change during the life of a hash table, either automatically, or at the user's request(?).


Data* LookUp( Key key ) const;
Should the return type be const-qualified?

I took the const after the function declaration to be a promise to the programmer that the the function will not modify the members of the class...So the fact that I'm returning a non-const pointer to the Data, makes it a non-const function?

Well, let me put is this way. You hand me a hash table, which you say is const. Should I be able to change its content (without resorting to nasty hacks)?

The idiomatic answer for C++ is no, I should not be able to change its content. Sure, the contents of the hash table aren't direct members of the structure, but that sounds like you're just trying to weasel your way out of keeping things const-correct ;)

What often happens is that two overloads are provided for things like this:
const Data* LookUp( Key key ) const;
Data* LookUp( Key key );


private:
std::vector *m_pTable;
Perhaps a typo, but do you really want a pointer to a vector, here?

I guess the general case is to just instantiate the object inside the class.

Yeah, using a pointer is just going to require more book-keeping. Note that if you implement a copy constructor and assignment operator for SortedList, you get them for free if the vector is held by value, rather than by pointer.

EDIT: actually, I now realise a custom assignment operator will still be required if you want to provide the strong exception safety guarantee.


void Resolve( int index ) const;

It's supposed to be my collision resolution function. My plan was to have the function take in the index value where the collision occurred, and then insert the new object being allocated into the list existing at that index.

Up until this point you've got something approaching a very general hash table. So it's a bit weird to have something this specific here, IMHO. Is there any way you can move this specific function outside of the hash table?

I understand that I have much to learn about proper and conventional coding,

That never ends :)

so I'm really glad that you devoted some of your time to help me out. Posted Image

No problem, keep plugging away!

#1e‍dd

Posted 09 December 2012 - 12:13 AM

Keys are only ever numbers and strings, right?

Those are the most common types, but they're not the only ones. Could be anything, really. You might want to use a 2D coordinate as a key, for example.

Could you clarify what you mean by a "traits mechanism"? I'm unfamiliar with the concept. Perhaps a simple example would clear things up. I do indeed need an overloaded operator==(), only to compare the key held inside the data object though.

Sure. You have a template class like this:

template<typename Key>
struct HashTraits; // only declared, no body for the primary template.

It has no definition, but you can specialize it for types that need to work with the hash table:

template<>
struct HashTraits<std::string>
{
    static uintptr_t Hash(const std::string &s) { /* ... */ }
    static bool Equal(const std::string &s1, const std::string &s2) { /* ... */ }
};

template<>
struct HashTraits<int>
{
    static uintptr_t Hash(int i) { /* ... */ }
    static bool Equal(int i1, int i2) { /* ... */ }
};

// and so on...

Then, if the hash table needs to hash an object it does so by calling "HashTraits<Key>::Hash(key)". A similar thing is done for equality comparisons when searching buckets for matching keys. If there's no HashTraits specialization for the Key type available you get a compiler error, indicating that you need to write one. There are some other options in the design space, here. You might have two separate traits templates, one for the hash and one for equality. Or you might forgo the equality part altogether and just rely on the existence of an operator==.

Now, rather than using traits, you could also simply have the requirement that the user must ensure that there's a Hash() function that takes a Key and returns a hash value, rather like you have the operator< requirement for your SortedList. However, some care must be taken as e.g. Hash(const std::string &) might get called even if the Key type is "const char *", due to C++ overload resolution. This may not be the desired behaviour. Using the traits stuff avoids this problem because there has to be a template specialization for the exact type.

So there's a tradeoff between simplicity and correctness.


Do you want to expose the nodes? Or just the values they contain? By returning Node objects, the user can "accidentally" modify any next/prev pointers.

The only reason I can think of to provide my own node type would be for "intrusive links", but your SortedList doesn't seem to allow that (which is fine). Is the Node template parameter necessary?

I can't seem to decide, because I don't have enough experience to know if I would ever need to expose the node or not. When would exposing the node's data object be an issue in game programming? I definitely don't want the 'next' pointer to be handled by anything but the data structure it is serving (i.e. list).

At times, it can be useful to expose the node type (for example, when you've got an intrusive list, as I mentioned).

If you don't have an explicit use-case in mind for the Node, I would hide it for the time being, if possible. As with the protected/private stuff, it's easier to expose stuff at a later date than it is to hide it, given that people may have started to depend on it.


int Size() const;
Is int the best type to use here?

Should I be using something less expensive than that? Short int? Or something with higher resolution?

No, int should be plenty fast. But why not unsigned int? Or std::size_t?

Since hash tables apparently can get pretty crazy in size, I thought int would an okay data type for size, but please explain why you think otherwise. My brain is a sponge right now.

Indeed, any container can get large, depending on what you put in it :) So you want to choose a type that can index the largest possible array you can fit in memory. In other words, you really want an integral type that's the same size as a pointer. On pretty much any modern system in the world, std::size_t will fit the bill.


/** Getter function.
* Returns a ptr to the object stored in the node.
*/
T* Data() const { return &m_data; }
IMHO, this should return a reference. In situations where it's possible for there not to be a value, I can understand the choice to return a pointer. But that's not the case here.

Wouldn't I have to feed the function an external variable then? I can't think of any other way to avoid making the function more complicated. Is the extra parameter just a necessary evil?

I Think we're getting our wires crossed. All I'm really suggesting is that the function be changed to "const T &Data() const { return m_data; }" or "T &Data() { return m_data; }". Actually, now that I look at it again, I'm surprised that the original function even compiles (as there should be a 'const' before the 'T *'). Is this Data() function used anywhere, at present?

Is making ListNode uncopyable just a 'correctness' thing like const, or is there some problem that can arise from it being copyable?

Yes :)
What does it mean to copy a node? If the answer is "that's nonsense", then it would be best to statically disallow it. It's better to be notified of attempted nonsense at compile time than at run time.

by explicit constructors you mean things like the copy and assignment constructors?

No, there's a C++ keyword, "explicit".

At present, I can do weird stuff like this:
void foo(const ChainedHashTable<int, int> &tab); // suppose this function exists

// ...

foo(42); // compiler allows this, as ChainedHashTable's constructor is not explicit!

This is the same mechanism that allows you to pass a "const char *" to a function that takes an std::string. For your constructor, the automatic conversion is probably not wanted!

11 was some arbitrary prime number I chose to start the hash table with.

I see. Presumably the number of buckets going to change during the life of a hash table, either automatically, or at the user's request(?).


Data* LookUp( Key key ) const;
Should the return type be const-qualified?

I took the const after the function declaration to be a promise to the programmer that the the function will not modify the members of the class...So the fact that I'm returning a non-const pointer to the Data, makes it a non-const function?

Well, let me put is this way. You hand me a hash table, which you say is const. Should I be able to change its content (without resorting to nasty hacks)?

The idiomatic answer for C++ is no, I should not be able to change its content. Sure, the contents of the hash table aren't direct members of the structure, but that sounds like you're just trying to weasel your way out of keeping things const-correct ;)

What often happens is that two overloads are provided for things like this:
const Data* LookUp( Key key ) const;
Data* LookUp( Key key );


private:
std::vector *m_pTable;
Perhaps a typo, but do you really want a pointer to a vector, here?

I guess the general case is to just instantiate the object inside the class.

Yeah, using a pointer is just going to require more book-keeping. Note that if you implement a copy constructor and assignment operator for SortedList, you get them for free if the vector is held by value, rather than by pointer.


void Resolve( int index ) const;

It's supposed to be my collision resolution function. My plan was to have the function take in the index value where the collision occurred, and then insert the new object being allocated into the list existing at that index.

Up until this point you've got something approaching a very general hash table. So it's a bit weird to have something this specific here, IMHO. Is there any way you can move this specific function outside of the hash table?

I understand that I have much to learn about proper and conventional coding,

That never ends :)

so I'm really glad that you devoted some of your time to help me out. Posted Image

No problem, keep plugging away!

PARTNERS