Simple N00b C# Question

Started by
14 comments, last by Cygon 16 years, 5 months ago
I'm in need of a data structure that stores a set of objects of a given type without allowing duplicates, and supports fast insert, delete, and test for a specific item in the collection. In the general case I can't rely on things like the values of each object being unique (e.g. for a GetHashCode implementation or assigning a unique key to each object). Is there some stock way to do this? If not, I could make one easy enough if there was a way to either order raw references (to make a binary search tree; Object.ReferenceEquals is similar, but only tests for equality) or get a hash of a reference (to make a hash table). Do either of those things exist?
Just inherit from List<T> and use it as the base for what you are doing. Add a duplicate check to the add methods. Should be pretty easy.

You will need to write a compare method to check whether two items are the same, but that is pretty easy.

theTroll
Uhh... List? That's an array. In other words O(N) insert, delete, and search (without a way to compare elements the list couldn't be sorted; and if we could compare elements there are faster data structures). Not exactly fast.
Last I checked the .NET framework doesn't have any implemented trees save for the one associated with a TreeView. Even then I don't think that you can get at that outside of a TreeView control.

You'll have to implement your own tree data structure.
SortedDictionary is a binary search tree. Though even if it wasn't, as I mentioned in the opening post, such a tree would be trivial to implement if it were possible to compare references for greater/less than (which I'm hoping one of you will know how to do). There's also Dictionary (hash table), but that requires the ability to get reliable hashes for an object based just on its reference (also mentioned, and also not something I know how to do).
Quote:Original post by Catafriggm
Uhh... List? That's an array. In other words O(N) insert, delete, and search (without a way to compare elements the list couldn't be sorted; and if we could compare elements there are faster data structures). Not exactly fast.


Well if you know the answer then why are you asking us mere mortals?

theTroll

If you were using C++, I'd have recommended std::set<>. Most implementations of it use a Red-Black tree and it doesn't allow duplicates.

In .NET you could misuse one of the provided associative collections (Hashtable or Dictionary<>) with a dummy as the value part. It's not optimal, but I'd expect it to be reasonably fast.
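A minimal sketch of the dummy-value idea, assuming a hypothetical wrapper named DictionarySet<T> (it is not a framework class). It relies on the key type's default equality; for a class that does not override Equals and GetHashCode, that default is reference identity, so two references count as duplicates only when they point to the same instance.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: a set built on Dictionary<T, bool>.
// The bool is an unused dummy value; only the keys matter.
public class DictionarySet<T>
{
    private readonly Dictionary<T, bool> items = new Dictionary<T, bool>();

    public int Count { get { return items.Count; } }

    // Returns false (instead of throwing) when the item is already present.
    public bool Add(T item)
    {
        if (items.ContainsKey(item)) return false;
        items.Add(item, true);
        return true;
    }

    public bool Remove(T item) { return items.Remove(item); }

    public bool Contains(T item) { return items.ContainsKey(item); }
}
```

Add, Remove, and Contains each cost one or two hash lookups, so they stay O(1) on average as long as the key's hash codes are reasonably well distributed.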

Another option would be to use one of the classes from the PowerCollections project. The library is free and fills many of the gaps left in the collection classes and other data structures from the .NET framework. I think there's a hash table based 'Set' class and a 'RedBlack' class implementing a Red-Black tree.

-Markus-
Professional C++ and .NET developer trying to break into indie game development.
Follow my progress: http://blog.nuclex-games.com/ or Twitter - Topics: Ogre3D, Blender, game architecture tips & code snippets.
Quote:Original post by TheTroll
Well if you know the answer then why are you asking us mere mortals?


What I don't know:
- Is there a way to do <, > on references (rather than the contents of the objects themselves)? The <, > don't need to have any particular meaning other than that if A < B at one point, A < B at some later point, after various members of both have changed.
- Does GetHashCode (or some other hash provider) return something that can reliably be used in a hash table, and is based on the reference itself, rather than the contents of the item? E.g. garbage collection will move things around; will the hash code change on GC? Is the hash code more or less evenly distributed among values for raw references? Will the hash code change if the members in the object referenced change?

What I do know:
- Dictionary is fast (O(1)). But I would need the ability to generate a hash code for a raw reference.
- SortedDictionary is also fast (O(log N)). But I would need a way to do <, > for a raw reference.
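One possible answer to the hash question, as a sketch rather than a definitive implementation: System.Runtime.CompilerServices.RuntimeHelpers.GetHashCode returns an identity-based hash code that ignores any GetHashCode override and stays the same for the lifetime of the object, regardless of member changes or the garbage collector moving it. The codes are not guaranteed to be unique, so equality still has to be checked with Object.ReferenceEquals. The names ReferenceComparer<T> and Item below are hypothetical.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two references are equal only if they are the same instance.
public sealed class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) { return object.ReferenceEquals(x, y); }

    // Identity-based hash; bypasses any GetHashCode override on T.
    public int GetHashCode(T obj) { return RuntimeHelpers.GetHashCode(obj); }
}

// Hypothetical item type with content-based equality, to show that the
// comparer above ignores it and looks at the reference only.
public sealed class Item
{
    public int Value;

    public override bool Equals(object obj)
    {
        Item other = obj as Item;
        return other != null && other.Value == Value;
    }

    public override int GetHashCode() { return Value; }
}
```

Passing an instance of the comparer to the constructor of Dictionary (or of HashSet<T>, assuming a framework version that includes it) makes the collection key on references: two distinct Item instances with the same Value are stored separately, and changing Value later does not affect lookups.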

Cygon: Ehhh. I suppose if I can't find anything better I could assign each entry in the list a random number to use as the hash key. It would be ugly, but I suppose it would work. Thx for the info.
IComparer is what you might be looking for. It gives you the ability to define one object as equal to, less than, or greater than another object. It also gives you the ability to sort the objects.

Now might the comparisons be a bit arbitrary? Yes, but it does give you a basis for deciding where things should go.

Now what I don't understand is why you can't come up with a unique hash. In your first post you said that the object you are storing needs to make sure it is not a duplicate, so if it is not the same then you should be able to make unique hashes.

theTroll

Quote:Original post by TheTroll
Now what I don't understand is why you can't come up with a unique hash. In your first post you said that the object you are storing needs to make sure it is not a duplicate, so if it is not the same then you should be able to make unique hashes.


The values of the object will be changing. A hash code of an object suddenly changing while it's in the table would break any hash table implementation. Besides that, "duplicate", in this case, is referring to multiple references to the same instance of a class. There is no data in the class itself that is unique to one instance compared to any other. These are the same reasons why I can't simply compare them member-wise.
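To illustrate the failure mode described above, here is a small sketch with hypothetical names (Counter, MutationDemo). The hash code is derived from a mutable field, and the field is changed while the object is stored as a Dictionary key.

```csharp
using System.Collections.Generic;

// Hypothetical type whose hash code depends on mutable content.
public sealed class Counter
{
    public int Value;

    public override bool Equals(object obj)
    {
        Counter other = obj as Counter;
        return other != null && other.Value == Value;
    }

    public override int GetHashCode() { return Value; }
}

public static class MutationDemo
{
    // Returns whether the dictionary can still find the key after its content changed.
    public static bool FoundAfterMutation()
    {
        Counter item = new Counter { Value = 1 };
        Dictionary<Counter, bool> table = new Dictionary<Counter, bool>();
        table.Add(item, true);

        item.Value = 2; // the hash code changes while the object is in the table

        // The lookup now uses the new hash code, which no longer matches
        // the one recorded when the key was inserted.
        return table.ContainsKey(item);
    }
}
```

FoundAfterMutation returns false even though the very same instance is still in the table, which is why a content-based hash is unsuitable for objects whose members change.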

