Narf the Mouse

Idea for a fast BinaryDictionary


I'm operating on four hours' sleep, so I can't tell how good an idea this is, but: one way of storing objects (for an arbitrary example, players, enemies and bullets in an FPS - let's say it's semi-realistic, so bullets aren't instant) is a dictionary. The problem with a standard dictionary, such as the one in .Net, though, is that it uses arrays - and arrays are very slow when deleting objects such as bullets, which are created and destroyed very fast.

So, you have a BinaryDictionary<Key, Value>. The Key is really an Int32 hash code (via .Net's ubiquitous .GetHashCode() method). When entering the top of the dictionary, we do a bitwise comparison: if the current bit is 0, the entry goes left; if it's 1, it goes right. Say the first object's hash code ends in the bits 011 - it goes right. The next object has a hash code that ends in the bits 001. It goes right, then left, because the right slot is already occupied.
If we want to look up a key, we just follow the bits. If we look up the key 001, we'll find the second object at Right-Left.
Now, add a Key that ends in the bits 101. It goes right, finds that occupied, goes left, occupied, goes right, empty. We now have objects at Right, Right-Left and Right-Left-Right.

Say we want to delete Key 001. That leaves the objects at Right-Left-Right and Right. The connecting link is Right-Left, which we now need to remove. Right-Left-Left is empty; if it weren't, it would become the new Right-Left. So the new Right-Left is Right-Left-Right.

Do I have something here? I think I might, because it sounds like nothing more than a standard binary tree that uses hash-code bits to determine left or right. And simple is good.
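
A minimal sketch of that bit-walk, to make it concrete (Node, Insert and Find are hypothetical illustration names, not the code benchmarked later; unlike the posted code, this sketch stores an entry at the root as well):

// Sketch of the bit-walk: one hash bit per level, 0 = left, 1 = right.
class Node
{
    public int Hash;
    public object Value;
    public Node Left, Right;
    public bool Occupied;
}

static void Insert(Node root, int hash, object value)
{
    Node node = root;
    int bit = 0;
    // Walk down until an empty slot turns up.
    while (node.Occupied)
    {
        Node next = ((hash >> bit) & 1) == 0 ? node.Left : node.Right;
        if (next == null)
        {
            next = new Node();
            if (((hash >> bit) & 1) == 0) node.Left = next;
            else node.Right = next;
        }
        node = next;
        ++bit;
    }
    node.Hash = hash;
    node.Value = value;
    node.Occupied = true;
}

static Node Find(Node root, int hash)
{
    Node node = root;
    int bit = 0;
    // Follow the bits until the hash matches or the path runs out.
    while (node != null && !(node.Occupied && node.Hash == hash))
    {
        node = ((hash >> bit) & 1) == 0 ? node.Left : node.Right;
        ++bit;
    }
    return node;   // null if the key isn't present
}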

It sounds interesting, but say you were to delete a value with key 101 and then access the data using key 101 - the data would have been replaced, so any access to this value using this key would return the wrong value.

(if I interpreted what you wrote correctly)

Also, simple IS good. :)

Quote:
Original post by Pthalicus
It sounds interesting, but say you were to delete a value with key 101 and then access the data using key 101 - the data would have been replaced, so any access to this value using this key would return the wrong value.

(if I interpreted what you wrote correctly)

Also, simple IS good. :)

That is a problem.

Alternatively, the Leaf could have an IsNull boolean; removing would simply set IsNull = true. This has the disadvantage that, without dedicated pruning, the tree never shrinks.

That might be solved by having parent Leafs keep a count of their child Leafs and of how many are marked null. When a Leaf is marked null and the two counts are equal, it gets removed.
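
Roughly, that pruning rule might look like this (the Leaf fields and Remove helper here are illustrative only, not the code benchmarked below):

// Sketch: tombstone the entry, then prune fully-null subtrees
// using per-node counts.
class Leaf
{
    public bool IsNull;
    public int ChildCount;   // entries in this subtree
    public int NullCount;    // of those, how many are tombstoned
    public Leaf Parent, Left, Right;
}

static void Remove(Leaf leaf)
{
    leaf.IsNull = true;                     // mark instead of relinking
    for (Leaf p = leaf.Parent; p != null; p = p.Parent)
    {
        ++p.NullCount;
        // If p is itself null and everything below it is too,
        // the whole subtree can be detached.
        if (p.IsNull && p.NullCount == p.ChildCount && p.Parent != null)
        {
            if (p.Parent.Left == p) p.Parent.Left = null;
            else p.Parent.Right = null;
        }
    }
}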

But first, to test the .Add(Key, Value) speed.

A simple way to significantly speed up dictionaries or other node-based data structures is to use a pooled allocation system. You could allocate an array of 1000 nodes; then all the nodes will be nearby, so you'll have good cache locality, but since the elements are still nodes, you can easily delete one by setting a pointer to null and linking it into a free list. If you need more than 1000 nodes, you allocate another 1000 and keep a list of the memory blocks you've allocated.
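
A rough C# sketch of such a pool (PoolNode and NodePool are hypothetical names; assumes System.Collections.Generic is in scope):

// Pooled node allocator with an intrusive free list: nodes come from
// contiguous blocks, and freed nodes are recycled instead of collected.
class PoolNode<T>
{
    public T Value;
    public PoolNode<T> Next;   // doubles as the free-list link
}

class NodePool<T>
{
    private readonly List<PoolNode<T>[]> blocks = new List<PoolNode<T>[]>();
    private PoolNode<T> free;  // head of the free list
    private const int BlockSize = 1000;

    public PoolNode<T> Allocate()
    {
        if (free == null)
        {
            // Grow by one contiguous block so nodes stay near each other.
            var block = new PoolNode<T>[BlockSize];
            for (int i = 0; i < BlockSize; ++i)
            {
                block[i] = new PoolNode<T>();
                block[i].Next = free;
                free = block[i];
            }
            blocks.Add(block);
        }
        PoolNode<T> node = free;
        free = node.Next;
        node.Next = null;
        return node;
    }

    public void Free(PoolNode<T> node)
    {
        node.Value = default(T);
        node.Next = free;      // link back into the free list
        free = node;
    }
}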

Quote:
Original post by Extrarius
A simple way to significantly speed up dictionaries or other node-based data structures is to use a pooled allocation system. You could allocate an array of 1000 nodes; then all the nodes will be nearby, so you'll have good cache locality, but since the elements are still nodes, you can easily delete one by setting a pointer to null and linking it into a free list. If you need more than 1000 nodes, you allocate another 1000 and keep a list of the memory blocks you've allocated.

That's another thing I considered. Might ask about it in a different thread.

Console output: (Sorry, got the order wrong the first and second time)

List<int>.Add
Loops: 1000000, Seconds: 0.0142794, Loops per Second: 70030953.6815272
Dictionary<int, int>.Add
Loops: 1000000, Seconds: 0.1295246, Loops per Second: 7720541.11728583
BinaryDictionary<int, int>.Add
Loops: 1000000, Seconds: 0.6429706, Loops per Second: 1555281.06572836
for-next List<int>
Loops: 1000000, Seconds: 0.00275989999999993, Loops per Second: 362331968.549595
foreach Dictionary<int, int>
Loops: 1000000, Seconds: 0.0223009000000001, Loops per Second: 44841239.5912271
foreach BinaryDictionary<int, int>
Loops: 1000000, Seconds: 0.00191280000000005, Loops per Second: 522793810.121275

So five times slower than a "standard" dictionary on .Add and ten times faster on foreach.

Well, it was a neat theory, but it didn't pan out. Whatever the "standard" dictionary uses to store data, it isn't an array. At least, not a single array.


Dictionary<int, int>.Remove
Loops: 1000000, Seconds: 0.0928441999999998, Loops per Second: 10770732.0435741
BinaryDictionary<int, int>.Remove
Loops: 1000000, Seconds: 0.4911663, Loops per Second: 2035970.3017084


To the Profilermobile!...I would, but SlimTune is reporting nothing. A frequent problem.

To the "request help mobile"!

The remove code: As shown by the comments, I've pared it back to the very minimum, getting 2.5-2.8 million removes a second, up from 2.0-2.2. Anyone got any ideas on where the major slowdowns are?

public bool Remove(int depth, ref TKey theKey, int theKeyHash)
{
    if (keyHash == theKeyHash)
    {
        this.isNull = true;   // tombstone; the node stays in the tree
        return true;
    }
    else
    {
        if ((theKeyHash & (1 << depth)) == 0)
        {
            if (left == null)
                return false;
            return left.Remove(depth + 1, ref theKey, theKeyHash);
            // Pared back; the pruning version read:
            // else if (left.Remove(depth + 1, ref theKey, theKeyHash))
            // {
            //     if (left.childCount == 0)
            //         left = null;
            //     --childCount;
            //     return true;
            // }
            // return false;
        }
        else
        {
            if (right == null)
                return false;
            return right.Remove(depth + 1, ref theKey, theKeyHash);
            // else if (right.Remove(depth + 1, ref theKey, theKeyHash))
            // {
            //     if (right.childCount == 0)
            //         right = null;
            //     --childCount;
            //     return true;
            // }
            // return false;
        }
    }
}

You do realize that you just took the roughly O(1) insert and removal of the standard dictionary and replaced them with ones that have much worse asymptotic performance? Even worse, you've destroyed the O(1) look-up, defeating the purpose of using a dictionary (hash table) in the first place.

If you're worried about insert performance (which, according to MSDN, can be an O(n) operation when Count == Capacity), simply tell the dictionary to reserve space in the constructor.
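
For example (the capacity constructor is standard .Net; the figure just matches the benchmark counts below):

// Reserve space up front so Add never has to grow and rehash the table.
var dictionary = new Dictionary<int, int>(3000000);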

Quote:
Original post by Mike.Popoloski
You do realize that you just took the roughly O(1) insert and removal of the standard dictionary and replaced them with ones that have much worse asymptotic performance? Even worse, you've destroyed the O(1) look-up, defeating the purpose of using a dictionary (hash table) in the first place.

If you're worried about insert performance (which, according to MSDN, can be an O(n) operation when Count == Capacity), simply tell the dictionary to reserve space in the constructor.

It's ten-plus times faster on foreach. I've gotten remove up to a third of Dictionary's speed, and I can probably do the same for add. Given all that, it's useful.

Quote:
Original post by Narf the Mouse I've pared it back to the very minimum, getting 2.5-2.8 million removes a second, up from 2.0-2.2


2e6 removals / 60 FPS = 33k removals per frame. That's a lot.

Quote:
Anyone got any ideas on where the major slowdowns are?
In an algorithm that requires removal at all.

Quote:
And arrays are very slow when deleting objects. Such as bullets, which are created and destroyed very fast.
If one bullet is removed, it's O(n). If m bullets are removed, it can still be O(n).

See how remove_if works. It's messy code, but a very simple algorithm; due to its sequential nature, it's very fast.
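
In C#, the equivalent is List<T>.RemoveAll, or the same single-pass compaction written out by hand (Bullet and Dead are hypothetical names):

// remove_if-style compaction: one sequential pass, O(n) in total
// no matter how many elements get dropped.
class Bullet { public bool Dead; }

static void Compact(List<Bullet> bullets)
{
    int write = 0;
    for (int read = 0; read < bullets.Count; ++read)
    {
        if (!bullets[read].Dead)              // keep the live ones
            bullets[write++] = bullets[read];
    }
    bullets.RemoveRange(write, bullets.Count - write);
    // Or simply: bullets.RemoveAll(b => b.Dead);
}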

Quote:
other node-based data structures is use a pooled allocation system. You could allocate an array of 1000 nodes, and then all the nodes will be nearby so you'll have good cache locality, but since the elements are still nodes, you can easily delete a node by setting a pointer to null and then linking it into a free list

In C++, not in C#. In C#, the types need to be structs to achieve that kind of locality. Hence the approach above.

Current benchmarks: I suspect leaf creation is the slowdown on add. The count has been upped to 3,000,000, as longer benchmarks are less prone to timing jitter. Unfortunately for Dictionary, that wasn't kind to its foreach. Fortunately for Dictionary, most games will not have three million objects.
List<Int>.Add
Loops: 3000000, Seconds: 0.0485992, Loops per Second: 61729411.1837232
Dictionary<int, int>.Add
Loops: 3000000, Seconds: 0.5326068, Loops per Second: 5632673.10894266
BinaryDictionary<int, int>.Add
Loops: 3000000, Seconds: 2.986457, Loops per Second: 1004534.8049545

for-next List<int>
Loops: 3000000, Seconds: 0.00690839999999993, Loops per Second: 434253951.710965
foreach Dictionary<int, int>
Loops: 3000000, Seconds: 0.0677444, Loops per Second: 44284103.187865
foreach BinaryDictionary<int, int>
Loops: 3000000, Seconds: 0.00195109999999987, Loops per Second: 1537594177.64348

Dictionary<int, int>.Remove
Loops: 3000000, Seconds: 0.1969599, Loops per Second: 15231526.8234803
BinaryDictionary<int, int>.Remove
Loops: 3000000, Seconds: 0.7056726, Loops per Second: 4251263.26287857

All in all, not a major breakthrough, but if you want a dictionary you can foreach through ridiculously fast (1.5 billion entries per second!), consider a BinaryDictionary.

Quote:
Original post by Antheus
In C#, the types need to be structs to achieve that kind of locality. Hence the approach above.


Are you sure about that? The C# allocator is basically a simple pointer increment. Barring the nodes getting moved around by the GC, I'd say an array's worth of class instances stands a fair chance of winding up next to each other in memory.

Quote:
Original post by Antheus
Quote:
Original post by Narf the Mouse I've pared it back to the very minimum, getting 2.5-2.8 million removes a second, up from 2.0-2.2


2e6 removals / 60 FPS = 33k removals per frame. That's a lot.

Quote:
Anyone got any ideas on where the major slowdowns are?
In an algorithm that requires removal at all.

Quote:
And arrays are very slow when deleting objects. Such as bullets, which are created and destroyed very fast.
If one bullet is removed, it's O(n). If m bullets are removed, it can still be O(n).

See how remove_if works. It's messy code, but a very simple algorithm; due to its sequential nature, it's very fast.

Quote:
other node-based data structures is use a pooled allocation system. You could allocate an array of 1000 nodes, and then all the nodes will be nearby so you'll have good cache locality, but since the elements are still nodes, you can easily delete a node by setting a pointer to null and then linking it into a free list

In C++, not in C#. In C#, the types need to be structs to achieve that kind of locality. Hence the approach above.

Not as fast as Dictionary, I must admit - slower by a factor of about 5.5.

Or you can move the item to be removed to the last index in the array and reduce the count by one. But there are downsides: your solution doesn't actually change the array (although you could use the iterator to make a new one), and the move-and-shrink approach is no good if you want sorted data.
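
That swap-remove, for reference (items and index are hypothetical; assumes System.Collections.Generic):

// Swap-remove: O(1) per removal, at the cost of element order.
static void SwapRemove<T>(List<T> items, int index)
{
    int last = items.Count - 1;
    items[index] = items[last];   // overwrite the hole with the last item
    items.RemoveAt(last);         // removing at the end shifts nothing
}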

Why would the node solution require that the nodes be structs? Arrays are classes in C#.

No-one has yet pointed out that a binary tree can have worst-case performance of O(n) for pretty much all its operations if it becomes unbalanced. To maintain even O(log n) performance for things like add/remove, you'd have to use a self-balancing structure, such as a red-black tree. I'm dubious as to how representative your figures are.

How did you implement the iterator for foreach?

Quote:
Original post by Narf the Mouse
Well, it was a neat theory, but it didn't pan out. Whatever the "standard" dictionary uses to store data, it isn't an array. At least, not a single array.


It does, actually, but only to implement a hash map as MikeP mentions.

Quote:
Original post by Narf the Mouse
...The MoveNext in my foreach is broken. Further updates as events warrant.


Your benchmarks also show Dictionary iteration being slower by a factor that's suspiciously close to 10. If the numbers are the same once you've fixed things, double-check that you didn't sneak an extra 0 in somewhere. I'd also recommend posting the benchmark -- suspicious minds like mine aren't likely to be satisfied without seeing it for themselves.

Quote:
Original post by MaulingMonkey
Quote:
Original post by Narf the Mouse
Well, it was a neat theory, but it didn't pan out. Whatever the "standard" dictionary uses to store data, it isn't an array. At least, not a single array.


It does, actually, but only to implement a hash map as MikeP mentions.

Quote:
Original post by Narf the Mouse
...The MoveNext in my foreach is broken. Further updates as events warrant.


Your benchmarks also show Dictionary iteration being slower by a factor that's suspiciously close to 10. If the numbers are the same once you've fixed things, double-check that you didn't sneak an extra 0 in somewhere. I'd also recommend posting the benchmark -- suspicious minds like mine aren't likely to be satisfied without seeing it for themselves.

So removal is simply setting a slot to null somehow? That explains how it gets fast removal.

I posted the benchmarks. Any other benchmark I post, you could argue is somehow wrong. The numbers are there. The Add is fine; the Remove is fine. The MoveNext is way off, for reasons given in the next paragraph.

In any case, it won't work. Why? The keys. How does it know when to stop during a key lookup? I figured out that problem while working out how to fix the MoveNext (which, by the way, was only parsing to the end of the first left-only route and then one right leaf). Fixing MoveNext would simply involve an int and leaf counting. However, say you have items at 101, 1011 and 1010. The item at 101 is now inaccessible, because any key that starts with 101 will parse down to either 1011 or 1010, for the simple reason that the walk doesn't know where to end.

One last point, about the tree not being balanced: since hash codes are 32-bit integers, at most 32 nodes would ever need to be parsed to reach any item.

In any case, here's the raw dictionary code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Test
{
    public class BinaryDictionary<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
    {
        private Leaf top;

        public BinaryDictionary()
        {
            top = new Leaf(null);   // dummy root; it never holds an entry
        }

        public void Add(TKey key, TValue value)
        {
            int keyHash = key.GetHashCode();
            int depth = 1;   // note: the walk starts at bit 1, so bit 0 is never examined
            top.Receive(depth, ref key, keyHash, ref value);
        }

        public void Remove(TKey key)
        {
            top.RemoveTop(1, ref key, key.GetHashCode());
        }

        public class Leaf
        {
            private int childCount;
            private bool isNull;
            private int keyHash;
            private TKey key;
            public TKey Key { get { return key; } }
            private TValue value;
            public TValue Value { get { return value; } }
            private Leaf parent;
            public Leaf Parent { get { return parent; } }
            private Leaf left;
            public Leaf Left { get { return left; } }
            private Leaf right;
            public Leaf Right { get { return right; } }

            public Leaf(Leaf parent)
            {
                this.parent = parent;
                childCount = 0;
                isNull = false;
            }

            public void Receive(int depth, ref TKey key, int keyHash, ref TValue value)
            {
                Leaf leaf = this;
                bool done = false;
                do
                {
                    ++leaf.childCount;
                    if ((keyHash & (1 << depth)) == 0)
                    {
                        if (leaf.left == null)
                        {
                            // note: the parent passed here is 'this' (the node Receive
                            // was called on), not 'leaf' - so parent links below the
                            // first level come out wrong.
                            leaf.left = new Leaf(this);
                            leaf.left.Add(ref key, keyHash, ref value);
                            done = true;
                        }
                        leaf = leaf.left;
                        ++depth;
                    }
                    else
                    {
                        if (leaf.right == null)
                        {
                            leaf.right = new Leaf(this);
                            leaf.right.Add(ref key, keyHash, ref value);
                            done = true;
                        }
                        leaf = leaf.right;
                        ++depth;
                    }
                } while (!done);

                // The recursive version this replaced:
                // ++childCount;
                // if ((keyHash & (1 << depth)) == 0)
                // {
                //     if (left == null)
                //     {
                //         left = new Leaf(this);
                //         left.Add(ref key, keyHash, ref value);
                //     }
                //     else
                //         left.Receive(depth + 1, ref key, keyHash, ref value);
                // }
                // else
                // {
                //     if (right == null)
                //     {
                //         right = new Leaf(this);
                //         right.Add(ref key, keyHash, ref value);
                //     }
                //     else
                //         right.Receive(depth + 1, ref key, keyHash, ref value);
                // }
            }

            public bool RemoveTop(int depth, ref TKey theKey, int theKeyHash)
            {
                Leaf leaf = this;
                do
                {
                    if ((theKeyHash & (1 << depth)) == 0)
                    {
                        if (leaf.left == null)
                            return false;
                        leaf = leaf.left;
                        ++depth;
                    }
                    else
                    {
                        if (leaf.right == null)
                            return false;
                        leaf = leaf.right;
                        ++depth;
                    }
                } while (leaf.keyHash != theKeyHash);
                leaf.isNull = true;   // tombstone; the node is never detached
                return true;

                // The recursive version this replaced, with the pruning
                // variant commented out inside it:
                // if ((theKeyHash & (1 << depth)) == 0)
                // {
                //     if (left == null)
                //         return false;
                //     return left.Remove(depth + 1, ref theKey, theKeyHash);
                //     // else if (left.Remove(depth + 1, ref theKey, theKeyHash))
                //     // {
                //     //     if (left.childCount == 0)
                //     //         left = null;
                //     //     --childCount;
                //     //     return true;
                //     // }
                //     // return false;
                // }
                // else
                // {
                //     if (right == null)
                //         return false;
                //     return right.Remove(depth + 1, ref theKey, theKeyHash);
                //     // else if (right.Remove(depth + 1, ref theKey, theKeyHash))
                //     // {
                //     //     if (right.childCount == 0)
                //     //         right = null;
                //     //     --childCount;
                //     //     return true;
                //     // }
                //     // return false;
                // }
            }

            public bool Remove(int depth, ref TKey theKey, int theKeyHash)
            {
                if (keyHash == theKeyHash)
                {
                    this.isNull = true;
                    return true;
                }
                else
                {
                    if ((theKeyHash & (1 << depth)) == 0)
                    {
                        if (left == null)
                            return false;
                        return left.Remove(depth + 1, ref theKey, theKeyHash);
                        // else if (left.Remove(depth + 1, ref theKey, theKeyHash))
                        // {
                        //     if (left.childCount == 0)
                        //         left = null;
                        //     --childCount;
                        //     return true;
                        // }
                        // return false;
                    }
                    else
                    {
                        if (right == null)
                            return false;
                        return right.Remove(depth + 1, ref theKey, theKeyHash);
                        // else if (right.Remove(depth + 1, ref theKey, theKeyHash))
                        // {
                        //     if (right.childCount == 0)
                        //         right = null;
                        //     --childCount;
                        //     return true;
                        // }
                        // return false;
                    }
                }
            }

            public void Add(ref TKey key, int keyHash, ref TValue value)
            {
                this.key = key;
                this.keyHash = keyHash;
                this.value = value;
            }
        }

        public class Enumerator : IEnumerator<KeyValuePair<TKey, TValue>>
        {
            BinaryDictionary<TKey, TValue> dictionary;
            KeyValuePair<TKey, TValue> kvp;
            Leaf leaf;
            bool doneLeftLeaf;
            bool currentLeafIsRight;

            public Enumerator(BinaryDictionary<TKey, TValue> dictionary)
            {
                this.dictionary = dictionary;
                this.leaf = dictionary.top;
            }

            public KeyValuePair<TKey, TValue> Current
            {
                get { return kvp; }
            }

            public void Dispose()
            {
            }

            object System.Collections.IEnumerator.Current
            {
                get { return kvp; }
            }

            // Broken, as noted above: this only walks the first left-only
            // route plus one right leaf, so it stops far too early.
            public bool MoveNext()
            {
                if (leaf.Left != null && !doneLeftLeaf)
                {
                    leaf = leaf.Left;
                    currentLeafIsRight = false;
                    kvp = new KeyValuePair<TKey, TValue>(leaf.Key, leaf.Value);
                    return true;
                }
                else if (leaf.Right != null && !currentLeafIsRight)
                {
                    leaf = leaf.Right;
                    doneLeftLeaf = true;
                    kvp = new KeyValuePair<TKey, TValue>(leaf.Key, leaf.Value);
                    return true;
                }
                else if (leaf.Left == null && leaf.Right == null)
                {
                    currentLeafIsRight = true;
                }
                if (currentLeafIsRight)
                {
                    bool done = false;
                    Leaf temp;
                    do
                    {
                        if (leaf.Parent == null)
                            return false;
                        temp = leaf;
                        leaf = leaf.Parent;
                        if (temp == leaf.Left)
                        {
                            currentLeafIsRight = false;
                            doneLeftLeaf = false;
                            done = true;
                        }
                    } while (!done);
                    return MoveNext();
                }
                throw new Exception();
            }

            public void Reset()
            {
                leaf = dictionary.top;
            }
        }

        public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
        {
            return new Enumerator(this);
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
        {
            return new Enumerator(this);
        }
    }
}


And the benchmark code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace Test
{
    class Program
    {
        public static System.Diagnostics.Stopwatch stopwatch = System.Diagnostics.Stopwatch.StartNew();

        static void Main(string[] args)
        {
            Dictionary<int, int> dictionary = new Dictionary<int, int>();
            List<int> list = new List<int>();
            BinaryDictionary<int, int> binaryDictionary = new BinaryDictionary<int, int>();

            int count = 3000000;
            Console.WriteLine("List<Int>.Add");
            double start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                list.Add(t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            Console.WriteLine("Dictionary<int, int>.Add");
            start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                dictionary.Add(t, t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            Console.WriteLine("BinaryDictionary<int, int>.Add");
            start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                binaryDictionary.Add(t, t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            int c;
            Console.WriteLine("for-next List<int>");
            start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                c = list[t];
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            Console.WriteLine("foreach Dictionary<int, int>");
            start = stopwatch.Elapsed.TotalSeconds;
            foreach (KeyValuePair<int, int> kvp in dictionary)
            {
                c = kvp.Value;
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            Console.WriteLine("foreach BinaryDictionary<int, int>");
            start = stopwatch.Elapsed.TotalSeconds;
            foreach (KeyValuePair<int, int> kvp in binaryDictionary)
            {
                c = kvp.Value;
                // note: console I/O inside the timed loop, unlike the other benchmarks
                Console.WriteLine(c);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            /* start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                list.RemoveAt(t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start)); */

            Console.WriteLine("Dictionary<int, int>.Remove");
            start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                dictionary.Remove(t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));

            Console.WriteLine("BinaryDictionary<int, int>.Remove");
            start = stopwatch.Elapsed.TotalSeconds;
            for (int t = 0; t < count; ++t)
            {
                binaryDictionary.Remove(t);
            }
            Console.WriteLine(Handy.LoopString(count, stopwatch.Elapsed.TotalSeconds - start));
            Console.ReadKey(true);
        }
    }

    public static class Handy
    {
        public static String LoopString(Int32 loops, Double seconds)
        {
            return "Loops: " + loops + ", Seconds: " + seconds + ", Loops per Second: " + (loops / seconds);
        }

        public static String LoopString(Double seconds, Int32 loops)
        {
            return "Loops: " + loops + ", Seconds: " + seconds + ", Seconds per Loop: " + (seconds / loops);
        }
    }
}

The data structure described here is known as a "bitwise trie".

Quote:
Original post by Rycross
No-one has also pointed out that a binary tree can have worst-case performance of O(n) for pretty much all its operations if it becomes unbalanced. To even maintain O(log n) performance for things like add/remove, you'd have to use a self-balancing structure, such as a red-black tree.
That's assuming a comparison-based structure. Because this algorithm assumes a key range, it can break that bound (just like radix sort).

I took a quick stab at writing a new, working enumerator. Then this failed:

Debug.Assert( Current.Parent == null || Current.Parent.Left == Current || Current.Parent.Right == Current );

Worried about the potentially lethal headache that would ensue from trying to fix this code, I decided to move on. The concept is sound, though.

...Key searching is possible (to counter my previous thought that it wasn't). With how fried my brain was (six and a half hours of sleep tonight, gah!), I didn't realize that a key search will naturally stop when it finds a matching key.

Quote:
Original post by Sneftel
The data structure described here is known as a "bitwise trie".


Right, I read through the description too fast and assumed it was a binary tree, rather than a trie.

Quote:
Original post by Rycross
Quote:
Original post by Sneftel
The data structure described here is known as a "bitwise trie".


Right, I read through the description too fast and assumed it was a binary tree, rather than a trie.

It is binary; it uses an integer hash code to determine left or right. There are similarities, though.

And thanks, Sneftel. That bitwise trie solves a problem I've wondered about: quick fuzzy search.

Quote:
Original post by Narf the Mouse
The problem with a standard dictionary, such as the one in .Net, though, is that it uses arrays - and arrays are very slow when deleting objects.


You are confused.

What's slow is removing an element from the middle of an array, and shifting everything up to cover the space without leaving any gaps.

When we use an array to implement a dictionary, each element of the array represents a possible result of hashing the key and taking a modulus. We leave gaps all over the place when we do this, by design. We never move anything around when we delete something, because the other elements that haven't been deleted still have the same hash values. Things get moved around when there are hash collisions and/or the table is expanded to accommodate more elements.
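
A sketch of that indexing scheme (Slot, buckets and Entry are hypothetical; real implementations also handle collisions by chaining or probing):

// Hash-table indexing: hash the key, take a modulus, and live with
// the gaps. Deleting just empties a slot; nothing is shifted.
static int Slot(int hash, int capacity)
{
    return (hash & 0x7FFFFFFF) % capacity;   // mask the sign bit first
}

// Usage sketch:
//     buckets[Slot(key.GetHashCode(), buckets.Length)] = default(Entry);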

A dictionary can also be implemented as a binary search tree, yes; but in this case, there is no real reason to bother with the hashing step at all; we just compare the elements directly, and build (and rebalance) the tree based on the results of those comparisons.

