Making your own keys for a hash table.

Started by
6 comments, last by Oxyacetylene 18 years, 9 months ago
I have no idea how a hash table genrates its keys. What I want to do is make parser and put symobls into a symbol table. I understad a hash would be good for this. The problem is collisions. So what if i assigned a number to each symbol and use that as its hash key? Symbol 1 Symbol 2 Symbol 3 Symbol 4 Basically number them in the order encounterd. Use the order number as the hash ID? Would this work? Or what would be a good data strutue to store symbols by id number or symbol length. Thanks.
Advertisement
Quote:Original post by ThoughtCriminal
I have no idea how a hash table genrates its keys. What I want to do is make parser and put symobls into a symbol table. I understad a hash would be good for this. The problem is collisions. So what if i assigned a number to each symbol and use that as its hash key?

Symbol 1
Symbol 2
Symbol 3
Symbol 4

Basically number them in the order encounterd. Use the order number as the hash ID?

Would this work?

Or what would be a good data strutue to store symbols by id number or symbol length.

Thanks.


If you want to store it by ID number and you don't mind using a sequential ID (ie 0 to N) then a good old vector is by far the best choice. Just use the index of the vector to represent the element.

However, if you remove/add a lot of elements, you might want to look into hashtables, like you mentionned. Anything you fancy can be used as an hash
Keep in mind that unless you really want to implement it yourself, pretty much every modern language provides hashmaps/hashtables facilities in their standard library. Should you desire to implement your own hash table and hash function, google will be your best tool.
I teleported home one night; With Ron and Sid and Meg; Ron stole Meggie's heart away; And I got Sydney's leg. <> I'm blogging, emo style
Well the parser would be for an assembler. You dont really remove alot in that case.

The development langguage will also be assembler.

Is a vector anythin like a linked list?

Or is there another data stucture that descibes a vector?

Thanks.

Wait put it this way I could use my own IDs as a hash value?

If that is the case a hash libary with C linkage should suffice.
A vector is basically C++ian for "array that resizes itself for me". I would strongly suggest you look up some books/tutorials on hashtables and other data structures, and if you're writing an assembler, some more on assemblers (and maybe compilers) while you're at it.

As for a symbol table, as xMcBaiNx said, just about every language out there should have a hashtable implementation. If you're writing it yourself and don't have to remove anything, a simple binary tree may actually work pretty well. If you don't really care about performance, and you shouldn't (yet), just use an array and do it in such a way it's easy to replace later.

You said you're writing in assembler; I suggest you don't, unless you really have to or really want to.

Anyway, good luck!
-----http://alopex.liLet's Program: http://youtube.com/user/icefox192
An Introduction to Hash Tables with Chaining might be a useful read for you.
Google for perfect hashing..
Quote:Original post by ThoughtCriminal
Well the parser would be for an assembler. You dont really remove alot in that case.


Using a hash table may not be the best approach. Unless you intend on removing symbols during the assembly process you might get better results using a binary tree instead. Take a look at red/black binary trees or splay trees.


Quote:The development langguage will also be assembler.


Writing any application in assembler is painful. I wish you luck!



This is slightly off topic, but I thought it might be useful.

If you want to generate a reasonably unique hash for a string, you might want to look into the CRC32 algorithm. The boost library has an implementation of it.

I use it in my own projects for resource management. I use the CRC32 algorithm to generate a 32bit integer hash from the filename, and use that to look up the resource.

This topic is closed to new replies.

Advertisement