I do not understand the code (Hashtable)

Started by
1 comment, last by Datamancer 9 years, 6 months ago

Hi,

i'm learning from http://learnpythonthehardway.org/book/ex39.html


def new(num_buckets=256):
    """Initializes a Map with the given number of buckets."""
    aMap = []
    for i in range(0, num_buckets):
        aMap.append([])
    return aMap

def hash_key(aMap, key):
    """Given a key this will create a number and then convert it to
    an index for the aMap's buckets."""
    return hash(key) % len(aMap)

def get_bucket(aMap, key):
    """Given a key, find the bucket where it would go."""
    bucket_id = hash_key(aMap, key)
    return aMap[bucket_id]

def get_slot(aMap, key, default=None):
    """
    Returns the index, key, and value of a slot found in a bucket.
    Returns -1, key, and default (None if not set) when not found.
    """
    bucket = get_bucket(aMap, key)

    for i, kv in enumerate(bucket):
        k, v = kv
        if key == k:
            return i, k, v

    return -1, key, default

def get(aMap, key, default=None):
    """Gets the value in a bucket for the given key, or the default."""
    i, k, v = get_slot(aMap, key, default=default)
    return v

def set(aMap, key, value):
    """Sets the key to the value, replacing any existing value."""
    bucket = get_bucket(aMap, key)
    i, k, v = get_slot(aMap, key)

    if i >= 0:
        # the key exists, replace it
        bucket[i] = (key, value)
    else:
        # the key does not, append to create it
        bucket.append((key, value))

def delete(aMap, key):
    """Deletes the given key from the Map."""
    bucket = get_bucket(aMap, key)

    for i in xrange(len(bucket)):
        k, v = bucket[i]
        if key == k:
            del bucket[i]
            break

def list(aMap):
    """Prints out what's in the Map."""
    for bucket in aMap:
        if bucket:
            for k, v in bucket:
                print k, v

i searched for hash function, hash table,hashmap

return hash(key) % len(aMap)

How does it work?

& what he mean with the bucket when i read about hash i think i couldn't understand what he mean with bucket

(Given a key this will create a number and then convert it to

an index for the aMap's buckets)

but i want to know how does it work

i run it and tried many times I understood things, and things I did not understand

& if u don't understand what i mean please tell me smile.png & sry for asking

i will try again ....

Thanks

Advertisement

As far as how each line works, I am not particularly familiar with python.

In this example, it looks like there are 256 "buckets" or slots to put an object or indices in an array available.

So, no matter what object is placed into the hash map the key needs to be in the range of zero to 255.

hash(key) deterministically maps the key object to a numeric value. (The actual logic of the hash function doesn't affect the functionality of the hash map very much but it can affect the performance/efficenty/clash rate)

Then, that numeric representation of the key is placed within the range of the available indicies (0-255 in this case).

For an Example, Assume we want to use a hashmap to store strings.

First we define a hashing function... a function to convert a key to an integer, for simplicity we will just sum the integer values of the ascii codes for the letters forming the string. (This is not the best hash function, but it's simple to explain)

So: hash("Bob") = 66+111+98 = 275 we have an array that has 256 separate memory locations, 275%256=19

So we put the string "Bob" into bucket 19.

The advantage is that using the hash "Bob" will always end up in bucket 19, when we want to retrieve the string "Bob" we don't have to search the array... we simply calculate the hash and do a direct look up to grab the contents of bucket 19.

The disadvantage is that the String "Coa" will also map to bucket 19 (Which is why simply summing the ascii values is a bad hashing function)

You want to find a hash function that maximizes distribution and minimizes collisions.

In a hashtable, a "bucket" is just a fancy name for an index of the array. They are also commonly known as "slots." Other than that "AuthenticOwl" nicely sums up how a hashtable works.

EDIT: And as "AuthenticOwl" mentioned, summing the ASCII values of a string is usually not an ideal approach to hashing as there is a high risk of collision. Also, if you want to understand how the code works, take it line by line.

"The code you write when you learn a new language is shit.
You either already know that and you are wise, or you don’t realize it for many years and you are an idiot. Either way, your learning code is objectively shit." - L. Spiro

"This is called programming. The art of typing shit into an editor/IDE is not programming, it's basically data entry. The part that makes a programmer a programmer is their problem solving skills." - Serapth

"The 'friend' relationship in c++ is the tightest coupling you can give two objects. Friends can reach out and touch your privates." - frob

This topic is closed to new replies.

Advertisement