Hash Table: Linked List Implementation

Amiay Narayan
6 min readApr 27, 2021

Recently I started revisting my Data Structure and Algorithms basics. And the first topic that you generally encounter is …? Arrays and Linked Lists( yes you guessed it right!). One of the better examples of usage of Arrays is quite honestly presented by the implementation of Hash Tables using ‘an array of linked list’. In this short blog I will be implementing a Hash Table in python using an array of linked lists.

I will be implementing my own version of node, which will be then used to create a Linked List class and then we will move from there to create our own Implementation of a Hash Table. I have tried to keep this these implementation as minimalistic as possible so that everyone can follow along. My main aim here is to demonstrate a basic working of a Hash Table. However, I would recommend you to brush your basics of Linked List(insertion) first.

So let the fun begin!

# Class(blueprint) for node 
class node():
def __init__(self, key, value, next_node=None):
self.key = key
self.value = value
self.next_node = next_node

def set_next(self, next_node):
""" sets the pointer to which current point"""
self.next_node = next_node
def __str__(self):
"""Defines how print() will behave"""
return f'{self.key, self.value}'

Hash Tables stores a key-value pair. And one can lookup for the value with the help of the key in O(1), so it is intended. This is the reason we are storing key:value pair in the node object. This will get more clear once we get to the hash table implementation.

A node object pointing to null

Next we start implementing the Linked List using the node we just defined.

class Linked_list():

def __init__(self):
self.head = None
self.tail = None
self.size = 0

def insert(self, key, value):
if not (key and value):
print('Key or value cannot be None Type')
return
new_node = node(key, value)
if self.size == 0:
self.head = new_node
self.tail = new_node
self.size = 1
print("First item inserted")
return
self.head.next_node = new_node
self.head = new_node
self.size += 1
print("Item inserted successfully")
return

def peek(self):
print(self.head)
Linked List with 3 nodes

Since I promised to keep things minimal, therefore our linked list class has only one method: insert - to insert into linked list and eventually into the hash table. There you go! We are done with the preparation. Let us now get straight into the meat of the matter. The code below is the blueprint of the HashTable() class.

class HashTable():
def __init__(self):
self.size = 6
self.ll_array = [Linked_list() for _ in range(self.size)]
print("An empty hash table created")

def _show_sizes(self):
for l in self.ll_array:
print(l.size)

def insert(self, key, value):
# method to insert into the Hash Table
pass

def look_up(self, key):
# looks up for key in the Hash Table
pass

def hash(self, key):
# generates has for a given key
"""
>>>'amiay'.__hash__()
5169131434226628243
"""
return key.__hash__()

The snippet above lays out the tasks which we want to achieve, namely insertion (of a key-value pair) and the lookup (using a key). This is good time to describe how exactly does the hash table in our implementation work. A “hash” table works by hashing the key using some desired hash function. Suppose, we wanted to store {‘amiay’: ‘captain’} as our first entry into the hash table. Therefore we have the key as — ‘amiay’. This ‘amiay’ string will be fed into some desired and proven hashing function to genereate some hash code. In python, __hash__() comes out of the box which can hash a string and produces a hash code. For example, in the snippet above, docstring shows that the hash code for ‘amiay’. Also, as described in the __init__() method, we have an array of Linked_list objects(array size : 6, could be anything you want). Why Linked list? Why not a simple array of nodes? Good question. This is because of something called ‘collisions’. After hash code generation, the code is then mapped to a particular index in the array. There could be multiple ways of doing that. A simple way is just to take modulo(%) with the capacity of the array that we are defining in the __init_()[self.size].There could be two reasons for a collision. First, two different strings might end up having the exact same hash(though today’s hash functions are reliabe, however it is not impossible for collisions to occur), which eventually leads to the same index. Second, two different hash codes can also end up mapping to the same index in the array. For example,say, hash1 = 1001, hash2 = 51 and self.size in HashTable is ‘50’ , then for both of these hashes the mapped index will be 1. Therefore, there is indeed a requirement for handling these collisions.

Enter Linked Lists… In case of a collision, we will just insert another node at the end of the linked list present at that index(we will call the insert method on the linked at the index). Problem solved!? Not quite yet. Don’t forget we also have to carry out the look up. Hang on, we are almost there. So whenever we encounter a key that we want to lookup in the hash table. We first hash it using the same hashing function, generate the index (say, by taking modulo with self.size of hash table). Now if the size of the linked list at that index is 0, 1 or >1. If size is 0, then the key is invalid. If size is 0 we return the first node. If the size is greater than one, we have to perform a linear search on the keys that are stored on that particular linked list. Therefore, if we had a really bad hashing function that maps every key to the same index, then the size of this linked list at this index will be n if there were n inputs. And hence, it would take O(n) for search for a key implying an O(n) time comlexity. A pictorial representation is below:

Hash Table handling collisions

And this is it! We did it. We have our own simple and straight forward implementation of a hash table, where we can add and look up key-value pairs. Below is the complete implementation of what we just came up with.

class HashTable():
def __init__(self):
self.size = 6 # setting the size is desgin decision
self.ll_array = [Linked_list() for _ in range(self.size)]
print("An empty hash table created")

def _show_sizes(self):
""" Prints the state of linked list array"""
for l in self.ll_array:
print(l.size)

def insert(self, key, value):
key_hash = self.hash(key)
index = key_hash % self.size
print(f'{key} kept at index {index}')
self.insert_on_arr(key, value, index)
#self._show_sizes()
# uncomment to see the working of the linked list

def insert_on_arr(self, key, value, index):
self.ll_array[index].insert(key, value)

def look_up(self, key):
# find hash of key
key_hash = self.hash(key)
index = key_hash % self.size
#print(f'{key} hashes at index {index}')
ll_size = self.ll_array[index].size
print(ll_size, ' is the size')
if ll_size == 0:
print(f'{key} has not yet been inserted')
return
elif ll_size == 1:
if self.ll_array[index].head.key == key:
print(self.ll_array[index].peek())
return
else:
print(f"There was a collision, the hash for '{self.ll_array[index].head.key}' and '{key}' are the same")
return
else:
traverser = self.ll_array[index].tail
# linear search for the key => O(n)
while traverser:
if traverser.key == key:
print(traverser.value)
return
traverser = traverser.next_node

print(f"There was a collision, the hash for '{self.ll_array[index].head.key}' and '{key}' are the same")


def hash(self, key):
"""
>>>'amiay'.__hash__()
5169131434226628243
"""
return key.__hash__()

Thank you will reading, this far. This is my first blog ever. Hope it was clear. Please do post your feedback and thoughts.

Resources:

All the diagram made using Inkscape

The GitHub repo for the code.

What’s a Linked List, Anyway? — Vaidehi Joshi

Hash function — Wikipedia

--

--