HashMap in 25 lines of C

· 438 words · 3 minute read

This post shows how to implement a simple hash table of arbitrary length, allowing to store all values c knows and doing so while being as minimal as possible. It does however not include collision handling, to implement this, simply swap the Map.buckets array with an array of linked list and insert into the linked lists instead of only into the bucket array and you should be good to go.

 1#include <assert.h>
 2
 3typedef struct Map { size_t size; size_t cap; void **buckets; } Map;
 4const size_t BASE = 0x811c9dc5;
 5const size_t PRIME = 0x01000193;
 6size_t hash(Map *m, char *str) {
 7    size_t initial = BASE;
 8    while(*str) {
 9        initial ^= *str++;
10        initial *= PRIME;
11    }
12    return initial & (m->cap - 1);
13}
14
15Map init(size_t cap) {
16    assert(cap%2==0);
17    Map m = {0,cap};
18    m.buckets = malloc(sizeof(void*)*m.cap);
19    assert(m.buckets != NULL);
20    return m;
21}
22
23void put(Map *m, char *str, void *value) {
24    m->size++;
25    m->buckets[hash(m, str)] = value;
26}
27
28void* get(Map *m, char *str) {
29    return m->buckets[hash(m, str)];
30}

Hashing ##

For the hashing I decided to go with FNV1a, simply because its easy to implement and very fast. We are only using strings as keys, thus we could use the java way of hashing strings. The idea is to start with an initial value (BASE) and assign this to the xored value of itself and the data (current character). This is then assigned to its multiplication with the PRIME.

The last line of the hash-function includes an optimisation for quicker modulus computation, it is equivalent to initial % m->cap but is a lot faster. It does however only work for the cap being powers of two, this is why i am asserting this criteria in the init function.

Storing and accessing everything ##

c allows for storage of all and any values via void*, I abuse this to only store the pointers to values, thus allowing the values to be of any kind, the only downside being, the user of the map has to allocate, ref, deref and cast the values to the expected types.

 1#include <stdio.h>
 2#include <stdlib.h>
 3
 4// ... Map implementation
 5
 6int main(void) {
 7    Map m = init(1024);
 8    double d1 = 25.0;
 9    double d2 = 50.0;
10    put(&m, "key1", (void*)&d1);
11    put(&m, "key2", (void*)&d2);
12
13    printf("key1=%f;key2=%f\n", *(double*)get(&m, "key1"), *(double*)get(&m, "key2"));
14
15    free(m.buckets);
16    return EXIT_SUCCESS;
17}

This showcases the casting and the ref + deref necessary to interact with the hash table and of course the user has to free the memory used for the table buckets.