This post shows how to implement a simple hash table of arbitrary length,
allowing to store all values c knows and doing so while being as minimal as
possible. It does however not include collision handling, to implement this,
simply swap the Map.buckets
array with an array of linked list and insert
into the linked lists instead of only into the bucket array and you should be
good to go.
1#include <assert.h>
2
3typedef struct Map { size_t size; size_t cap; void **buckets; } Map;
4const size_t BASE = 0x811c9dc5;
5const size_t PRIME = 0x01000193;
6size_t hash(Map *m, char *str) {
7 size_t initial = BASE;
8 while(*str) {
9 initial ^= *str++;
10 initial *= PRIME;
11 }
12 return initial & (m->cap - 1);
13}
14
15Map init(size_t cap) {
16 Map m = {0,cap};
17 m.buckets = malloc(sizeof(void*)*m.cap);
18 assert(m.buckets != NULL);
19 return m;
20}
21
22void put(Map *m, char *str, void *value) {
23 m->size++;
24 m->buckets[hash(m, str)] = value;
25}
26
27void* get(Map *m, char *str) {
28 return m->buckets[hash(m, str)];
29}
Hashing
For the hashing I decided to go with
FNV1a,
simply because its easy to implement and very fast. We are only using strings
as keys, thus we could use the java way of hashing
strings.
The idea is to start with an initial value (BASE
) and assign this to the
xored value of itself and the data (current character). This is then assigned
to its multiplication with the PRIME
.
The last line of the hash
-function includes an optimisation for quicker
modulus computation, it is equivalent to initial % m->cap
but is a lot
faster. It does however only work for the cap being powers of two.
Storing and accessing everything
c allows for storage of all and any values via void*
, I abuse this to only
store the pointers to values, thus allowing the values to be of any kind, the
only downside being, the user of the map has to allocate, ref, deref and cast
the values to the expected types.
1#include <stdio.h>
2#include <stdlib.h>
3
4// ... Map implementation
5
6int main(void) {
7 Map m = init(1024);
8 double d1 = 25.0;
9 double d2 = 50.0;
10 put(&m, "key1", (void*)&d1);
11 put(&m, "key2", (void*)&d2);
12
13 printf("key1=%f;key2=%f\n", *(double*)get(&m, "key1"), *(double*)get(&m, "key2"));
14
15 free(m.buckets);
16 return EXIT_SUCCESS;
17}
This showcases the casting and the ref + deref necessary to interact with the hash table and of course the user has to free the memory used for the table buckets.