views:

265

answers:

7

I have a program that read urls in a file and does a gethostbyname() on each URL host. This call is quite consuming. I want to cache them.

Is there a very simple map-base code snippet in C out there that I could use to do the caching? (I just don't want to reinvent the wheel).

It has to have the following points :

  • Open-source with a permissive license (think BSD or public domain).
  • Very simple : ideally less than 100 LOC
  • Keys are char* and values void*. No need to copy them.
  • No real need to implement remove(), but contains() is either needed or put() should replace the value.

PS: I tagged it homework, since it could be. I'm just being very lazy and do want to avoid all the common pitfalls I could encounter while reimplementing.

+1  A: 

memcached?

Not a code snippet, but a high performance distributed caching engine.

Justin Niessner
-1: I do want to avoid a syscall (`gethostbyname()`), so I don't really think that memcached fits the bill here.
Steve Schnepp
+1  A: 

Not lazy, deeply sensible to avoid writing this stuff.

How's this library never used it myself but it seems to claim to do what you ask for.

djna
The library seems interesting, but the last update to the website was 2005. It would be OK for a few code lines, but a little too old for a full blown library.
Steve Schnepp
Well, fundamental algorithms well implemented should not become dated. I would have no concern about using 4 year old libraries of this kind - assuming that they actually worked in the first place. If you ahve the code, then maintenance should not be too much of an issue.
djna
+5  A: 

A Google search reveals http://www.cl.cam.ac.uk/~cwc22/hashtable/. The code is very simple. It is more than 100 lines, but not by much.

Sinan Ünür
+1: Seems to answer the question, I'll have a look at it.
Steve Schnepp
+3  A: 

std::map in C++ is a red-black tree under the hood; what about using an existing red-black tree implementation in C? The one I linked is more like 700 LOC, but it's pretty well commented and looks sane from the cursory glance I took at it. You can probably find others; this one was the first hit on Google for "C red-black tree".

If you're not picky about performance you could also use an unbalanced binary tree or a min-heap or something like that. With a balanced binary tree, you're guaranteed O(log n) lookup; with an unbalanced tree the worst case for lookup is O(n) (for the pathological case where nodes are inserted in-order, so you end up with one really long branch that acts like a linked-list), but (if my rusty memory is correct) the average case is still O(log n).

Meredith L. Patterson
+1: Seems to answer the question, I'll have a look at it.
Steve Schnepp
A: 

Found an implementation here : c file and h file that's fairly close to what you asked. W3C license

zebrabox
+2  A: 

Here's a very simple and naive one

  • Fixed bucket size
  • No delete operation
  • inserts replaces the key and value, and can optionally free them

:

#include <string.h>
#include <stdlib.h>

#define NR_BUCKETS 1024

struct StrHashNode {
    char *key;
    void *value;
    struct StrHashNode *next;

};

struct StrHashTable {
    struct StrHashNode *buckets[NR_BUCKETS];
    void (*free_key)(char *);
    void (*free_value)(void*);
    unsigned int (*hash)(const char *key);
    int (*cmp)(const char *first,const char *second);
};

void *get(struct StrHashTable *table,const char *key)
{
    unsigned int bucket = table->hash(key)%NR_BUCKETS;
    struct StrHashNode *node;
    node = table->buckets[bucket];
    while(node) {
        if(table->cmp(key,node->key) == 0)
            return node->value;
        node = node->next;
    }
    return NULL;
}
int insert(struct StrHashTable *table,char *key,void *value)
{
    unsigned int bucket = table->hash(key)%NR_BUCKETS;
    struct StrHashNode **tmp;
    struct StrHashNode *node ;

    tmp = &table->buckets[bucket];
    while(*tmp) {
        if(table->cmp(key,(*tmp)->key) == 0)
            break;
        tmp = &(*tmp)->next;
    }
    if(*tmp) {
        if(table->free_key != NULL)
            table->free_key((*tmp)->key);
        if(table->free_value != NULL)
            table->free_value((*tmp)->value);
        node = *tmp;
    } else {
        node = malloc(sizeof *node);
        if(node == NULL)
            return -1;
        node->next = NULL;
        *tmp = node;
    }
    node->key = key;
    node->value = value;

    return 0;
}

unsigned int foo_strhash(const char *str)
{
    unsigned int hash = 0;
    for(; *str; str++)
        hash = 31*hash + *str;
    return hash;
}

#include <stdio.h>
int main(int argc,char *argv[])
{
    struct StrHashTable tbl = {{0},NULL,NULL,foo_strhash,strcmp};

    insert(&tbl,"Test","TestValue");
    insert(&tbl,"Test2","TestValue2");
    puts(get(&tbl,"Test"));
    insert(&tbl,"Test","TestValueReplaced");
    puts(get(&tbl,"Test"));

    return 0;
}
nos
Steve Schnepp
+1  A: 

Dave Hanson's C Interfaces and Implementations includes a nice hash table, as well as many other useful modules. The hash table clocks in at 150 lines, but that's including memory management, a higher-order mapping function, and conversion to array. The software is free, and the book is worth buying.

Norman Ramsey