Hey,

I'm looking to write a custom hash table implementation in C. Is there an MD5/SHA-1 hash function already in the GNU C library (glibc), or do I have to use an external library for this?

Here's kinda what I'm looking for:

int hashValue;

hashValue = MD5_HASH(valToHash);
+2  A: 

There are a few trusted, simple implementations available -- I include a few in the sources of the digest package for R. Here is what I wrote in its DESCRIPTION file:

Description: The digest package provides functions for the creation of `hash' digests of arbitrary R objects using the md5, sha-1, sha-256 and crc32 algorithms permitting easy comparison of R language objects. The md5 algorithm by Ron Rivest is specified in RFC 1321, the SHA-1 and SHA-256 algorithms are specified in FIPS-180-1 and FIPS-180-2, and the crc32 algorithm is described in
ftp://ftp.rocksoft.com/cliens/rocksoft/papers/crc_v3.txt. For md5, sha-1 and sha-256, this package uses small standalone implementations that were provided by Christophe Devine. For crc32, code from the zlib library is used.

I think some of Christophe's code is no longer at cr0.net, but searches should lead you to several other projects incorporating it. His file headers were pretty clear:

/*                                                   
 * FIPS-180-1 compliant SHA-1 implementation,   
 * by Christophe Devine <[email protected]>;   
 * this program is licensed under the GPL.  
 */     

and his code matches the reference output.

Dirk Eddelbuettel
+2  A: 

Unless you already have a good reason for using MD5, you may want to reconsider. What makes for a "good" hash function in a hash table is pretty dependent on what you're trying to accomplish. You may want to read the comments in Python's dictobject.c to see the sorts of tradeoffs others have made.

Hank Gay
A: 

Glibc's crypt() uses an MD5-based algorithm if the salt starts with $1$. But since you mention that you are going to write a hash table implementation, maybe a Jenkins hash would be more appropriate.
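For illustration, here is a minimal sketch of one of Bob Jenkins's simpler functions, usually called "one-at-a-time". It is an assumption that this is the variant meant above; the function name one_at_a_time and its signature are chosen here for the example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Jenkins one-at-a-time hash: mixes each input byte into a 32-bit state,
 * then applies a final avalanche step. Not suitable for cryptographic use.
 * The name and signature are illustrative, not from an existing library. */
uint32_t one_at_a_time(const unsigned char *key, size_t len)
{
    uint32_t h = 0;

    assert(key != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        h += key[i];
        h += h << 10;
        h ^= h >> 6;
    }

    /* Final mixing so that the last bytes affect all output bits. */
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;

    return h;
}
```

To pick a bucket you would then reduce the result, for example with one_at_a_time(key, len) % table_size.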

ninjalj
A: 

The OpenSSL library has all the crypto routines you could ever want, including cryptographic hashes.

Chris
+2  A: 

For a hash table, you do not need cryptographic strength, only good randomization properties. Broken cryptographic hash functions (like MD5) are fine for that, but you may want to use MD4, which is both faster and simpler, to the point that you could simply include an implementation directly in your code. It is not difficult to rewrite it from the specification (and since you want only a function for a hash table, it is not really a problem if you get it wrong at some point). Shameless plug: there is an optimized C implementation of MD4 in sphlib.
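Whichever digest you pick, the hash table only needs a small integer out of it. A rough sketch of that last step, assuming you already have the digest bytes from some MD4/MD5 implementation (the helper names digest_prefix32 and digest_to_bucket are made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the first four digest bytes as a big-endian 32-bit value.
 * Any four bytes of a well-mixed digest would serve equally well. */
uint32_t digest_prefix32(const unsigned char *digest, size_t digest_len)
{
    assert(digest != NULL && digest_len >= 4);

    return ((uint32_t)digest[0] << 24) |
           ((uint32_t)digest[1] << 16) |
           ((uint32_t)digest[2] << 8)  |
            (uint32_t)digest[3];
}

/* Map a digest to a bucket index in the range [0, nbuckets). */
size_t digest_to_bucket(const unsigned char *digest, size_t digest_len,
                        size_t nbuckets)
{
    assert(nbuckets > 0);

    return (size_t)(digest_prefix32(digest, digest_len) % nbuckets);
}
```

The remaining 12 bytes of a 16-byte digest are simply thrown away here, which is part of why a full cryptographic hash is more work than a hash table needs.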

Thomas Pornin
+2  A: 

You can take a look at Bob Jenkins's survey and analysis of many hash functions:

Or just drop his lookup3 routines (which he's put into the public domain) into your project:

Michael Burr