Is there any speed- and cache-efficient implementations of trie in C/C++? I know what a trie is, but I don't want reinvent the wheel, implementing it myself.
References,
- A Double-Array Trie implementation article (includes a
C
implementation) - TRASH - A dynamic LC-trie and hash data structure -- (a 2006 PDF reference describing a dynamic LC-trie used in the Linux kernel to implement address lookup in the IP routing table
if you are looking for an ANSI C implementation you can "steal" it from FreeBSD. The file you are looking for is called radix.c. It's used for managing routing data in kernel.
Cache optimizations are something you'll probably are going to have to do, because you'll have to fit the data into a single cacheline which generally is something like 64 bytes (which will probably work if you start combining data, such as pointers). But it's tricky :-)
Judy arrays: Very fast and memory efficient ordered sparse dynamic arrays for bits, integers and strings. Judy arrays are faster and more memory efficient than any binary-search-tree (incl. avl & red-black-trees).
I realize the question was about ready implementations, but for reference...
Before you jump on Judy you should have read "A Performance Comparison of Judy to Hash Tables". Then googling the title will likely give you a lifetime of discussion and rebutals to read.
The one explicitly cache-conscious trie I know of is the HAT-trie.
These are both (in my mind) complex data structures. Complexity is bad. If I were after a trie today I'd look for a burst-trie.