views:

604

answers:

13

This is a derivative question, but I'm inquiring as to the data structures that you should at least be familiar with for their usefulness. These structures are too hard to implement without some expertise however.

I would say a good boundary between the two is a heap -- you should be able to code a heap, but it would take you a day. Not appropriate for this would be a BST, etc. Edit: I see the point that it depends on what you are doing. I think it would be awesome to have a list with a phrase summarizing why you use it!

Here's a list to start:

  1. B+ trees: good general indexing structure on a single key
  2. K-d tree: spatial data
  3. Red-black tree: self-balancing BST; also AVL or splay tree
  4. Skip list: good hybrid structure for either random or (pseudo)sequential access
  5. Trie: linear time string search
+2  A: 

That is a good start; there is a comprehensive list of data structures on wikipedia, some of them should be examined. But as to which ones you need, that depends on the area you intend to... do whatever it is that you are doing.

Embedded systems guys will have very different ideas from web guys who will strongly disagree with the business logic guys. Figure out what you want to do; languages and platform will also effect the list you need.

andy K
+3  A: 

What about:

CMS
Disjoint set is really easy to implement though. It's just hard to analyze.
Novelocrat
A: 

I would add Hash Tables to the list. They are pretty simple in concept, but can be complicated once you look at how to implement a good hashing function and efficient probing methods.

MahlerFive
A: 

To quote Martin Kay:

Suffix trees constitute a well understood, extremely elegant, but regrettably poorly appreciated data structure with potentially many applications (...)

See also: What are the lesser known but cool data structures?

Fabian Steeg
+8  A: 

Bloom filters

twk
A: 

R-Tree

Hank Gay
A: 

Finger tree

Hank Gay
A: 

Closely related to the B+ tree you mentioned: B*-tree. Along with a balancing approach known as the "dancing tree" approach, these form the basis of Reiser4.

Hank Gay
A: 

Binary Decision Diagrams, specifically Reduced Order Binary Decision Diagrams (ROBDD). These get reinvented (poorly) a lot when someone decides to create their own filtering system.

Hank Gay
A: 

van Emde Boas trees. I don't literally think that you "should" have heard of them, but I do believe they're an interesting example of what kind of complexity you can achieve with "bit tricks" --- namely O(log log n), exponentially better than binary trees!

A. Rex
A: 

Cuckoo hashing, a simple and elegant way of resolving hash-table collisions in expected constant time.

A. Rex
+1  A: 

Deterministic finite automata (DFAs), or finite state machines, useful for expressing many things, such as basic lexers, regular expressions, state transitions, etc. See also the related directed acyclic word graphs, which can be useful for storing dictionaries compactly.

A. Rex
A: 

Finger trees

Casebash