Which is faster: finding an item in a hashtable or in a sorted list?

A: 

Your comparison doesn't make sense, as the access mechanisms are incompatible: indexes vs. keys.

mP
Based on the question, it seems safe to assume the key is some property or easily computable function of the value.
Hank Gay
Keys are useless when working with lists; you have ignored how lists are accessed.
mP
He is comparing times, so the comparison does have meaning.
darko petreski
+6  A: 

Unless the hashing algorithm is extremely slow (and/or bad), the hashtable will be faster.

UPDATE: As commenters have pointed out, you could also be getting degraded performance from too many collisions not because your hash algorithm is bad but simply because the hashtable isn't big enough. Most library implementations (at least in high-level languages) will automatically grow your hashtable behind the scenes—which will cause slower-than-expected performance on the insert that triggers the growth—but if you're rolling your own, it's definitely something to consider.
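As an illustration of the collision point (a Python sketch of my own, not from the answer), here's how a degenerate hash function turns the table's O(1) lookups into linear scans of one big collision chain:

    import timeit

    class GoodKey:
        """Key with a well-distributed hash."""
        def __init__(self, value):
            self.value = value
        def __hash__(self):
            return hash(self.value)
        def __eq__(self, other):
            return self.value == other.value

    class BadKey(GoodKey):
        """Key whose hash is constant, so every entry collides."""
        def __hash__(self):
            return 42

    for cls in (GoodKey, BadKey):
        table = {cls(i): i for i in range(1000)}
        probe = cls(999)
        t = timeit.timeit(lambda: table[probe], number=1000)
        print(cls.__name__, f"{t:.4f}s")

With BadKey, every lookup has to walk the single bucket comparing keys one by one, so the "hashtable" behaves like an unsorted list.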

Hank Gay
Also the table should be large enough.
Liran Orevi
Yes! Very important - if your hashtable is getting lots of collisions either due to a bad hash algorithm or a lack of space, then its performance will noticeably degrade!
+8  A: 

The fastest way to find an element in a sorted list is binary search, O(log N), while a hashtable without collisions has a find complexity of O(1).
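For concreteness, a minimal Python sketch of the two lookups (my example, assuming the list is already sorted):

    import bisect

    sorted_list = [2, 3, 5, 7, 11, 13, 17, 19]
    hash_table = {v: True for v in sorted_list}

    def contains_sorted(items, target):
        """Binary search: O(log N) comparisons on a sorted list."""
        i = bisect.bisect_left(items, target)
        return i < len(items) and items[i] == target

    print(contains_sorted(sorted_list, 11))  # True, at most ~log2(8) = 3 probes
    print(11 in hash_table)                  # True, one hash plus a bucket probe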

xtofl
+1  A: 

It depends on the size of the collection (and to a lesser degree, on implementation details). If your list is very small, maybe 5-10 items, I'd guess the list would be faster. Otherwise xtofl has it right.

Jan
+13  A: 

Algorithmic complexity is a good thing to know: hashtables are known to be O(1), while a sorted vector (in your case, I'd guess a sorted array is better than a list) provides O(log n) access time.

But keep in mind that complexity notation describes the access time as N goes to infinity. It means that if you know your data will keep growing, complexity notation gives you a good hint about which algorithm to choose.

When you know your data will stay rather small, for instance only a few entries in your array/hashtable, you must take out your stopwatch and measure. So write a test.

For instance, take another problem: sorting an array. For a few entries, bubble sort, although O(N^2), may be quicker than quicksort, which is O(N log N).
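A quick pure-Python sketch of that effect (my own illustration, with both sorts written in Python so the constant factors are comparable; exact numbers are machine-dependent):

    import timeit

    def bubble_sort(a):
        """O(N^2), but tiny constant overhead for very small inputs."""
        a = a[:]
        for i in range(len(a)):
            for j in range(len(a) - 1 - i):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
        return a

    def quick_sort(a):
        """O(N log N) on average, but recursion and list building cost more per call."""
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        return (quick_sort([x for x in rest if x < pivot]) + [pivot]
                + quick_sort([x for x in rest if x >= pivot]))

    small = [5, 2, 9, 1]
    for label, fn in (("bubble", bubble_sort), ("quick", quick_sort)):
        print(label, timeit.timeit(lambda: fn(small), number=100_000))

On a four-element input, bubble sort's tight loops usually beat quicksort's recursion and list building, despite the worse asymptotic complexity.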

Also, as other answers point out, and depending on your item type, you must try to find the best hash function for your hashtable instance. Otherwise it may lead to dramatically bad lookup performance in your hashtable (as pointed out in Hank Gay's answer).

Edit: Have a look at this article to understand the meaning of Big O notation.

yves Baumes
A: 

A HashTable would be more efficient for a list containing more than about 10 items. If the list has fewer than 10 items, the overhead of the hashing algorithm will outweigh its benefit.

If you need a fast dictionary but also need to keep the items in an ordered fashion, use OrderedDictionary (.NET 2.0 onwards).
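The answer is .NET-specific; as a rough cross-language analogue (my assumption, not part of the answer), Python's collections.OrderedDict likewise combines hash-based O(1) lookup with a remembered (insertion) order:

    from collections import OrderedDict

    od = OrderedDict()
    od["banana"] = 3
    od["apple"] = 1
    od["cherry"] = 2

    print(od["apple"])  # O(1) lookup by key
    print(list(od))     # ['banana', 'apple', 'cherry'] -- insertion order kept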

Rashmi Pandit
+1  A: 

The get operation in a SortedList is O(log n), while the same operation in a HashTable is O(1). So, normally, the HashTable would be much faster. But this depends on a number of factors:

  • The size of the list
  • Performance of the hashing algorithm
  • Number of collisions / quality of the hashing algorithm
bruno conde
+2  A: 

It depends entirely on the amount of data you have stored.

Assuming you have enough memory to throw at it (so the hash table is big enough), the hash table will locate the target data in a fixed amount of time, but the need to calculate the hash will add some (also fixed) overhead.

Searching a sorted list won't have that hashing overhead, but the time required to do the work of actually locating the target data will increase as the list grows.

So a sorted list will generally be faster for small data sets. (For extremely small data sets which are frequently changed and/or infrequently searched, an unsorted list may be even faster, since it avoids the overhead of sorting.) As the data set becomes large, the growth of the list's search time overshadows the fixed overhead of hashing, and the hash table becomes faster.

Where that breakpoint is will vary depending on your specific hash table and sorted-list-search implementations. Run tests and benchmark performance on a number of typically-sized data sets to see which will actually perform better in your particular case. (Or, if the code already runs "fast enough", don't. Just use whichever you're more comfortable with and don't worry about optimizing something which doesn't need to be optimized.)
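A benchmark sketch along those lines in Python (my own illustration; the breakpoint it suggests will vary with machine and implementation):

    import bisect
    import timeit

    def bench(n, number=10_000):
        keys = list(range(n))
        table = dict.fromkeys(keys)
        target = n - 1  # near-worst case for the list search
        t_list = timeit.timeit(
            lambda: keys[bisect.bisect_left(keys, target)] == target,
            number=number)
        t_hash = timeit.timeit(lambda: target in table, number=number)
        return t_list, t_hash

    for n in (4, 16, 256, 4096, 65536):
        t_list, t_hash = bench(n)
        print(f"n={n:6d}  sorted list: {t_list:.4f}s  hashtable: {t_hash:.4f}s")

The hash lookup time stays roughly flat across sizes, while the binary search time grows with log n, so the comparison at each size shows which side of the breakpoint you're on.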

Dave Sherohman