views:

283

answers:

5

Please mention time complexity and best data structure to store these values, when values are:

  1. Integers
  2. Strings (dictionary like sorting)

I know Counting sort is preferred when integers are in a small range.

Thanks.

Edit: Sorry, I asked a bit different question. Actual question is what would be the best data structure to store these values, if the integers are phone numbers (and strings are names) and then find the best sorting algorithm.

+2  A: 

Sorting algorithms wiki link: Sorting Algorithm Wiki

Merge sort and quick sort are pretty good, they are n log n in best cases.

Holystream
@Holystream: I think I didn't ask my question correctly. Focus was on storing the values.
understack
+1  A: 

Have a look at: Btrees and red-black trees.

You should be able to find open source implementations of each of these. (Note, I'm assuming that you want to maintain a sorted structure, rather than just sorting once and forgetting.)

Rob Lachlan
+1  A: 

How about a heap? Relatively easy to implement and pretty fast. For strings, you could use a Trie along with something like Burst sort which is supposedly the fastest string sorting algorithm in its class.

CD Sanchez
A: 

For most sorting algorithms there is an in-place version, so a simple array may be sufficient. For Strings you may consider a http://en.wikipedia.org/wiki/Trie, which could save space. The right sorting algorithm depends on a lot of factors, e.g. if the results may be already sorted or partially sorted. Of course if you have just a few different values, Countingsort, Bucketsort etc can be used.

Landei
A: 

On a 32-bit machine, a million integers can fit in an array of 4 million bytes. 4MB isn't all that much; it'll fit in this system's memory 500 times over (and it's not that beefy by modern standards). A million strings will be the same size, except for the storage space for those strings; for short strings it's still no problem, so slurp it all in. You can even have an array of pointers to structures holding an integer and a reference to a string; it will all fit just fine. It's only when you're dealing with much more data than that (e.g., a billion items) that you need to take special measures, data-structure-wise.

For sorting that many things, choose an algorithm that is O(nlogn) instead of one that is O(n2). The O(n) algorithms are only useful when you've got particularly compact key spaces, which is pretty rare in practice. Choosing which algorithm from the set that are in O(nlogn) is a matter of balancing speed and other good properties such as stability.

If you're doing this for real, use a database with appropriate indices instead of futzing around with doing this all by hand.

Donal Fellows