views: 2428
answers: 21

Before you stone me for being a heretic: there is a sort that proclaims to be O(1), called "Bead Sort" (http://en.wikipedia.org/wiki/Bead_sort). However, that is pure theory; when I actually applied it, I found it was really O(N * M), which is pretty pathetic.

That said, let's list some of the fastest sorts, and their best-case and worst-case speeds in Big O notation.

+7  A: 

Assumptions are everything if you want to get below N log N.

Assumptions:

  1. Data can be mapped 1:1 to an integer between 0 and X
  2. You know what X is (meaning, X is capped)
  3. You are OK with allocating X * sizeof(integer) for the purpose of sorting
  4. The time to go from item to integer and from integer back to item is O(1)
  5. Either all items are unique (appear only once) or you don't mind "losing" duplicate items

Algorithm

  1. Allocate an array of size X
  2. For each input
    1. Translate input into its unique integer representation (0-X)
    2. Save it in array[item's-Int]
  3. For arr(0) -> arr(x): if arr(i) <> NULL -> print arr(i)

Total time is O(N)
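
A minimal sketch of the algorithm above in Python (the function name is just illustrative), assuming the items are themselves the integers, i.e. the identity mapping; any O(1) 1:1 mapping would do:

    def direct_address_sort(items, x):
        # Sort unique integers in [0, x] by direct addressing.
        # O(N) to place the items, plus a final O(X) scan over the slots.
        slots = [None] * (x + 1)    # assumption 3: allocate X slots up front
        for item in items:          # assumption 4: item -> index in O(1)
            slots[item] = item      # assumption 5: duplicates collapse into one slot
        return [v for v in slots if v is not None]

    print(direct_address_sort([5, 1, 9, 3], 9))  # [1, 3, 5, 9]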


Answers to questions:

@Jonathan Holland

How would you find the upper bound for the array size?

Notice that this was one of the assumptions. If you can't, then this approach will not work. That doesn't mean it's impossible, however. Think about the case where the key is a time in seconds, the range of times fits within an integer, and you know there is no chance of more than one item per second.

csmba
Given your assumptions, the sort is O(1), since there is a maximum of X items. Even bogosort is O(1) if you have a hard limit on the number of items.
David Thornley
@David: More specifically, the algorithm is theoretically O(X). The value of N doesn't really matter; what matters is your choice of X. However, on real PC hardware, you probably end up limited by memory bandwidth. You need log(X) bits to store each integer, so the array takes O(X log(X)) memory. So, assuming memory bandwidth is the limit on speed, the algorithm takes O(X log(X)) time. Since X >= N, the algorithm takes at least O(N log(N)) time.
user9876
What's with all the people posting answers to *your* post rather than comments?
Wallacoloo
I'll add another assumption to that list: assume the array is already sorted.
Joe D
+6  A: 

There is a sort that proclaims to be O(1)

It doesn't actually, at least not on a von Neumann architecture. There are several other algorithms that achieve O(n) on positive integer sets (actually, the running times are more complex), e.g. counting sort and radix sort.
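
For illustration, a minimal counting sort sketch for integers in a known, bounded range [0, max_key]; unlike the direct-address scheme above, it keeps duplicates:

    def counting_sort(items, max_key):
        # Counting sort for integers in [0, max_key]: O(n + k) time,
        # where k = max_key + 1 is the size of the key range.
        counts = [0] * (max_key + 1)
        for item in items:
            counts[item] += 1
        out = []
        for key, count in enumerate(counts):
            out.extend([key] * count)   # emit each key as often as it occurred
        return out

    print(counting_sort([3, 1, 3, 0, 2], 3))  # [0, 1, 2, 3, 3]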

Konrad Rudolph
They can't be O(n) if the keys aren't restricted beyond "integers": consider bit-by-bit radix sort with longer and longer keys; you have to do bitlength-of-the-longest-key "rounds" of radix bucketing.
Jonas Kölker
Jonas: absolutely true, which is why I stated that “the running times are more complex.” Just to mention a true Θ(n) algorithm, consider the Dutch flag sort.
Konrad Rudolph
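
A sketch of the Dutch national flag partition mentioned above: one Θ(n) in-place pass sorts an array whose elements take only three distinct values.

    def dutch_flag_sort(a):
        # Dijkstra's Dutch national flag: sort an array of 0s, 1s and 2s in one pass.
        lo, mid, hi = 0, 0, len(a) - 1
        while mid <= hi:
            if a[mid] == 0:                       # 0s go to the front
                a[lo], a[mid] = a[mid], a[lo]
                lo += 1
                mid += 1
            elif a[mid] == 2:                     # 2s go to the back
                a[mid], a[hi] = a[hi], a[mid]
                hi -= 1
            else:                                 # 1s stay in the middle
                mid += 1
        return a

    print(dutch_flag_sort([2, 0, 1, 2, 0, 1]))  # [0, 0, 1, 1, 2, 2]
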
A: 

@csmba:

How would you find the upper bound for the array size? I suppose this would work great if you were always working with sets of 50 whole numbers. Nicely done.

FlySwat
+2  A: 

For parallel computing you can get an O(1) sort using specialized hardware. You can also do O(log n) sorting, but that requires n^2 processors.

Kibbee
Nice joke. I need 40K quad-core processors for my lists, which I sort in 1 ms.
Behrooz
A: 

By best case, you probably meant average case? I kind of like Bead Sort. Theory says it's impossible to do better than n log n with a comparison-based algorithm; Bead Sort is a reminder that there might be alternatives.

By the way, here is a video about Quick Sort (best case: O(n log n), worst case: O(n^2)).
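
To make those cases concrete, a minimal (non-in-place) quicksort sketch; with the first element as the pivot, an already-sorted input degenerates into the O(n^2) worst case:

    def quicksort(a):
        # Naive quicksort with the first element as pivot: O(n log n) on average,
        # O(n^2) when the input is already sorted (every split is maximally lopsided).
        if len(a) <= 1:
            return a
        pivot, rest = a[0], a[1:]
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)

    print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]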

poulejapon
+1  A: 

There IS an O(1)-in-time sort. It's O(n^2) or O(n log n) in processors and O(n * 2^(n^2)) or O(n * 2^(n log n)) in space, and in practice it is limited in the size it can sort (but in practice so are all sorts).

  1. Each CPU grabs 2 numbers from the input, either 1 CPU for every possible pair or some selected subset
  2. The CPUs do a single compare
  3. Each CPU drives one address line of a HUGE memory cache pre-loaded with a data set
  4. The single memory value addressed is a sorting map
  5. N CPUs each grab a single element from the map
  6. Each CPU loads that element from the input
  7. Each CPU saves the loaded element to the output in the correct place

Not practical in reality, but it could be used for small arrays in ASICs or the like.

BCS
A: 

@csmba I think there is a typo. The last line should read: For arr(0) -> arr(x): if arr(i) <> NULL -> print i

Pretty interesting! Bead sort also treats the upper bound u(n) as a resource.

poulejapon
A: 

The wiki link has the disclaimer that bead sort works well on specialized hardware, but that software implementations tend to perform poorly.

I think the question should be clarified to include only software-based sorting algorithms, not software implementations of hardware sorting.

Jason Z
A: 

Radix sorting is kind of interesting.

It's an O(n) sort (not comparison-based) that works with integers, though it can be extended to work with floating-point values and even strings (with strings, it's O(n*m), where m is the string length).

A variant named the "American flag" sort allows the sort to be substantially in-place, too.

The most interesting thing is that a practical implementation is possible with constant factors good enough to keep it competitive with traditional comparison-based sorts.
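
A minimal LSD (least-significant-digit-first) radix sort sketch for non-negative integers, showing the O(n * m) shape, where m is the number of digit passes:

    def radix_sort(items):
        # LSD radix sort, 8 bits (base 256) per pass: each pass is a stable
        # bucket distribution, so after m passes the list is fully sorted.
        # Total cost: O(n * m), with m = byte width of the largest key.
        shift = 0
        while items and max(items) >> shift:
            buckets = [[] for _ in range(256)]
            for item in items:
                buckets[(item >> shift) & 0xFF].append(item)
            items = [item for bucket in buckets for item in bucket]
            shift += 8
        return items

    print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]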

DrPizza
+42  A: 

Quantum Sort

I've developed the Quantum Sort, which is not only O(1) but actually O(0) for any size data set. It generates a quantum state for every permutation of the array, and when you observe it, it collapses to the sorted state.

The implementation is proving a bit trickier than I imagined, however.

Mark Harrison
I have a truly marvelous implementation of this proposition which this margin is too narrow to contain
Kyle Cronin
@nobody, I love that reference!
Simucal
@nobody: I've up-voted the answer because of your implicit reference alone!
Arafangion
I would upvote this, but I like it being stuck at forty-two votes; it suits the answer.
Joe D
A: 

On a dataset that's already in a sequential block of memory, you could effectively get O(x) (where x is a constant independent of n) on a GPU for a radix or bucket sort. Of course, the data would need to be suitable for these sorts. Since a modern GPU will let you do a lot of operations in parallel, you could process the whole sort in a constant number of cycles. This would be kind of a toy solution, since it would go above constant time for any data set that has more elements than the GPU has processing units. Some people have had a great deal of success writing fast sorting routines on GPUs, though.

Dana the Sane
Indeed, somebody has just created a sort program that can use a GPU to sort 1e9 ints in a second: http://code.google.com/p/back40computing/wiki/RadixSorting
Gabe
A: 

Before you stone me for being a heretic

Burn it with fire!!!

Kevin Pang
+3  A: 

Actually, it also depends on where you sort. When sorting in memory, you can use any of the well-known sorting algorithms.
However, when sorting must be done on disk, you have to take into account the total time you wait for the disk to find and read the data (seek time + latency + transfer). Therefore the best sorting algorithms are the ones that take the block size into account and have good caching behaviour. For this, the merge sort algorithm works best.

On a side note, in (fast) memory, binary trees are not slower than n-ary trees, because the time spent looking through a single level of an n-ary tree kills the performance boost you get from the smaller number of levels to traverse. This is not the case for disk-based data structures. The most famous of those are B-trees, which are in fact n-ary trees, used in databases, file systems, etc.
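
To illustrate the merge step that makes merge sort disk-friendly, a sketch using Python's heapq.merge; in a real external sort, each run would be a sorted file streamed block by block rather than an in-memory list:

    import heapq

    # Sorted runs: in an external sort these would be files, each written out
    # as one sorted chunk during a first pass over data too big for memory.
    run1 = [1, 4, 9]
    run2 = [2, 3, 10]
    run3 = [5, 6, 7, 8]

    # heapq.merge streams the k runs lazily, touching each element once:
    # O(n log k) comparisons and strictly sequential reads, which is exactly
    # the access pattern a disk (seek + latency + transfer) rewards.
    print(list(heapq.merge(run1, run2, run3)))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]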

Sven
A: 

@csmba:

That's actually a hash table, which doesn't behave like a list and in fact doesn't require "sorting". Besides, stating "Data can be mapped 1:1 to an integer between 0 and X" isn't precisely a lightweight requirement and can prove to be difficult... especially if you don't know the data in advance.

Jorge Córdoba
"Data can be mapped 1:1 to an integer between 0 and X"Not at all. Each and every piece of data stored by a computer with finite memory satisfies this property.
fishlips
+5  A: 

@lassevk:

There is one type of sorting that is guaranteed to be O(1), and that is sorting two numbers:

You fail to grasp the concept of big O notation. Even sorting 10000 numbers is O(1)! If the input size is fixed, the time the algorithm takes is always the same. As long as the input size n is bounded by a finite constant k, you get a runtime of O(k) = O(1) by the definition of big O. The notation cannot be applied meaningfully this way.

Konrad Rudolph
A: 

Radix Sort can be done very, very quickly. Using Loki's typelists you can specialise it for signed and unsigned values.

graham.reeds
A: 

Two of my favorite sorting algorithms are randomized quicksort and merge sort. Randomized quicksort runs in O(n lg n) expected time, and merge sort runs in O(n lg n) even in the worst case. They're also both fairly simple to implement, which is an added bonus.
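
A minimal top-down merge sort sketch; its O(n lg n) bound holds no matter how the input is ordered:

    def merge_sort(a):
        # Split in half, sort each half recursively, then merge: Θ(n lg n) always.
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):   # merge the two sorted halves
            if left[i] <= right[j]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        return merged + left[i:] + right[j:]      # append whichever half remains

    print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]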

Josh Brown
+7  A: 

Let's think about what O(1) really means: that the sorting time is constant regardless of the number of items to be sorted. It doesn't necessarily mean that this constant time is very fast. If you have an algorithm that sorts a billion items in the same amount of time it sorts 10 items, I'll bet it's very slow for the 10-item scenario. So even if you could do it, it might not be very desirable.

Joel Coehoorn
A: 

Bogosort is both hysterically funny and O(1) if you implement the "Quantum" version...

JosephStyons
+5  A: 

As @csmba says, it all lies in the assumptions.

Now that we are making assumptions, let's assume that the data is already sorted. Algorithm:

  1. Leave things as they are.

There you have it! An O(1) algorithm.

paradoja
A: 

Burst sort (or rather its derivative C-burstsort) is kind of interesting. It's basically an O(n log n) algorithm, but it scales better by utilizing the CPU cache more effectively.

John Nilsson