Hey All,

So I think I'm going to get buried for asking such a trivial question, but I'm a little confused about something.

I have implemented quicksort in Java and C and I was doing some basic comparisons. The graph came out as two straight lines, with the C version being 4 ms faster than its Java counterpart over 100,000 random integers.

Results

The code for my tests can be found here:

android-benchmarks

I wasn't sure what an O(n log n) line would look like, but I didn't think it would be straight. I just wanted to check that this is the expected result and that I shouldn't try to find an error in my code.

I stuck the formula into Excel, and for base 10 it seems to be a straight line with a kink at the start. Is this because the difference between log(n) and log(n+1) increases linearly?

Thanks,

Gav

+23  A: 

Make the graph bigger and you'll see that O(n log n) isn't quite a straight line. But yes, it is pretty close to linear behaviour. To see why, just take the logarithm of a few very large numbers.

For example (base 10):

log(1000000) = 6
log(1000000000) = 9
…

So, to sort 1,000,000 numbers, an O(n log n) sort adds a measly factor of 6 (or just a bit more, since most sorting algorithms depend on base-2 logarithms). Not an awful lot.
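To see just how slowly that factor grows, here is a minimal sketch (an illustration, not taken from the benchmark code linked in the question) that tabulates n against n * log2(n):

    public class NLogNTable {
        public static void main(String[] args) {
            for (int n = 10000; n <= 100000; n += 10000) {
                // Math.log is the natural log; divide to get log base 2.
                double nLogN = n * (Math.log(n) / Math.log(2));
                System.out.printf("n = %6d   n*log2(n) = %9.0f   ratio = %.2f%n",
                        n, nLogN, nLogN / n);
            }
        }
    }

The ratio only climbs from about 13.3 at n = 10,000 to about 16.6 at n = 100,000, which is why a plot over that range hugs a straight line.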

In fact, this log factor is so extraordinarily small that for most orders of magnitude, established O(n logn) algorithms outperform linear time algorithms. A prominent example is the creation of a suffix array data structure.

A simple case of this recently bit me when I tried to improve a quicksort of short strings by employing radix sort. It turns out that, for short strings, this (linear-time) radix sort was indeed faster than quicksort, but there was a tipping point at still relatively short strings, since radix sort crucially depends on the length of the strings you sort.
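For reference, a minimal sketch of an LSD radix sort for fixed-width ASCII strings (the fixed width and 256-character alphabet are illustrative assumptions; this is not the code from the experiment described above). Its running time is O(w * (n + R)) for width w and alphabet size R, and that hidden factor w is exactly where the tipping point comes from:

    import java.util.Arrays;

    public class LsdRadixSort {
        // Sorts strings that all share the given fixed width.
        public static void sort(String[] a, int width) {
            final int R = 256; // extended ASCII alphabet
            String[] aux = new String[a.length];
            for (int d = width - 1; d >= 0; d--) { // key-indexed counting, right to left
                int[] count = new int[R + 1];
                for (String s : a) count[s.charAt(d) + 1]++;          // frequencies
                for (int r = 0; r < R; r++) count[r + 1] += count[r]; // to start indices
                for (String s : a) aux[count[s.charAt(d)]++] = s;     // distribute
                System.arraycopy(aux, 0, a, 0, a.length);
            }
        }

        public static void main(String[] args) {
            String[] a = {"dab", "cab", "fad", "bad", "ace"};
            sort(a, 3);
            System.out.println(Arrays.toString(a)); // [ace, bad, cab, dab, fad]
        }
    }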

Konrad Rudolph
Fantastic answer, thanks for clearing this up for me. Just reading through your post now, really interesting stuff.
gav
Good sorts tend to go for a linear algorithm once they have divided and conquered into sufficiently small pieces. Exactly how small is a matter of benchmarking (real data).
Tom Hawtin - tackline
Tom: I'm not sure what exactly you mean by linear. Often, sorting algorithms do the opposite, using O(n^2) sorts such as insertion sort on small portions, since their constant factor is so small that even quadratic runtime outperforms n log n sorting. On the other hand, introsort uses a strategy to break out of overly deep recursion – but again, this isn't anywhere near linear; it just exchanges a quadratic worst case for O(n log n) behaviour.
Konrad Rudolph
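A minimal sketch of the small-subarray cutoff strategy described in this exchange (the cutoff of 10 is an arbitrary illustrative value; real libraries tune it by benchmarking):

    public class HybridQuicksort {
        private static final int CUTOFF = 10; // illustrative; tuned in practice

        public static void sort(int[] a, int lo, int hi) {
            if (hi - lo < CUTOFF) {
                insertionSort(a, lo, hi); // tiny constant factor wins on small ranges
                return;
            }
            int p = partition(a, lo, hi);
            sort(a, lo, p - 1);
            sort(a, p + 1, hi);
        }

        // Lomuto partition around the last element (simplest pivot choice).
        private static int partition(int[] a, int lo, int hi) {
            int pivot = a[hi], i = lo;
            for (int j = lo; j < hi; j++) {
                if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
            }
            int t = a[i]; a[i] = a[hi]; a[hi] = t;
            return i;
        }

        private static void insertionSort(int[] a, int lo, int hi) {
            for (int i = lo + 1; i <= hi; i++) {
                int key = a[i], j = i - 1;
                while (j >= lo && a[j] > key) { a[j + 1] = a[j]; j--; }
                a[j + 1] = key;
            }
        }

        public static void main(String[] args) {
            int[] a = {5, 3, 8, 1, 9, 2, 7, 4, 6, 0, 11, 10};
            sort(a, 0, a.length - 1);
            System.out.println(java.util.Arrays.toString(a));
        }
    }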
+2  A: 

log(N) is (very) roughly the number of digits in N. So, for the most part, there is little difference between log(n) and log(n+1).

James Curran
log-base-*10* is very roughly the number of digits in N (assuming you're using the decimal representation). Most sort/search algorithms would be using log-base-2 which, while proportional to log-base-10 (so the big-O still applies), is nothing like what you describe :-)
paxdiablo
Another way to say it is that log-base-2 is roughly the number of digits in N when written in binary, aka the number of bits required to represent N.
Tyler McHenry
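In Java terms (a small illustration of the point above, not code from the thread):

    public class BitLength {
        public static void main(String[] args) {
            int n = 1000000;
            // Bits needed to represent n = floor(log2(n)) + 1.
            int bits = Integer.SIZE - Integer.numberOfLeadingZeros(n);
            System.out.println(bits);                              // 20
            System.out.println((int) (Math.log(n) / Math.log(2))); // 19 = floor(log2(n))
        }
    }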
A: 

Try plotting an actual straight line on top of it and you'll see the small increase. Note that the Y value at 50,000 is less than half the Y value at 100,000.

It's there, but it's small. Which is why O(n log n) is so good!

Chris Harris
It's still a damn sight better than O(n^2).
paxdiablo
A: 
  1. Usually, O(n*log(n)) algorithms involve a base-2 logarithm.
  2. For n = 1024, lg(1024) = 10, so n*lg(n) = 1024*10 = 10,240 calculations – an increase of an order of magnitude over n.

So, O(n*log n) is similar to linear only for a small amount of data.

Tip: don't forget that quicksort behaves very well on random data and that, in the worst case, it's not an O(n*log(n)) algorithm.
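That worst case is easy to demonstrate. The following is an illustrative sketch (not code from this thread) that counts the comparisons a naive last-element-pivot quicksort makes on random versus already-sorted input:

    import java.util.Random;

    public class QuicksortComparisons {
        static long comparisons;

        static void sort(int[] a, int lo, int hi) {
            if (lo >= hi) return;
            int pivot = a[hi], i = lo;       // last-element pivot, Lomuto partition
            for (int j = lo; j < hi; j++) {
                comparisons++;
                if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
            }
            int t = a[i]; a[i] = a[hi]; a[hi] = t;
            sort(a, lo, i - 1);
            sort(a, i + 1, hi);
        }

        public static void main(String[] args) {
            int n = 2000; // kept small so the degenerate recursion stays shallow
            int[] random = new Random(42).ints(n).toArray();
            int[] sorted = new int[n];
            for (int i = 0; i < n; i++) sorted[i] = i;

            comparisons = 0; sort(random, 0, n - 1);
            System.out.println("random input: " + comparisons); // roughly n*log2(n)
            comparisons = 0; sort(sorted, 0, n - 1);
            System.out.println("sorted input: " + comparisons); // n*(n-1)/2 = 1,999,000
        }
    }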

Nick D
All logarithms are the same; they differ only by a constant factor. So I don't see the significance of your first statement. Also, I don't agree with your statement that O(n log n) is only similar to linear for a small amount of data. Once again, it's a scaling thing. As a counter-example, just look at the graphs in the original question.
waxwing
I don't mean graphically similar (to a straight line) but time-complexity similar. O(n*log n) time can easily be an order of magnitude bigger than O(n). If the graphs compared O(n*log n) and O(n) algorithms you would see what I mean. :) As n gets bigger and bigger, the O(n*log n) curve *moves* to the next logarithmic scales.
Nick D
On average Quicksort IS an O(n log n) algorithm.
Manu
+4  A: 
Jouni K. Seppänen
A: 
groundhog
a) Without further qualification, O() is usually used to denote the *expected* (average) complexity. b) O() notation doesn't include constant factors, so O(n) and O(2n) are the same. Since log(n) is nearly constant (for large n, compared to n), it can be said that O(n) and O(n log(n)) are nearly the same. You should have plotted: http://www.wolframalpha.com/input/?i=plot+7+x%2C+x+log+x+from+1+to+1500
Timmmm
This is generally untrue. Big-O notation typically denotes worst-case asymptotic complexity: it names a function that bounds the algorithm's complexity from above. O(n) does not approximate O(n log n), although for practical purposes O(n log n) is relatively good and not much worse. The worst-case performance of quicksort is certainly *not* an uncommon thing to encounter. Try doing a quicksort on the entries in a dictionary if you don't believe me.
groundhog