tags:
views: 421
answers: 8
+2  Q: 

What's my Big O?

My program for sorting values clocks in at:

  • 100000 values: 8s
  • 1000000 values: 82s
  • 10000000 values: 811s

Is that O(n)?

+2  A: 

It looks like O(n) to me.

John Boker
+8  A: 

Yes, that looks like O(n) to me - going from the 1st to 2nd case and the 2nd to 3rd case, you've made the input 10 times bigger, and it's taken 10 times longer.

In particular, it looks like you can predict the rough time by using:

f(n) = n / 12500

or

f(n) = n * 0.00008

which makes O(n) the simplest explanation for the data provided.
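As a quick sanity check (an editorial sketch using only the numbers quoted above, not code from the answer), the fit can be compared against the measurements directly:

    #include <cstdio>

    int main() {
        // (n, seconds) pairs taken from the question.
        const double observed[3][2] = {
            {100000.0, 8.0}, {1000000.0, 82.0}, {10000000.0, 811.0}
        };
        for (const auto& point : observed) {
            // The proposed linear fit f(n) = n / 12500.
            double predicted = point[0] / 12500.0;
            std::printf("n = %8.0f  observed = %6.1f s  predicted = %6.1f s\n",
                        point[0], point[1], predicted);
        }
        return 0;
    }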

EDIT: However... As has been pointed out, there are various ways in which the data could be misleading you - I rather like Dennis Palmer's idea that the IO cost is dwarfing everything else. For example, suppose you have an algorithm whose absolute number of operations is:

f(n) = 1000000000000n + (n^2)

In that case the complexity is still O(n^2), but that won't become apparent from observations until n is very large.
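Spelling out the arithmetic at the question's largest input: for n = 10,000,000 the linear term is 10^19 while the n^2 term is only 10^14, a contribution of roughly 0.001%, and the two terms don't break even until n reaches 10^12.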

I think it would be accurate to say that these observations are suggestive of an O(n) algorithm but that doesn't mean it definitely is.

Jon Skeet
Generally: you make the input *n* times bigger, it takes *n* times longer.
Stephan202
@Stephan202: Indeed - I was just going by the example numbers given.
Jon Skeet
(I've updated the answer to make that clearer though - thanks.)
Jon Skeet
@Downvoters: care to explain?
Jon Skeet
Yup: if the time is dominated enough by something like I/O, the algorithm could be bogosort (which is O(n·n!) for average performance).
David Thornley
I didn't downvote. However, I didn't upvote either because I fear you may be feeding his delusion that one can derive the time behavior of an algorithm from three data points. Perhaps that had something to do with it?
T.E.D.
@David: In theory, that is quite correct. However having implemented a Bogosort (for a school project...honest) I'd be really surprised if that was the algorithm. That puppy goes exponential *fast*.
T.E.D.
+17  A: 

Looks like it, but in reality, you really need to analyze the algorithm, because there may be different cases based on the data. Some algorithms do better or worse with pre-sorted data, for instance. What's your algorithm?
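To illustrate that data dependence (an editorial sketch, not code from the answer; std::sort is just a stand-in for whatever algorithm is being measured, and the size is arbitrary):

    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Time a single std::sort call on a copy of the data, in seconds.
    static double time_sort(std::vector<int> v) {
        auto start = std::chrono::steady_clock::now();
        std::sort(v.begin(), v.end());
        auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(stop - start).count();
    }

    int main() {
        const std::size_t n = 10000000;  // arbitrary size for the comparison
        std::mt19937 gen(42);
        std::uniform_int_distribution<int> dist(0, 10000);

        std::vector<int> random_data(n);
        for (int& x : random_data) x = dist(gen);

        std::vector<int> sorted_data = random_data;
        std::sort(sorted_data.begin(), sorted_data.end());

        std::printf("random input: %.2f s\n", time_sort(random_data));
        std::printf("sorted input: %.2f s\n", time_sort(sorted_data));
        return 0;
    }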

nojo
It could be a radix sort, of course - but I agree, a general purpose O(n) sort is somewhat suspicious :)
Jon Skeet
+1 Landau notation applies to *algorithms* **not** *programs*.
THC4k
@nojo: I use std::sort (introsort?) with random values (stdlib::rand()) in the span 0-10000. I guess those facts would have to go in as assumptions for the O(n) these samples suggest?
sharkin
Hmm - I would have guessed that you weren't using random input - I'm not sure that O(n) is possible for a sort of random input (like Jon Skeet said), though your results certainly look like it. Digging deeper into Jon Skeet's comment, radix sort can appear to be O(n) (http://en.wikipedia.org/wiki/Radix_sort), so that might be what you're seeing.
nojo
It could still be that the input is small enough that the n log n part of the running time hasn't taken over yet.
T.E.D.
@nojo: I also find it strange getting O(n) when none of the proposed sorting algorithms are O(n) in general. However, could it be that, because the span of values is less than the _number_ of values and rand() may generate a lot of duplicates, the sorting algorithm's adaptive properties make O(n) realistic?
sharkin
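For what it's worth, the "small value span" point can be made concrete with a counting sort, named here purely as an illustration of a genuinely linear sort when values fall in a small range 0..k; it is not how std::sort/introsort works:

    #include <cstdio>
    #include <cstddef>
    #include <vector>

    // Counting sort, O(n + k) for values in the range 0..max_value.
    // Illustrative only: std::sort / introsort does NOT work this way.
    std::vector<int> counting_sort(const std::vector<int>& input, int max_value) {
        std::vector<std::size_t> counts(static_cast<std::size_t>(max_value) + 1, 0);
        for (int v : input)
            ++counts[static_cast<std::size_t>(v)];

        std::vector<int> output;
        output.reserve(input.size());
        for (int v = 0; v <= max_value; ++v)
            output.insert(output.end(), counts[static_cast<std::size_t>(v)], v);
        return output;
    }

    int main() {
        std::vector<int> data = {3, 1, 4, 1, 5, 9, 2, 6};
        std::vector<int> sorted = counting_sort(data, 10);
        for (int v : sorted) std::printf("%d ", v);
        std::printf("\n");
        return 0;
    }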
+2  A: 

Yes, that is O(n) behaviour, because the time scales in direct proportion to the number of items:

1000000 = 10 * 100000

and

82s = 10 * 8s (roughly)
Alex B
+8  A: 

Time behavior doesn't work that way. All you can really say is that those three data points scale roughly linearly relative to one another. That doesn't mean the algorithm is O(n).

The first problem is that I could easily draw a curve that grows exponentially (O(e**n)) and nonetheless passes through those three points.
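To make that concrete (an invented curve, added editorially rather than taken from the answer): something like

f(n) = n / 12500 + 0.00001 * e^(n / 1000000)

agrees with the plain linear fit to within a fraction of a second at all three measured sizes, yet it is exponential and would already take an astronomically long time at n = 100,000,000.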

The big problem, though, is that you didn't say anything about the data. There are many sorting algorithms that approach O(n) for sorted or nearly sorted inputs (e.g. Mergesort). However, their average case (typically randomly ordered data) and worst case (often reverse-sorted data) is invariably O(n log n) or worse.

T.E.D.
+1  A: 

Yes, it appears to be O(n), but I don't think you can conclusively analyze the algorithm based on its timed performance. You might be using the most inefficient sorting algorithm and still see O(n) timed results, because the data read/write time makes up the majority of the total execution time.

Edit: Big-O is determined by how efficient an algorithm is and how it scales. It should predict the growth of execution time as the number of input items grows. The inverse is not necessarily true. Observing a given growth in execution time could mean a few different things. If it stays true to the example numbers you've given, then you can conclude that your program runs at O(n), but as others have said, that doesn't mean your sorting algorithm is O(n).
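As a hypothetical illustration (numbers invented, not from the answer): if reading and writing each value costs 80 µs while an in-memory comparison costs 10 ns, then at n = 10,000,000 the I/O accounts for roughly 800 s while an n·log2(n) comparison count adds only about 2 s, so the measured totals grow essentially linearly even though the sort itself is O(n log n).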

Dennis Palmer
+2  A: 

You cannot depend on that to say it is O(n). Bubble sort, for instance, can complete in roughly n steps on already-sorted input, yet it is an O(n^2) algorithm.
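A minimal sketch (editorial, not from the answer) of the early-exit variant that behaves this way: on already-sorted input the outer loop runs once and stops after about n comparisons, even though the worst case is still O(n^2).

    #include <cstdio>
    #include <utility>
    #include <vector>

    // Bubble sort with an early-exit flag: a pass with no swaps means the data
    // is already sorted, so sorted input costs only about n comparisons, while
    // the worst case (e.g. reverse-sorted input) is still O(n^2).
    void bubble_sort(std::vector<int>& a) {
        bool swapped = true;
        for (std::size_t end = a.size(); swapped && end > 1; --end) {
            swapped = false;
            for (std::size_t i = 1; i < end; ++i) {
                if (a[i - 1] > a[i]) {
                    std::swap(a[i - 1], a[i]);
                    swapped = true;
                }
            }
        }
    }

    int main() {
        std::vector<int> v = {1, 2, 3, 4, 5};  // already sorted: one pass and done
        bubble_sort(v);
        for (int x : v) std::printf("%d ", x);
        std::printf("\n");
        return 0;
    }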

Matthew Sowders
+4  A: 

You cannot tell.

First, the time depends on the data, environment and algorithm. If you have an array of zeroes and try to sort it, the program running time shouldn't be much different for 1000 or 1000000 elements.

Second, the O notation describes a function's behavior for large values of n, beyond some threshold n0. It could be that your algorithm scales well up to a certain input size and then its behavior changes - like the g(n) function below.

[image: plot of the g(n) function referred to above]

ya23