views:

81999

answers:

12

What is a plain English explanation of Big O? With as little formal definition as possible and simple mathematics.

+7  A: 

Big O is just a common way to express "how much time/space does it take to run my code?".

You may often see O(n), O(n^2), O(n log n) and so forth; all of these are just ways of showing how an algorithm scales.

O(n) means Big O is n, and now you might ask, "What is n?" Well, "n" is the number of elements. Imagine you want to search for an item in an array. You would have to look at each element and ask, "Are you the correct element/item?" In the worst case the item is at the last index, which means it took as many lookups as there are items in the list. So to be generic, we say the cost is "n" for a list of n values.
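
A minimal sketch of that worst-case search in Python (the function and names are mine, for illustration):

    def linear_search(items, target):
        """O(n): in the worst case we ask every one of the n elements."""
        for index, item in enumerate(items):
            if item == target:
                return index   # found early: fewer than n comparisons
        return -1              # worst case: all n elements were checked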

So then you might understand what "n^2" means, but to be even more specific, consider the simplest of the sorting algorithms: bubble sort. This algorithm needs to look through the whole list, for each item.

My list

  1. 1
  2. 6
  3. 3

The flow here would be:

  • Compare 1 and 6, which is biggest? OK, 6 is in the right position, moving forward!
  • Compare 6 and 3. Oh, 3 is less! Let's swap them. OK, the list changed, we need to start from the beginning now!

This is O(n^2) because you need to look at all items in the list, and there are "n" items. For each item, you look at all items once more to compare, which is also "n". So for every item you look "n" times, meaning n*n = n^2.
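
Here is a sketch of that in Python. It is the common "keep making passes until nothing swaps" variant of bubble sort rather than a literal restart-from-the-beginning, but the O(n^2) behaviour is the same:

    def bubble_sort(items):
        """O(n^2): up to n passes, each scanning up to n items."""
        swapped = True
        while swapped:
            swapped = False
            for i in range(len(items) - 1):
                if items[i] > items[i + 1]:
                    # out of order: swap, and remember the list changed
                    items[i], items[i + 1] = items[i + 1], items[i]
                    swapped = True
        return items

    print(bubble_sort([1, 6, 3]))   # [1, 3, 6]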

I hope this is as simple as you want it.

But remember, Big O is just a way to express yourself in terms of time and space.

Filip Ekberg
This is a clear and good answer. Thank you ;)
Gonzalo Quero
Glad to be of help :)
Filip Ekberg
+27  A: 
Jon Skeet
A hashtable requires an algorithm to run to calculate the index into the actual array (depending on the implementation), whereas an array is just O(1) because an index is just an address. But this has nothing to do with the question, just an observation :)
Filip Ekberg
Jon's explanation has very much to do with the question, I think. It's exactly how one could explain it to someone's mum, and she would eventually understand it, I think :) I like the clothes example (in particular the last one, where it explains the exponential growth of complexity)
Johannes Schaub - litb
Oh, I don't mean the whole answer, just the hashtable lookup, and that it can, actually, never be as fast as direct addressing :)
Filip Ekberg
Filip: I'm not talking about addressing an array by index, I'm talking about finding a matching entry in an array. Could you reread the answer and see if that's still unclear?
Jon Skeet
Ekberg, oh, I'm sorry. You're right, you didn't say anything about the answer as a whole. Next time I'll read your comment twice, mate :)
Johannes Schaub - litb
that's my bro Jon Skeet
HollerTrain
+4  A: 

Summary: The upper bound of the complexity of an algorithm.

See also the similar question Big O, how do you calculate/approximate it? for a good explanation.

Kosi2801
That's not... simple English! :P
Filip Ekberg
@Filip Ekberg: upper? bound? complexity? algorithm? Seem simple enough to me. +1
S.Lott
Wow! That link explained it very well and in a simple manner. You should have gotten more up votes.
kirk.burleson
+812  A: 

The simplest definition I can give for Big-O notation is this:

Big-O notation is a relative representation of the complexity of an algorithm.

There are some important and deliberately chosen words in that sentence:

  • relative: you can only compare apples to apples. You can't compare an algorithm to do arithmetic multiplication to an algorithm that sorts a list of integers. But two algorithms that do arithmetic operations (one multiplication, one addition) will tell you something meaningful;
  • representation: Big-O (in its simplest form) reduces the comparison between algorithms to a single variable. That variable is chosen based on observations or assumptions. For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering). This assumes that comparison is expensive. But what if comparison is cheap but swapping is expensive? It changes the comparison; and
  • complexity: if it takes me one second to sort 10,000 elements how long will it take me to sort one million? Complexity in this instance is a relative measure to something else.

Come back and reread the above when you've read the rest.

The best example of Big-O I can think of is doing arithmetic. Take two numbers (123456 and 789012). The basic arithmetic operations we learnt in school were:

  • addition;
  • subtraction;
  • multiplication; and
  • division.

Each of these is an operation or a problem. A method of solving these is called an algorithm.

Addition is the simplest. You line the numbers up (to the right) and add the digits in a column, writing the last digit of that addition in the result. The 'tens' part of that number is carried over to the next column.

Let's assume that the addition of these digits is the most expensive operation in this algorithm. It stands to reason that to add these two numbers together we have to add 6 pairs of digits (and possibly carry a 7th). If we add two 100-digit numbers together we have to do 100 additions. If we add two 10,000-digit numbers we have to do 10,000 additions.

See the pattern? The complexity (being the number of operations) is directly proportional to the number of digits n in the larger number. We call this O(n) or linear complexity.
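
A sketch of that schoolbook addition in Python; representing numbers as lists of digits is my choice, made so the one-operation-per-column shape is visible:

    def schoolbook_add(a_digits, b_digits):
        """O(n): one column addition per digit of the larger number."""
        result, carry = [], 0
        # walk the columns right to left, as if lined up on paper
        for i in range(max(len(a_digits), len(b_digits))):
            a = a_digits[-1 - i] if i < len(a_digits) else 0
            b = b_digits[-1 - i] if i < len(b_digits) else 0
            total = a + b + carry
            result.append(total % 10)   # write the last digit of the column sum
            carry = total // 10         # carry the 'tens' part to the next column
        if carry:
            result.append(carry)
        return result[::-1]

    # schoolbook_add([1,2,3,4,5,6], [7,8,9,0,1,2]) -> [9,1,2,4,6,8]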

Subtraction is similar (except you may need to borrow instead of carry).

Multiplication is different. You line the numbers up, take the first digit of the bottom number and multiply it in turn against each digit of the top number, and so on through each digit. So to multiply our two 6-digit numbers we must do 36 multiplications. We may need to do as many as 10 or 11 column adds to get the end result too.

If we have two 100-digit numbers we need to do 10,000 multiplications and 200 adds. For two one-million-digit numbers we need to do one trillion (10^12) multiplications and two million adds.

As the algorithm scales with n-squared, this is O(n^2) or quadratic complexity. This is a good time to introduce another important concept:

We only care about the most significant portion of complexity.

The astute may have realized that we could express the number of operations as n^2 + 2n. But as you saw from our example with two numbers of a million digits apiece, the second term (2n) becomes insignificant (accounting for 0.0002% of the total operations by that stage).
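
A toy illustration of why the 2n term stops mattering, using the operation count above (the cost model is only a sketch):

    def total_operations(n):
        """n^2 digit multiplications plus roughly 2n column adds."""
        return n ** 2 + 2 * n

    for n in (6, 100, 1_000_000):
        share = 100 * 2 * n / total_operations(n)
        print(f"n={n}: the 2n term is {share:.5f}% of the work")
    # n=6: 25%, n=100: ~2%, n=1,000,000: ~0.0002% -- negligible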

The Telephone Book

The next best example I can think of is the telephone book, normally called the White Pages or similar but it'll vary from country to country. But I'm talking about the one that lists people by surname and then initials or first name, possibly address and then telephone numbers.

Now if you were instructing a computer to look up the phone number for "John Smith", what would you do? Ignoring the fact that you could guess how far in the S's started (let's assume you can't), what would you do?

A typical implementation might be to open up to the middle, take the 500,000th entry and compare it to "Smith". If it happens to be "Smith, John", we just got really lucky. Far more likely is that "John Smith" will be before or after that name. If it's after, we then divide the last half of the phone book in half and repeat. If it's before, we divide the first half of the phone book in half and repeat. And so on.

This is called a bisection search (binary search) and it is used every day in programming whether you realize it or not.

So if you want to find a name in a phone book of a million names you can actually find any name by doing this at most 21 or so times (I might be off by 1). In comparing search algorithms we decide that this comparison is our 'n'.

For a phone book of 3 names it takes 2 comparisons (at most).
For 7 it takes at most 3.
For 15 it takes 4.
...
For 1,000,000 it takes 21 or so.

That is staggeringly good isn't it?

In Big-O terms this is O(log n) or logarithmic complexity. Now the logarithm in question could be ln (base e), log10, log2 or some other base. It doesn't matter: it's still O(log n), just like O(2n^2) and O(100n^2) are still both O(n^2).
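
A minimal bisection search in Python (assuming the entries are already sorted, as a phone book is):

    def phone_book_search(names, target):
        """O(log n): halve the remaining entries on every comparison."""
        low, high = 0, len(names) - 1
        comparisons = 0
        while low <= high:
            middle = (low + high) // 2    # open the book in the middle
            comparisons += 1
            if names[middle] == target:
                return middle, comparisons
            elif names[middle] < target:
                low = middle + 1          # target is in the later half
            else:
                high = middle - 1         # target is in the earlier half
        return -1, comparisons            # not found, still only ~log2(n) steps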

It's worthwhile at this point to explain that Big O can be used to determine three cases with an algorithm:

  • Best Case: In the telephone book search, the best case is that we find the name in one comparison. This is O(1) or constant complexity;
  • Expected Case: As discussed above this is O(log n); and
  • Worst Case: This is also O(log n).

Normally we don't care about the best case. We're interested in the expected and worst case. Sometimes one or the other of these will be more important.

Back to the telephone book.

What if you have a phone number and want to find a name? The police have a reverse phone book but such lookups are denied to the general public. Or are they? Technically you can reverse lookup a number in an ordinary phone book. How?

You start at the first name and compare the number. If it's a match, great, if not, you move on to the next. You have to do it this way because the phone book is unordered (by phone number anyway).

So to find a name:

  • Best Case: O(1);
  • Expected Case: O(n) (for 500,000); and
  • Worst Case: O(n) (for 1,000,000).
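
A sketch of that reverse lookup; representing the book as a list of (name, number) pairs is my assumption:

    def reverse_lookup(phone_book, number):
        """O(n): the book isn't ordered by number, so check every entry."""
        for name, entry_number in phone_book:
            if entry_number == number:
                return name   # best case: it was the first entry, O(1)
        return None           # worst case: all n entries were scanned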

The Travelling Salesman

This is quite a famous problem in computer science and deserves a mention. In this problem you have N towns. Each of those towns is linked to 1 or more other towns by a road of a certain distance. The Travelling Salesman problem is to find the shortest tour that visits every town.

Sounds simple? Think again.

If you have 3 towns A, B and C with roads between all pairs then you could go:

A -> B -> C
A -> C -> B
B -> C -> A
B -> A -> C
C -> A -> B
C -> B -> A

Well actually there are fewer than that, because some of these are equivalent (A -> B -> C and C -> B -> A are equivalent, for example, because they use the same roads, just in reverse).

In actuality there are 3 possibilities.

Take this to 4 towns and you have (iirc) 12 possibilities. With 5 it's 60. 6 becomes 360.

This is a function of a mathematical operation called a factorial. Basically:

5! = 5 * 4 * 3 * 2 * 1 = 120
6! = 6 * 5 * 4 * 3 * 2 * 1 = 720
7! = 7 * 6 * 5 * 4 * 3 * 2 * 1 = 5040
...
25! = 25 * 24 * ... * 2 * 1 = 15,511,210,043,330,985,984,000,000
...
50! = 50 * 49 * ... * 2 * 1 = 3.04140932... × 10^64 

So the Big-O of the Travelling Salesman problem is O(n!) or factorial or combinatorial complexity.
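
A brute-force sketch in Python; the nested `distance` mapping of road lengths is my own representation, and for simplicity this tries all n! orderings rather than de-duplicating reversed tours:

    from itertools import permutations

    def shortest_tour(towns, distance):
        """O(n!): evaluate every possible ordering of the towns."""
        best_length, best_route = None, None
        for route in permutations(towns):           # n! candidate tours
            length = sum(distance[a][b] for a, b in zip(route, route[1:]))
            if best_length is None or length < best_length:
                best_length, best_route = length, route
        return best_length, best_route

    roads = {"A": {"B": 1, "C": 4}, "B": {"A": 1, "C": 2}, "C": {"A": 4, "B": 2}}
    print(shortest_tour(["A", "B", "C"], roads))    # (3, ('A', 'B', 'C'))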

By the time you get to 200 towns there isn't enough time left in the universe to solve the problem with traditional computers.

Something to think about.

Polynomial Time

Another point I wanted to make quick mention of is that any algorithm that has a complexity of O(n^a) is said to have polynomial complexity or is solvable in polynomial time.

Traditional computers can solve polynomial-time problems. Certain things are used in the world because of this. Public Key Cryptography is a prime example. It is computationally hard to find the two prime factors of a very large number. If it wasn't, we couldn't use the public key systems we use.

Anyway, that's it for my (hopefully plain English) explanation of Big O (revised).

cletus
While the other answers focus on explaining the differences between O(1), O(n^2) et al.... yours is the one which details how algorithms can get classified into n^2, nlog(n) etc. +1 for a good answer that helped me understand Big O notation as well
Yew Long
Heh someone downvoted this. Go figure.
cletus
I like the explanation. It is important to note that big O is about worst-case complexity, e.g. for sorting, the sequence which requires the most operations; for multiplication, probably the largest possible input numbers; etc.
Yuval F
It's not about "worst-case"; it's used to define best-, worst- and general-case.
Filip Ekberg
Please let me know what you think of this version. I fleshed it out a bit, hopefully it's as good or better.
cletus
Great answer, they should put this in Wikipedia instead of the unreadable textbook-like definition they have, which assumes a lot of background knowledge. There are a couple of typos in the above.
Arec Barrwin
A couple of other points here are that the complexity can be either time or space and that one can talk about the big-O of best, average, or worst case scenarios, or at least that is what I remember from school.
JB King
typos (i'll remove comment when edited): same raods, jus tin reverse
kajaco
...NP-complete...: That's not correct - a problem that is solvable in n^a time is said to be in P, polynomial time. A problem being NP-complete means that if it is solvable in P time, then every NP problem is solvable in P time. NP-hard means it's at least as hard as the hardest problems in NP.
Paul Fisher
...so I'll downvote temporarily, then you'll get a +1 when that gets fixed :)
Paul Fisher
one might want to add that big-O represents an upper bound (given by an algorithm), big-Omega give a lower bound (usually given as a proof independent from a specific algorithm) and big-Theta means that an "optimal" algorithm reaching that lower bound is known.
mdm
*Excellent* answer.
Rob
This is the best answer on this site so far. awesome. +1000 if I could.
WolfmanDragon
Do you write documentation manuals? if you don't you should. Textbooks too while you are at it.
WolfmanDragon
By far the longest answer on SO that I have actually read to the end. That fact alone is +1. :-D And the answer is really good, which would be +1 again, if I could. Thanks for writing that up, cletus.
Tomalak
I think there is a mistake in the sentence "accounting for 0.00002% of the total operations by that stage". It should be 0.0002%, not 0.00002%.
Masi
Good writeup. Two suggestions: mention that e.g. the Travelling Salesman can be approximated to within a proven factor of the minimal answer if e.g. it can be assumed that going directly from A to B is shorter than going A-C-B. Also mention that NP is polynomial (P) _IFF_ the computer magically picks the correct possibility every time it has to make a choice.
Thorbjørn Ravn Andersen
Absolutely awesome! Very well-explained, and absolutely understandable.
Galilyou
There is a typo in the current revision (4) of this answer: "5! = 5 * 4 * 3 * 2 * 1 - 120" should be "5! = 5 * 4 * 3 * 2 * 1 = 120" (s/-/=/, if you understand sed)
Reef
Cletus, how do you get the time to write such long and well thought out answers ! amazing though ! thanks a lot for the explanation !
Preets
I stumbled upon this by accident, and I just have to tell you that this is the best answer to any question I've ever seen. You have just kicked the asses of every CS professor I've ever had.
Ian Henry
Epic answer. I wish this was there in my graduate course!
Raj
Wow, what an answer! Cletus, man, you should consider writing a book on Theory of Computation... a subject that is usually very boringly covered by textbooks. If you write, I will read it. If not a book, may be at least a series of blog posts.
Harry
This is good if you're looking for the longest answer, but not for the answer that best explains Big-O in a simple manner.
kirk.burleson
@Arec "we should put this in Wikipedia". FTFY.All we would need is for Daniel Trebbian to make it available under the appropriate copyright license.
audiodude
It's a perfectly good answer, but why does it take so much text to explain such a simple concept?
Inverse
Uh ... impressive! Are you for hire?
Wow. I would vote this +10. Really great job.
Carlo
Can this answer be posted as the article on [Big O notation](http://simple.wikipedia.org/wiki/Big_O_notation) on simple.wikipedia? Does StackOverflow's CC-Wiki license allow it?
abhin4v
@Inverse: Big-O is the gateway to daunting CSey stuff for many programmers and I think the conversational style helps people ease into it. A bit of repetition actually helps people abstract out the patterns for themselves.
j_random_hacker
A: 

Big O is a measure of how much time/space an algorithm uses relative to the size of its input.

If an algorithm is O(n) then the time/space will increase at the same rate as its input.

If an algorithm is O(n^2) then the time/space increases at the rate of its input squared.

and so on.

Brownie
It's not about space. It's about complexity which means time.
S.Lott
I have always believed it can be about time OR space, but not about both at the same time.
Rocco
Complexity most definitely can be about space. Have a look at this: http://en.wikipedia.org/wiki/PSPACE
pelotom
+11  A: 

Big O describes an upper limit on the growth behaviour of a function, for example the runtime of a program, when inputs become large.

Examples:

  • O(n): If I double the input size the runtime doubles

  • O(n^2): If the input size doubles the runtime quadruples

  • O(log n): If the input size doubles the runtime increases by one

  • O(2^n): If the input size increases by one, the runtime doubles

The input size is usually the space in bits needed to represent the input.
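
A toy table of step counts (illustrative operation counts, not real runtimes) to make the doubling behaviour concrete:

    import math

    def steps(n):
        """Step counts for three of the growth rates above."""
        return {"O(log n)": round(math.log2(n)), "O(n)": n, "O(n^2)": n ** 2}

    print(steps(1_000))   # {'O(log n)': 10, 'O(n)': 1000, 'O(n^2)': 1000000}
    print(steps(2_000))   # log gains one, linear doubles, quadratic quadruples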

starblue
Incorrect! For example O(n): if I double the input size the runtime will be multiplied by some finite non-zero constant. I mean O(n) = O(n + n)
mmcteam.com.ua
I'm talking about the f in f(n) = O(g(n)), not the g as you seem to understand.
starblue
+3  A: 

The other answers are quite good; just one detail to understand: O(log n) or similar means that it depends on the "length" or "size" of the input, not on its value. This can be hard to understand, but it is very important. For example, it happens when your algorithm splits things in two at each iteration.
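
A tiny sketch of that splitting-in-two behaviour:

    def halvings(n):
        """How many times can n be split in two? That count grows as O(log n)."""
        count = 0
        while n > 1:
            n //= 2
            count += 1
        return count

    print(halvings(8))            # 3
    print(halvings(1_000_000))    # 19 -- a million items, only ~20 steps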

Harald Schilly
+35  A: 

It shows how an algorithm scales.

O(n^2):

  • 1 item: 1 second
  • 10 items: 100 seconds
  • 100 items: 10000 seconds

Notice that the number of items increases by a factor of 10, but the time increases by a factor of 10^2. Basically, n=10 and so O(n^2) gives us the scaling factor n^2 which is 10^2.

O(n):

  • 1 item: 1 second
  • 10 items: 10 seconds
  • 100 items: 100 seconds

This time the number of items increases by a factor of 10, and so does the time. n=10 and so O(n)'s scaling factor is 10.

O(1):

  • 1 item: 1 second
  • 10 items: 1 second
  • 100 items: 1 second

The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.

That's the gist of it. They reduce the maths down so it might not be exactly n^2 or whatever they say it is, but that'll be the dominating factor in the scaling.
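
Read as code, the tables above are just extrapolation from "1 item takes 1 second" by the scaling factor (a toy sketch, not a real measurement):

    def predicted_seconds(items, scaling):
        """Extrapolate from one item = one second via the Big O scaling factor."""
        return scaling(items)

    for n in (1, 10, 100):
        print(n, "items:", predicted_seconds(n, lambda n: n ** 2), "seconds")
    # prints 1, 100, 10000 -- the O(n^2) table above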

Ray Hidayat
what does this definition mean exactly? (The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.)
HollerTrain
Not seconds, operations. Also, you missed out on factorial and logarithmic time.
Chris Charabaruk
+3  A: 

Big O describes the fundamental scaling nature of an algorithm.

There is a lot of information that Big O does not tell you about a given algorithm. It cuts to the bone and gives only information about the scaling nature of an algorithm, specifically how the resource use (think time or memory) of an algorithm scales in response to the "input size".

Consider the difference between a steam engine and a rocket. They are not merely different varieties of the same thing (as, say, a Prius engine vs. a Lamborghini engine) but they are dramatically different kinds of propulsion systems, at their core. A steam engine may be faster than a toy rocket, but no steam piston engine will be able to achieve the speeds of an orbital launch vehicle. This is because these systems have different scaling characteristics with regards to the relation of fuel required ("resource usage") to reach a given speed ("input size").

Why is this so important? Because software deals with problems that may differ in size by factors of up to a trillion. Consider that for a moment. The ratio between the speed necessary to travel to the Moon and human walking speed is less than 10,000:1, and that is absolutely tiny compared to the range of input sizes software may face. And because software may face an astronomical range of input sizes, there is the potential for the Big O complexity of an algorithm, its fundamental scaling nature, to trump any implementation details.

Consider the canonical sorting example. Bubble-sort is O(n^2) while merge-sort is O(n log n). Let's say you have two sorting applications: application A, which uses bubble-sort, and application B, which uses merge-sort, and let's say that for input sizes of around 30 elements application A is 1,000x faster than application B at sorting. If you never have to sort much more than 30 elements then it's obvious that you should prefer application A, as it is much faster at these input sizes. However, if you find that you may have to sort ten million items, then you'd expect application B to end up vastly faster than application A, entirely due to the way each algorithm scales.
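
A back-of-envelope sketch of that crossover. The cost models and constants below are invented, calibrated only so that A has a 1,000x head start at n = 30; with these particular constants B overtakes somewhere around n = 100,000, and the exact figures always depend on the hidden constant factors:

    import math

    # Invented cost models: A is O(n^2), B is O(n log n) with a big constant.
    C = 1000 * 30 ** 2 / (30 * math.log2(30))    # makes A 1,000x cheaper at n = 30

    def cost_a(n):
        return n ** 2

    def cost_b(n):
        return C * n * math.log2(n)

    for n in (30, 1_000, 100_000, 10_000_000):
        ratio = cost_a(n) / cost_b(n)
        print(f"n={n}: cost A / cost B = {ratio:,.3f}")   # >1 means B is now cheaper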

Wedge
A: 

Big O is an abstraction.

It means that it does not attempt to make any assumption about a particular platform or implementation.

As a consequence, Big O is not always pertinent because it completely ignores real-life issues that matter. A lot.

Like the poor implementation of a given programming language/library, or the poor implementation written by a programmer, the delay involved in reloading CPU caches, the cost of access to system memory, the cost of accessing disk-based data, or the cost of a given arithmetic operation as compared to another.

Conclusion: abstractions are a lot of fun, but in the end daydreaming is less useful than experimentation.

ALWAYS benchmark your code. Don't make assumptions (especially ones based on abstractions).
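
One minimal way to do that with Python's standard library (a sketch; substitute your own code for the two snippets being compared):

    import timeit

    # Measure two ways of building the same list instead of assuming.
    comp = timeit.timeit("[i * 2 for i in range(1000)]", number=10_000)
    loop = timeit.timeit(
        "xs = []\nfor i in range(1000): xs.append(i * 2)",
        number=10_000,
    )
    print(f"comprehension: {comp:.3f}s, append loop: {loop:.3f}s")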

Pierre
-1, this talks about Big O but does not define it; the OP asked for a definition
Lord Torgamus
I would think your real problems wouldn't be relevant if you had problems with Big O.
ccook
You really should delete this answer.
kirk.burleson
If you delete an answer with -3 or less you get a badge, it's called "Peer pressure".
Andrei Rinea
+3  A: 

Big O notation is a way of describing the upper bound of an algorithm in terms of space or running time. The n is the number of elements in the problem (i.e. the size of an array, the number of nodes in a tree, etc.). We are interested in describing the running time as n gets big.

When we say some algorithm is O(f(n)) we are saying that the running time (or space required) of that algorithm is never more than some constant times f(n), once n is large enough.

To say that binary search has a running time of O(log n) is to say that there exists some constant c which you can multiply log(n) by that will always be larger than the running time of binary search. In this case you will always perform some constant factor of log(n) comparisons.

In other words, where g(n) is the running time of your algorithm, we say that g(n) = O(f(n)) when g(n) <= c*f(n) for all n > k, where c and k are constants.
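
A quick numeric check of that definition against binary search, picking c = 2 and k = 2 (my choices):

    import math

    # Worst-case binary search comparisons: g(n) = floor(log2 n) + 1.
    # Claim: g(n) <= 2 * log2(n) for n > 2, i.e. g(n) = O(log n).
    for n in (4, 100, 10_000, 1_000_000):
        g = math.floor(math.log2(n)) + 1
        assert g <= 2 * math.log2(n)
        print(n, g, round(2 * math.log2(n), 1))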

white rabbit
+3  A: 

There is a lecture dedicated to the complexity of algorithms in Lecture 8 of the MIT "Introduction to Computer Science and Programming" course: http://www.youtube.com/watch?v=ewd7Lf2dr5Q

It is not completely plain English, but it gives a nice explanation, with examples that are easy to understand.

ivanjovanovic