views: 2404
answers: 22

Whenever I consider algorithms/data structures I tend to replace the log(N) parts with constants. Oh, I know log(N) diverges - but does it matter in real-world applications?

log(infinity) < 100 for all practical purposes.

I am really curious about real-world examples where this doesn't hold.

To clarify:

  • I understand O(f(N)).
  • I am curious about real-world examples where the asymptotic behaviour matters more than the constants of the actual performance.
  • If log(N) can be replaced by a constant, it can still be replaced by a constant in O(N log N).

This question is for the sake of (a) entertainment and (b) to gather arguments to use if I run (again) into a controversy about the performance of a design.

+5  A: 

I think this is a pragmatic approach; log(N) will never be more than 64 (for anything that fits in a 64-bit address space). In practice, whenever terms get as 'small' as O(logN), you have to measure to see if the constant factors win out. See also

http://stackoverflow.com/questions/1424303/uses-of-ackermann-function/1424397#1424397

To quote myself from comments on another answer:

[Big-Oh] 'Analysis' only matters for factors that are at least O(N). For any smaller factor, big-oh analysis is useless and you must measure.

and

"With O(logN) your input size does matter." This is the whole point of the question. Of course it matters... in theory. The question the OP asks is, does it matter in practice? I contend that the answer is no, there is not, and never will be, a data set for which logN will grow so fast as to always be beaten a constant-time algorithm. Even for the largest practical dataset imaginable in the lifetimes of our grandchildren, a logN algorithm has a fair chance of beating a constant time algorithm - you must always measure.

EDIT

A good talk:

http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey

about halfway through, Rich discusses Clojure's hash tries, which are clearly O(logN), but the base of the logarithm is large and so the depth of the trie is at most 6 even if it contains 4 billion values. Here "6" is still an O(logN) value, but it is an incredibly small value, and so choosing to discard this awesome data structure because "I really need O(1)" is a foolish thing to do. This emphasizes how most of the other answers to this question are simply wrong from the perspective of the pragmatist who wants their algorithm to "run fast" and "scale well", regardless of what the "theory" says.
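A quick back-of-the-envelope check of that depth claim (a sketch of the arithmetic only, not Clojure's actual implementation; the 32-way branching factor is the one Rich describes):

    import math

    # Depth of a k-ary trie over n keys is about log base k of n.
    # Clojure's hash tries branch 32 ways, so the log base is 32.
    for n in (10**6, 10**9, 2**32):
        print(f"n = {n:>13,}: about {math.log(n, 32):.1f} levels deep")
    # n =     1,000,000: about 4.0 levels deep
    # n = 1,000,000,000: about 6.0 levels deep
    # n = 4,294,967,296: about 6.4 levels deep

In other words, the log is technically there, but the base is big enough that you run out of memory long before the depth becomes noticeable.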

EDIT

See also

http://queue.acm.org/detail.cfm?id=1814327

which says

What good is an O(log2(n)) algorithm if those operations cause page faults and slow disk operations? For most relevant datasets an O(n) or even an O(n^2) algorithm, which avoids page faults, will run circles around it.

(but go read the article for context).

Brian
A clearer restatement of the original question.
Nick Johnson
I find it interesting that you propose a dataset that could potentially take the entirety of our grandchildren's lives to run, and you want to code it both ways (O(1) and O(logN)) and measure the times with test data. You know, instead of being pragmatic about it (like your answer suggests being) and just choosing the one that academically should fare better. If your algorithm is literally costing lives to run, wouldn't you rather have something more than a hunch to fall back on when people question why it didn't finish in time?
mrduclaw
I apologize if I was unclear. What I meant regarding grandchildren is that maybe today the biggest dataset you'll use is perhaps on the order of 10^9, and I can imagine 50 years from now it might be 10^20, or whatever, but even then my assertion still holds. Even for incredibly huge numbers, logN is still small enough that you can't make practical decisions between logN and 1 based on complexity theory.
Brian
I disagree completely. Our datasets continue to grow. What you are considering is that we might reach 10^20 "nodes" of information. We agree. Where we differ is that I think each "node" (or dataset on a person) will contain gigabytes of information. At that point, you're above log2(n) = 64. It DOES make a difference as the datasets grow, and they continue to.
San Jacinto
+48  A: 

Big O notation tells you about how your algorithm changes with growing input. O(1) tells you it doesn't matter how much your input grows, the algorithm will always be just as fast. O(logn) says that the algorithm will be fast, but as your input grows it will take a little longer.

O(1) and O(logn) make a big difference when you start to combine algorithms.

Take doing joins with indexes, for example. If you could do a join in O(1) instead of O(logn) you would have huge performance gains. For example, with O(1) you can join any number of times and you still have O(1). But with O(logn) you need to multiply the operation count by logn each time.
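To make the difference concrete, here is a minimal sketch (toy data, not a real database engine) contrasting a hash join, where each probe is O(1), with an index join over a sorted list, where each probe is an O(logn) binary search:

    import bisect

    orders = [(101, "apple"), (103, "banana"), (101, "cherry")]   # (customer_id, item)
    customers = sorted([(100, "Ann"), (101, "Bob"), (103, "Cy")]) # sorted by id

    # Hash join: build the table once, then every probe is O(1).
    by_id = dict(customers)
    hash_join = [(by_id[cid], item) for cid, item in orders]

    # Index join: every probe is an O(logn) binary search on the sorted ids.
    ids = [cid for cid, _ in customers]
    def lookup(cid):
        return customers[bisect.bisect_left(ids, cid)][1]
    index_join = [(lookup(cid), item) for cid, item in orders]

    print(hash_join)   # [('Bob', 'apple'), ('Cy', 'banana'), ('Bob', 'cherry')]
    print(index_join)  # same result, but each probe costs logn instead of O(1)

With m probes the totals are O(m) versus O(m logn), and every extra join multiplies in another logn factor.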

For large inputs, if you had an algorithm that was O(n^2) already, you would much rather do an operation that was O(1) inside, and not O(logn) inside.

Also remember that the Big-O of anything can come with a constant overhead. Let's say that constant overhead is 1 million operations. With O(1) that overhead stays put; with O(logn) it gets amplified by the logn factor as the input grows.

Another point: everyone thinks of the n in O(logn) as, say, the number of elements in a tree data structure. But it could be anything, including bytes in a file.

Brian R. Bondy
Of course there is a huge difference between O(N) and O(logN). The question is whether there is a practical difference between O(logN) and O(1).
Brian
@Brian: Agreed; I covered this in the 2nd paragraph just as you posted that comment.
Brian R. Bondy
No, you would not rather do O(1) rather than O(logN) inside the loop. You'd rather do whichever one is actually faster, which requires measuring. That's the whole point of the OP. You are completely missing the point.
Brian
@Brian: Thanks for the suggestion, I added "For large inputs" before that statement. The answer does not miss the point, but that extra disclaimer to that statement was needed. Thanks for the suggestion! :)
Brian R. Bondy
Measuring only tells you how fast your algorithm will run with *this* size input. It doesn't tell you how fast it'd perform if the input size doubled. big-O notation does. You can't replace one with the other. I think Brian R. Bondy understands the point just fine.
jalf
I'm not trying to suggest that you need qualification (e.g. 'for large inputs'), I'm trying to suggest that you are flat out wrong. :) In practice, an algorithm that takes logN steps will always outperform an algorithm that takes 100 steps, regardless of the input size (under the extremely reasonable assumption that the input size is never bigger than 2^64 elements).
Brian
@Brian: There always exists a dataset that can prove your last statement wrong. :)
Michael Foukarakis
I made no claim about "regardless of the input size", you added that in yourself. Big O notation tells you about how many operations you'll need as your input grows. With O(1) your input size doesn't matter. With O(logn) your input size does matter. Yes, I completely agree that you can have an O(1) algorithm that is better than an O(logn) algorithm for all of the inputs that are related to your exact problem. But there is a big, important difference.
Brian R. Bondy
@Michael Foukarakis: No there isn't, because the last statement assumes the same constant overhead for both the O(1) and O(logn) algorithm.
Brian R. Bondy
@Brian: Here is a scenario to consider: an algorithm that takes 100 * logn steps is O(logn). For n > 10, an O(1) algorithm with a smaller overhead will outperform it. Also recall what Bondy said: when you combine two algorithms with different big-O behavior, it is very significant to leave O(logn) denoted as it is.
Paul Lammertsma
Well phrased, Bondy: "With O(1) your input size doesn't matter. With O(logn) your input size does matter." As for the answer to the primary question "does it matter in real world applications?": yes, it certainly does. Consider performing an O(logn) algorithm within an O(n) algorithm. The big-O notation is now O(n logn).
Paul Lammertsma
"With O(logN) your input size does matter." This is the whole point of the question. Of course it matters... _in theory_. The question the OP asks is, does it matter _in practice_? I contend that the answer is no, there is not, and never will be, a data set for which logN will grow so fast as to always be beaten a constant-time algorithm. Even for the largest practical dataset imaginable in the lifetimes of our grandchildren, a logN algorithm has a fair chance of beating a constant time algorithm - you must always measure.
Brian
@Brian: Is there a practical difference between an algorithm that runs in `n` versus one that runs in `2*n`?
Paul Lammertsma
@Brian: We get it, you understand this point: "for some inputs you can find an O(1) algorithm that will not perform as well as an O(logn) algorithm". In practice I'll continue to choose an O(1) algorithm over an O(logn) algorithm when the constant factor overhead is the same, and you can choose the O(logn) one.
Brian R. Bondy
You shouldn't choose based on 'analysis', you should choose by 'measurement'. 'Analysis' only matters for factors that are at least O(N). For any smaller factor, big-oh analysis is useless and you must measure. That is my thesis. I've said my bit, good night! :)
Brian
Measurement is only good for the specific inputs that you know beforehand.
Brian R. Bondy
@Brian your entire thesis is about upholding the fundamentals of asymptotic analysis? Is this a respected school/advisor granting you this opportunity?
San Jacinto
@San Jacinto: Sorry, I don't understand the question.
Brian R. Bondy
@San Jacinto: I'm speaking from a practical sense, not academically.
Brian R. Bondy
@Brian-r.-Bondy sorry, I meant the OTHER brian. :)
San Jacinto
@San Jacinto: Oops sorry :)
Brian R. Bondy
Another reason to measure is that some algorithms just play nicer with how computers actually work (cache-hits, branch prediction) -- https://www.cs.tcd.ie/publications/tech-reports/reports.05/TCD-CS-2005-57.pdf
Lou Franco
@San Jacinto: not 'graduate thesis', just 'thesis' in the sense of 'my central point that I am trying to communicate' on this question.
Brian
@Brian: I find it totally bizarre that you think that O(log n) is negligible for practical input sizes. Binary search is O(log n). Variable usage is O(1). If you need some value several times, would you apply binary search for it each time, or would you stick it in a variable? **Do you need to measure before answering?** ... If N becomes sufficiently large, O(1) will always win out in the end. Saying that your inputs will "never" get large enough for that to matter is no different than saying *640k should be enough for anybody!*
Adam Bellaire
Just to clarify Adam means the other Brian in this conversation thread.
Brian R. Bondy
+5  A: 

Equality, the way you're describing it, is a common abuse of notation.

To clarify: we usually write f(x) = O(logN) to imply "f(x) is O(logN)".

At any rate, O(1) means a constant number of steps/time (as an upper bound) to perform an action regardless of how large the input set is. But for O(logN), the number of steps/time still grows as a function of the input size (the logarithm of it); it just grows very slowly. For most real-world applications you may be safe in assuming that this number of steps will not exceed 100, however I'd bet there are multiple examples of datasets large enough to render your statement both dangerous and void (packet traces, environmental measurements, and many more).

Michael Foukarakis
How do you figure big O notation isn't used for practical purposes? I have used it a few times directly, many times indirectly as a guide, and I have seen others make silly mistakes because they didn't understand it.
Draemon
I'm sorry but that's a very wrong statement. Big O is very much used for practical purposes; it's a very important way to gauge the scalability of 2 different algorithms. But I do agree, the OP's usage is a very common abuse.
Falaina
I use it as well, but it only describes the asymptotic behaviour of a function; there are still a lot of practical (read: implementation-defined) factors to consider when making a statement like the OP did.
Michael Foukarakis
Maybe you should rephrase your answer a bit then. I see what you mean, but it's a bit misleading to say that it's "not used for practical purposes"
jalf
I can see how it can be misunderstood. Took it out and added some clarifications for the OP.
Michael Foukarakis
I am actually asking for those examples that you bet exist. I do believe so, too, but well that's the 'I'm curious' part. And btw. it's "F(x) in O(log x)" :-P
phoku
There's no way to know unless we're talking about a specific problem that has two solutions with the specified complexities, and so on. But it's not hard to believe, if you know the math - just generate/collect data until your dataset is large enough..
Michael Foukarakis
A: 

O(log N) can be misleading. Take, for example, the operations on red-black trees.
The operations are O(logN) but rather complex, which means many low-level operations and hence a large hidden constant.

Nick D
+18  A: 

This is a common mistake - remember Big O notation is NOT telling you about the absolute performance of an algorithm at a given value, it's simply telling you the behavior of an algorithm as you increase the size of the input.

When you take it in that context it becomes clear why an algorithm A ~ O(logN) and an algorithm B ~ O(1) are different:

if I run A on an input of size a, then on an input of size 1000000*a, I can expect the second input to take roughly log(1,000,000) times as long as the first input

if I run B on an input of size a, then on an input of size 1000000*a, I can expect the second input to take about the same amount of time as the first input

EDIT: Thinking over your question some more, I do think there's some wisdom to be had in it. While I would never say it's correct to say O(lgN) == O(1), it IS possible that an O(lgN) algorithm might be used over an O(1) algorithm. This comes back to the point about absolute performance above: just knowing one algorithm is O(1) and another algorithm is O(lgN) is NOT enough to declare you should use the O(1) over the O(lgN); it's certainly possible that, given your range of possible inputs, an O(lgN) might serve you best.

Falaina
What he's saying (if I understood correctly) is that you'd need a considerably larger than "1000 000 * a" input in order to take even 100 times as long as the "a" input. log(1000000) = 6, so if you grow the input 1000 000 times, you'd only have a run time that is 6 times slower
laura
Right, I realized what he was saying afterwards. It all comes down to whether or not you'd care about that lg(N) speed factor. I guess the claim could be that who cares about a factor of lg(N) difference, but that comes down to the performance requirements of the application.
Falaina
At best, the OP is cautioning against blind belief that an O(1) algorithm is always faster than O(log(n)); but come on, anyone who actually learnt big-O notation at school should remember the caveats.
Calyth
+3  A: 

What do you mean by whether or not it "matters"?

If you're faced with the choice of an O(1) algorithm and an O(lg n) one, then you should not assume they're equal. You should choose the constant-time one. Why wouldn't you?

And if no constant-time algorithm exists, then the logarithmic-time one is usually the best you can get. Again, does it then matter? You just have to take the fastest you can find.

Can you give me a situation where you'd gain anything by defining the two as equal? At best, it'd make no difference, and at worst, you'd hide some real scalability characteristics. Because usually, a constant-time algorithm will be faster than a logarithmic one.

Even if, as you say, lg(n) < 100 for all practical purposes, that's still a factor of 100 on top of your other overhead. If I call your function N times, it starts to matter whether your function runs in logarithmic time or constant time, because the total complexity is then O(n lg n) or O(n).

So rather than asking if "it matters" that you assume logarithmic complexity to be constant in "the real world", I'd ask if there's any point in doing that.

Often you can assume that logarithmic algorithms are fast enough, but what do you gain by considering them constant?

jalf
Of course it can matter - the O(log N) algorithm might be simpler, easier to maintain and faster to implement.
phoku
@phoku: No one is arguing that you can find some input cases where an O(logn) algorithm will be faster than an O(1) algorithm. But just that in general you should choose an O(1) algorithm when all else is the same. Everyone is so caught up with that first line in this comment that they overlook that there is no reason that the O(1) algorithm would have a much bigger constant overhead than the O(logn) algorithm.
Brian R. Bondy
@phoku: Then it matters whether the O(log N) algorithm is *efficient enough*. It doesn't matter whether it is constant time. It matters whether it is fast enough to be usable.
jalf
+2  A: 

O(logN) * O(logN) * O(logN) is very different. O(1) * O(1) * O(1) is still constant. Also, a simple quicksort-style O(n logn) is different from O(n * O(1)) = O(n). Try sorting 1,000 and 1,000,000 elements. The latter isn't 1,000 times slower, it's 2,000 times slower, because log(n^2) = 2 log(n).
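A quick check of that 2,000x figure (just the operation-count arithmetic, assuming base-2 logs and equal constant factors):

    import math

    n1, n2 = 1_000, 1_000_000
    ratio = (n2 * math.log2(n2)) / (n1 * math.log2(n1))
    print(ratio)  # ~2000: since n2 = n1**2, log(n2) = 2*log(n1)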

O(logN) is exactly the same as O(log(N^c)).
Michael Foukarakis
O(logN) is the same as O(log(N^c)), but O(log^2 N) is not.
Olexiy
+1  A: 

Yes, log(N) < 100 for most practical purposes, and no, you cannot always replace it by a constant.

For example, this may lead to serious errors in estimating the performance of your program. If an O(N) program processed an array of 1000 elements in 1 ms, then you are sure it will process 10^6 elements in 1 second (or so). If, though, the program is O(N*logN), then it will take it ~2 secs to process 10^6 elements. This difference may be crucial - for example, you may think you've got enough server power because you get 3000 requests per hour and you think your server can handle up to 3600.
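Worked out explicitly (a sketch of the arithmetic, assuming base-2 logs and the same constant factor at both sizes):

    import math

    # O(N*logN) program measured at N = 1,000: 1 ms. Solve for the constant c.
    c = 0.001 / (1_000 * math.log2(1_000))
    predicted = c * (10**6 * math.log2(10**6))
    print(f"{predicted:.2f} s")  # 2.00 s, not the 1 s a naive O(N) extrapolation gives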

Another example. Imagine you have a function f() working in O(logN), and on each iteration calling a function g(), which works in O(logN) as well. Then, if you replace both logs by constants, you think that your program works in constant time. Reality will be cruel though - the two logs may give you up to a 100*100 multiplier.
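And a small counting sketch of the nested case (f and g here are hypothetical stand-ins that just tally their steps):

    import math

    def g(n):                        # stands in for an O(logN) helper
        return int(math.log2(n))     # number of steps g performs

    def f(n):                        # O(logN) iterations, each calling g
        steps = 0
        for _ in range(int(math.log2(n))):
            steps += g(n)
        return steps

    print(f(2**20))  # 400 = 20 * 20; treating both logs as constants hides this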

Olexiy
Thanks. Great example with reasonable values.
phoku
Please define a "practical" purpose. Your "practical" purpose is a lot different than my friends' "practical" purpose in Biological research at University.
San Jacinto
Ok-ok, reworded a bit.
Olexiy
BTW - pure log(N) time assumes some preprocessing and thus cannot work with such huge amounts of data (has humanity ever produced enough hard drives to store 2^100 bits?)
Olexiy
+3  A: 

For small enough N, O(N^N) can in practice be replaced with 1. Not O(1) (by definition), but for N=2 you can see it as one operation with 4 parts, or a constant-time operation.

What if all operations take 1 hour? The difference between O(log N) and O(1) is then large, even with small N.

Or what if you need to run the algorithm ten million times? OK, that took 30 minutes, so when I run it on a dataset a hundred times as large it should still take 30 minutes, because O(logN) is "the same" as O(1)... eh... what?

Your statement that "I understand O(f(N))" is clearly false.

Real world applications, oh... I don't know.... EVERY USE OF O()-notation EVER?

Binary search in sorted list of 10 million items for example. It's the very REASON we use hash tables when the data gets big enough. If you think O(logN) is the same as O(1), then why would you EVER use a hash instead of a binary tree?
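A rough timing sketch of exactly that trade-off (set membership versus binary search over 10 million sorted integers; absolute numbers will vary by machine):

    import bisect, timeit

    n = 10_000_000
    keys = list(range(n))    # sorted list: binary search needs ~23 comparisons
    table = set(keys)        # hash table: O(1) amortized lookups

    def via_bisect():
        i = bisect.bisect_left(keys, n - 1)
        return keys[i] == n - 1

    def via_hash():
        return (n - 1) in table

    print(timeit.timeit(via_bisect, number=100_000))
    print(timeit.timeit(via_hash, number=100_000))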

Thomas
Fair enough: consider C = the number of instructions such that the execution time is larger than the estimated age of the universe. Any algorithm with such a runtime is in O(1). An algorithm running in O(exp(N)) with a small (enough) constant is better in the sense that there exists an N such that the algorithm will finish before I die.
phoku
@phoku this only works for this particular input. In this case, you might as well just hard-code the instructions necessary and achieve an O(1) algorithm. I'm not sure what you're trying to prove here. When you examine your potential input size, you'll know whether to choose an algorithm with high constants or a log(n) algorithm.
San Jacinto
@phoku: right, but we don't *always* use hash tables instead of binary trees either. A list of 10 elements is pretty much always searched faster than a hashtable lookup. A hashtable is O(1) (amortized) but with a more expensive operation than a normal binary search. Where the breaking point is depends on your data.
Thomas
@phoku: To clarify: I only answered your third sentence. Your second sentence seems like nonsense. Just because you have an unfathomable long (but finite) time to do something doesn't mean you can accomplish anything and everything in that time, no matter what the input size. You'll have to define C as "the set of instructions that when run will solve everything", which is provably false (see halting problem).
Thomas
+1  A: 

As others have pointed out, Big-O tells you about how the performance of your problem scales. Trust me - it matters. I have several times encountered algorithms that were just terrible and failed to meet the customers' demands because they were too slow. Understanding the difference and finding an O(1) solution is, a lot of the time, a huge improvement.

However, of course, that is not the whole story - for instance, you may notice that quicksort implementations will always switch to insertion sort for small inputs (Wikipedia says 8-20 elements) because of the behaviour of both algorithms on small datasets.
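A minimal sketch of that hybrid (the cutoff of 16 is an arbitrary pick from the 8-20 range mentioned above):

    import random

    CUTOFF = 16  # below this size, insertion sort's tiny constants win

    def insertion_sort(a, lo, hi):
        for i in range(lo + 1, hi + 1):
            x, j = a[i], i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    def hybrid_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo < CUTOFF:
            insertion_sort(a, lo, hi)   # O(k^2), but k is tiny here
            return
        mid = (lo + hi) // 2            # pick the middle element as pivot
        a[mid], a[hi] = a[hi], a[mid]   # move pivot to the end (Lomuto partition)
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        hybrid_quicksort(a, lo, i - 1)
        hybrid_quicksort(a, i + 1, hi)

    data = random.sample(range(10_000), 1_000)
    hybrid_quicksort(data)
    assert data == sorted(data)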

So it's a matter of understanding what tradeoffs you will be making, which involves a thorough understanding of the problem, the architecture, and the experience to understand which to use and how to adjust the constants involved.

No one is saying that O(1) is always better than O(log N). However, I can guarantee you that an O(1) algorithm will also scale way better, so even if you make incorrect assumptions about how many users will be on the system, or the size of the data to process, it won't matter to the algorithm.

Vitali
+5  A: 

You asked for a real-world example. I'll give you one. Computational biology. One strand of DNA encoded in ASCII is somewhere on the level of gigabytes in space. A typical database will obviously have many thousands of such strands.

Now, in the case of an indexing/searching algorithm, that log(n) multiple makes a large difference when coupled with constants. The reason why? This is one of the applications where the size of your input is astronomical. Additionally, the input size will always continue to grow.

Admittedly, these types of problems are rare. There are only so many applications this large. In those circumstances, though... it makes a world of difference.

San Jacinto
Thanks for the example. However, that's still below 100, even using base 2.
phoku
I'm not sure what difference that makes. If you have constructed an algorithm with low OR high constants, this log(n) multiplier makes a great difference. I don't understand why 100 is the magic number. If it takes 10 minutes to make one pass of the innermost parts of the algorithm, why does 16*10 minutes seem just as innocuous as 4*10 minutes? It will take another 2 hours to run!
San Jacinto
+1  A: 

The rules of determining the Big-O notation are simpler when you don't decide that O(log n) = O(1).

As krzysio said, you may accumulate O(log n)s and then they would make a very noticeable difference. Imagine you do a binary search: O(log n) comparisons, and then imagine that each comparison's complexity is O(log n). If you neglect both you get O(1) instead of O(log^2 n). Similarly you may somehow arrive at O(log^10 n) and then you'll notice a big difference even for not-too-large "n"s.

yairchu
+1  A: 

The title of the question is misleading (well chosen to drum up debate, mind you).

O(log N) == O(1) is obviously wrong (and the poster is aware of this). Big O notation, by definition, regards asymptotic analysis. When you see O(N), N is taken to approach infinity. If N is assigned a constant, it's not Big O.

Note, this isn't just a nitpicky detail that only theoretical computer scientists need to care about. All of the arithmetic used to determine the O function for an algorithm relies on it. When you publish the O function for your algorithm, you might be omitting a lot of information about its performance.

Big O analysis is cool, because it lets you compare algorithms without getting bogged down in platform specific issues (word sizes, instructions per operation, memory speed versus disk speed). When N goes to infinity, those issues disappear. But when N is 10000, 1000, 100, those issues, along with all of the other constants that we left out of the O function, start to matter.

To answer the question of the poster: O(log N) != O(1), and you're right, algorithms with O(1) are sometimes not much better than algorithms with O(log N), depending on the size of the input, and all of those internal constants that got omitted during Big O analysis.

If you know you're going to be cranking up N, then use Big O analysis. If you're not, then you'll need some empirical tests.

Scott A Miller
A: 

I do not believe algorithms where you can freely choose between O(1) with a large constant and O(logN) really exist. If there are N elements to work with at the beginning, it is just plain impossible to make it O(1); the only thing you can do is move your N to some other part of your code.

What I'm trying to say is that in all real cases I know of, you have some space/time trade-off, or some pre-treatment such as compiling data to a more efficient form.

That is, you do not really go O(1), you just move the N part elsewhere. Either you exchange the performance of some part of your code for some amount of memory, or you exchange the performance of one part of your algorithm for another. To stay sane, you should always look at the larger picture.

My point is that if you have N items, they can't disappear. In other words, you can choose between an inefficient O(N^2) algorithm (or worse) and O(N log N): it's a real choice. But you never really go O(1).

What I'm trying to point out is that for every problem and initial data state there is a "best" algorithm. You can do worse, but never better. With some experience you can make a good guess at this intrinsic complexity. Then, if your overall treatment matches that complexity, you know you have something. You won't be able to reduce that complexity, only to move it around.

If the problem is O(N), it won't become O(logN) or O(1); you'll merely add some pre-treatment such that the overall complexity is unchanged or worse, and potentially a later step will be improved. Say you want the smallest element of an array: you can search in O(N), or sort the array using any common O(N log N) sort and then take the first element in O(1).

Is it a good idea to do that casually? Only if your problem also asks for the second, third, etc. elements. Then your initial problem was truly O(N log N), not O(N).

And it's not the same if you wait ten or twenty times longer for your result because you simplified by saying O(1) = O(log N).

I'm waiting for a counter-example ;-) - that is, any real case where you have a choice between O(1) and O(log N) and where every O(log N) step won't compare to the O(1). All you can do is take a worse algorithm instead of the natural one, or move some heavy treatment to some other part of the larger picture (pre-computing results, using storage space, etc.)

kriss
Well, there's a trivial counterexample: something like "Return the first element of an array." You may be given N elements, but you only need to look at one of them. If you have to look at all n elements, your algorithm has a lower bound of O(n), but you may have optimizable parts of your algorithm. For instance, I could write an O(log n) algorithm that calculates the first element by using a binary search on the index of the item I'm looking at (or something silly like that). It may not slow my algorithm as a whole, but it slows down that part, even if the whole thing is O(n) or more.
DivineWolfwood
+1  A: 

In theory

Yes, in practical situations log(n) is bounded by a constant, we'll say 100. However, replacing log(n) by 100 in situations where it's correct is still throwing away information, making the upper bound on operations that you have calculated looser and less useful. Replacing an O(log(n)) by an O(1) in your analysis could result in your large n case performing 100 times worse than you expected based on your small n case. Your theoretical analysis could have been more accurate and could have predicted an issue before you'd built the system.

I would argue that the practical purpose of big-O analysis is to try and predict the execution time of your algorithm as early as possible. You can make your analysis easier by crossing out the log(n) terms, but then you've reduced the predictive power of the estimate.

In practice

If you read the original papers by Larry Page and Sergey Brin on the Google architecture, they talk about using hash tables for everything to ensure that e.g. the lookup of a cached web page takes only one hard-disk seek. If you used B-tree indices to do the lookup, you might need four or five hard-disk seeks for an uncached lookup [*]. Quadrupling your disk requirements on your cached web page storage is worth caring about from a business perspective, and predictable if you don't cast out all the O(log(n)) terms.

P.S. Sorry for using Google as an example, they're like Hitler in the computer science version of Godwin's law.

[*] Assuming 4KB reads from disk, 100bn web pages in the index, ~ 16 bytes per key in a B-tree node.
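Plugging the footnote's numbers in (a sketch of the arithmetic only):

    import math

    node_bytes, key_bytes, pages = 4096, 16, 100e9
    fanout = node_bytes // key_bytes   # ~256 keys per 4KB B-tree node
    depth = math.log(pages, fanout)    # tree depth ~= disk seeks per lookup
    print(f"fanout = {fanout}, seeks ~= {depth:.1f}")  # fanout = 256, seeks ~= 4.6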

Alex
A: 

Assume that in your entire application, one algorithm accounts for 90% of the time the user waits for the most common operation.

Suppose in real time the O(1) operation takes a second on your architecture, and the O(logN) operation is basically .5 seconds * log(N). Well, at this point I'd really like to draw you a graph with an arrow at the intersection of the curve and the line, saying, "It matters right here." You want to use the log(N) op for small datasets and the O(1) op for large datasets, in such a scenario.
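The intersection is easy to compute (a sketch, assuming base-10 logs in the 0.5 s * log(N) cost model above):

    import math

    def cost_constant(n):
        return 1.0                    # the O(1) operation: a flat second

    def cost_log(n):
        return 0.5 * math.log10(n)    # the O(logN) operation

    for n in (10, 100, 10**4, 10**8):
        pick = "log(N) op" if cost_log(n) < cost_constant(n) else "O(1) op"
        print(f"N = {n:>11,}: log op costs {cost_log(n):.1f} s -> use the {pick}")
    # crossover at log10(N) = 2, i.e. N = 100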

For operations that are already cheap, Big-O notation and performance optimization are an academic exercise rather than something that delivers real value to the user; but if it's an expensive operation on a critical path, then you bet it matters!

Chris Moschini
+3  A: 

As many have already said, for the real world, you need to look at the constant factors first, before even worrying about factors of O(log N).

Then, consider what you will expect N to be. If you have good reason to think that N<10, you can use a linear search instead of a binary one. That's O(N) instead of O(log N), which according to your lights would be significant -- but a linear search that moves found elements to the front may well outperform a more complicated balanced tree, depending on the application.
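For example, a self-organizing list like this minimal sketch keeps recently found elements at the front, so on a small or skewed workload it can beat a fancier O(log N) structure despite its O(N) worst case:

    def mtf_find(items, key):
        """Linear search with move-to-front: O(N) worst case, but a
        repeated key is found on the first comparison next time."""
        for i, item in enumerate(items):
            if item == key:
                items.insert(0, items.pop(i))  # promote the hit to the front
                return True
        return False

    data = [3, 9, 4, 1, 7]
    mtf_find(data, 7)
    print(data)  # [7, 3, 9, 4, 1]; the next search for 7 is one comparison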

On the other hand, note that, even if log N is not likely to exceed 50, a performance factor of 10 is really huge -- if you're compute-bound, a factor like that can easily make or break your application. If that's not enough for you, you'll frequently see factors of (log N)^2 or (logN)^3 in algorithms, so even if you think you can ignore one factor of (log N), that doesn't mean you can ignore more of them.

Finally, note that the simplex algorithm for linear programming has a worst case performance of O(2^n). However, for practical problems, the worst case never comes up; in practice, the simplex algorithm is fast, relatively simple, and consequently very popular.

About 30 years ago, someone developed a polynomial-time algorithm for linear programming (Khachiyan's ellipsoid method), but it was not initially practical because it was too slow.

Nowadays, there are practical alternative algorithms for linear programming (with polynomial-time worst case, for what that's worth) which can outperform the simplex method in practice. But, depending on the problem, the simplex method is still competitive.

comingstorm
+1  A: 

For any algorithm that can take inputs of different sizes N, the number of operations it takes is upper-bounded by some function f(N).

All big-O tells you is the shape of that function.

  • O(1) means there is some number A such that f(N) < A for large N.

  • O(N) means there is some A such that f(N) < AN for large N.

  • O(N^2) means there is some A such that f(N) < AN^2 for large N.

  • O(log(N)) means there is some A such that f(N) < A log N for large N.

Big-O says nothing about how big A is (i.e. how fast the algorithm is), or where these functions cross each other. It only says that when you are comparing two algorithms, if their big-Os differ, then there is a value of N (which may be small or it may be very large) where one algorithm will start to outperform the other.

Mike Dunlavey
+3  A: 

You might be interested in Soft-O, which ignores logarithmic cost. Check this paragraph in Wikipedia.

sdcvvc
A: 

Let's say you use an image-processing algorithm that runs in O(log N), where N is the number of images. Now... stating that it runs in constant time would make one believe that no matter how many images there are, it would still complete its task in about the same amount of time. If running the algorithm on a single image would hypothetically take a whole day, and assuming that O(logN) will never be more than 100... imagine the surprise of the person who tries to run the algorithm on a very large image database - he would expect it to be done in a day or so... yet it'll take months to finish.

luvieere
+2  A: 

The observation that O(log n) is oftentimes indistinguishable from O(1) is a good one.

As a familiar example, suppose we wanted to find a single element in a sorted array of 1,000,000,000,000 elements:

  • with linear search, the search takes on average 500,000,000,000 steps
  • with binary search, the search takes on average 40 steps

Suppose we added a single element to the array we are searching, and now we must search for another element:

  • with linear search, the search takes on average 500,000,000,001 steps (indistinguishable change)
  • with binary search, the search takes on average 40 steps (indistinguishable change)

Suppose we doubled the number of elements in the array we are searching, and now we must search for another element:

  • with linear search, the search takes on average 1,000,000,000,000 steps (extraordinarily noticeable change)
  • with binary search, the search takes on average 41 steps (indistinguishable change)
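Those step counts are just ceil(log2(n)), which is easy to check:

    import math

    for n in (10**12, 10**12 + 1, 2 * 10**12):
        print(n, "->", math.ceil(math.log2(n)), "steps")
    # prints 40, 40, and 41: one extra element changes nothing,
    # and doubling the array adds exactly one step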

As we can see from this example, for all intents and purposes, an O(log n) algorithm like binary search is oftentimes indistinguishable from an O(1) algorithm like omniscience.

The takeaway point is this: we use O(log n) algorithms because they are often indistinguishable from constant time, and because they often perform phenomenally better than linear-time algorithms.

Obviously, these examples assume reasonable constants. Obviously, these are generic observations and do not apply to all cases. Obviously, these points apply at the asymptotic end of the curve, not the n=3 end.

But this observation explains why, for example, we use such techniques as tuning a query to do an index seek rather than a table scan - because an index seek operates in nearly constant time no matter the size of the dataset, while a table scan is crushingly slow on sufficiently large datasets. Index seek is O(log n).

Justice
A: 

You are right: in many cases it does not matter for practical purposes. But the key question is "how fast does N grow?". For most algorithms we know of, N is the size of the input itself, so it grows linearly.

But some algorithms have the value of N derived in a complex way. If N is "the number of possible lottery combinations for a lottery with X distinct numbers", it suddenly matters whether your algorithm is O(1) or O(logN).

Andreas Petersson