Quicksort has a worst-case time complexity of O(n^2), but it is usually considered better than other sorting algorithms that have O(n log n) time complexity in the worst case.
The simplex algorithm has exponential time complexity in the worst case, but for practically any real instance it runs in polynomial time. Provably polynomial algorithms for linear programming exist, but they are very complicated and usually have large constant factors.
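As a minimal sketch of how routinely LP gets solved in practice (assuming SciPy is available; the tiny problem is purely illustrative, and whether the backend ends up using a simplex or interior-point method depends on the SciPy version):

```python
# Solve: maximize x + 2y subject to x + y <= 4, x >= 0, y >= 0.
# linprog minimizes, so the objective is negated.
from scipy.optimize import linprog

result = linprog(c=[-1, -2],
                 A_ub=[[1, 1]], b_ub=[4],
                 bounds=[(0, None), (0, None)])
print(result.x, -result.fun)   # optimal point and maximized objective value
```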
Often an algorithm (like quicksort) that can be easily parallelized or randomized will be chosen over competing algorithms that lack these qualities. Furthermore, it is often the case that an approximate solution to a problem is acceptable when an exact algorithm would yield exponential runtimes, as in the Travelling Salesman Problem.
This statement can be applied to nearly any parallel algorithm. The reason they were not heavily researched in the early days of computing is that, for a single thread of execution (think uniprocessor), they are indeed slower than their well-known sequential counterparts in terms of asymptotic complexity, constant factors for small n, or both. However, in the context of current and future computing platforms, an algorithm which can make use of a few (think multicore), a few hundred (think GPU), or a few thousand (think supercomputer) processing elements will beat the pants off the sequential version in wall-clock time, even if the total time/energy spent by all processors is much greater for the parallel version.
Sorts, graph algorithms, and linear algebra techniques alike can be accelerated in terms of wall-clock time by bearing the cost of a little extra bookkeeping, communication, and runtime overhead in order to parallelize.
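A minimal sketch of that trade-off in plain Python, using the standard multiprocessing module (the per-element workload function is just an illustrative stand-in):

```python
# Parallelizing an embarrassingly parallel map with the standard library.
# The extra process and IPC overhead is the "bookkeeping" cost paid in
# exchange for a wall-clock speedup on multicore hardware.
from multiprocessing import Pool

def expensive_step(x):
    # stand-in for real per-element work
    return sum(i * i for i in range(x))

if __name__ == "__main__":
    data = [200_000] * 16
    with Pool() as pool:                  # one worker process per core by default
        results = pool.map(expensive_step, data)
    print(len(results))
```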
"Worse is Better" can be seen in languages too, for example the ideas behind Perl, Python, Ruby, Php even C# or Java, or whatever language that isn't assembler or C (C++ might fit here or not).
Basically there is always a "perfect" solution, but many times its better to use a "worse" tool/algorithm/language to get results faster, and with less pain. Thats why people use these higher level languages, although they are "worse" from the ideal computer-language point of view, and instead are more human oriented.
I've always understood the term 'worse is better' to relate to problems whose correct solutions are very complex, but where an approximate (or good-enough) solution exists that is relatively easier to comprehend.
This makes for easier design, production, and maintenance.
After a lifetime of Windows, early Windows Vista forced me to switch to Apple.
Now people think I know nothing about computers; it's great. Two examples for you.
There's an O(n) algorithm for selecting the k-th largest element from an unsorted set, but it is rarely used instead of sorting, which is of course O(n log n).
This example would be the answer if there were no computers capable of storing these large collections.
Presumably the size of the collection was 641K.
When I worked in the technical computing group at BAE SYSTEMS, which looked after structural and aerodynamic code for various aircraft, we had a codebase going back at least 25 years (and a third of the staff had been there that long).
Many of the algorithms were optimised for performance on a 16-bit mainframe, rather than for scalability. These optimisations were entirely appropriate for 1970s hardware, but performed poorly on larger datasets on the 32- and 64-bit systems which replaced it. If you're choosing something with worse scalability which works better on the hardware you are currently working on, be aware that this is an optimisation, and it may not apply in the future. At the time those 1970s routines were written, the data sizes we put into them in the 2000s were not practical. Unfortunately, trying to extract a clear algorithm from those codes which could then be implemented to suit modern hardware was not trivial.
Short of boiling the oceans, what counts as 'all practical situations' is often a time-dependent variable.
Monte Carlo integration is a probabilistic method of calculating definite integrals that has no guarantee of returning the correct answer. Yet, in real-world situations it returns an accurate answer far faster than provably correct methods.
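A minimal sketch, assuming nothing beyond the standard library (the integrand and sample count are illustrative):

```python
# Monte Carlo estimate of the definite integral of f on [a, b].
# No correctness guarantee, but the error shrinks like 1/sqrt(samples).
import math
import random

def mc_integrate(f, a, b, samples=100_000):
    total = sum(f(random.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples

# Example: the integral of sin(x) over [0, pi] is exactly 2.
print(mc_integrate(math.sin, 0.0, math.pi))
```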
Radix sort has time complexity O(n) for fixed-length inputs, but quicksort is more often used, despite the worse asymptotic runtime, because the per-element overhead of radix sort is typically much higher.
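For comparison, a minimal sketch of an LSD radix sort for non-negative integers (the 4-byte key width is an assumption for illustration):

```python
# LSD radix sort, base 256. O(n) for fixed-width keys, but every pass walks
# the whole input and rebuilds it from buckets, which is the per-element
# overhead that often makes quicksort faster in practice.
def radix_sort(values, key_bytes=4):
    for shift in range(0, key_bytes * 8, 8):
        buckets = [[] for _ in range(256)]
        for v in values:
            buckets[(v >> shift) & 0xFF].append(v)
        values = [v for bucket in buckets for v in bucket]
    return values

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```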
OK, consider solving the travelling salesman problem. The ONLY perfect solution is to test all possible routes. However, that becomes impossible with our hardware and time limits as N increases. So we have thought of many heuristics.
Which brings us to the answer to your question. Heuristics (worse) are better than brute force for NP-complete problems. This describes the situation in which "Worse is Better" is always true.
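A minimal sketch of one such heuristic, the nearest-neighbour tour (the city coordinates are made up for illustration):

```python
# Nearest-neighbour heuristic for TSP: no optimality guarantee, but O(n^2)
# instead of the O(n!) of brute-force enumeration.
import math

def nearest_neighbour_tour(points):
    unvisited = set(range(1, len(points)))
    tour = [0]                            # start arbitrarily at city 0
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = [(0, 0), (5, 1), (1, 4), (6, 5), (2, 2)]
print(nearest_neighbour_tour(cities))
```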
Mergesort versus Quicksort
Quicksort has an average time complexity of O(n log n). It can sort arrays in place, i.e. it needs only O(1) auxiliary space beyond the recursion stack.
Merge sort also has an average time complexity of O(n log n); however, its space complexity is much worse: Θ(n). (There is a special case for linked lists.)
Because the worst-case time complexity of quicksort is Θ(n^2) (i.e. all elements fall on the same side of every pivot), while mergesort's worst case is O(n log n), mergesort is the default choice for library implementers.
In this case, I think that the predictability of mergesort's worst-case time complexity trumps quicksort's much lower memory requirements.
Given that it is possible to vastly reduce the likelihood of quicksort's worst-case time complexity (via random selection of the pivot, for example), I think one could argue that mergesort is worse in all but the pathological case of quicksort.
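A minimal sketch of quicksort with a random pivot, the mitigation mentioned above (purely illustrative, not any particular library's implementation):

```python
# In-place quicksort with a random pivot, which makes the Θ(n^2) worst case
# vanishingly unlikely on any fixed input.
import random

def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    pivot = a[random.randint(lo, hi)]
    i, j = lo, hi
    while i <= j:                          # partition around the pivot value
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    quicksort(a, lo, j)
    quicksort(a, i, hi)

data = [5, 3, 8, 1, 9, 2, 7]
quicksort(data)
print(data)
```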
Insertion sort, despite having O(n^2) complexity, is faster for small collections (n < 10) than any other sorting algorithm. That's because the nested loop is small and executes fast. Many libraries (including the STL) that provide a sort implementation actually use it for small subsets of data to speed things up.
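A minimal sketch of the algorithm in question (the cutoff of roughly 10 elements mentioned above is where hybrid library sorts typically switch to it):

```python
# Insertion sort: O(n^2), but the tight inner loop makes it the fastest
# choice for very small arrays, which is why library sorts fall back to it
# below a small cutoff.
def insertion_sort(a):
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0 and a[j] > key:       # shift larger elements one slot right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a

print(insertion_sort([7, 2, 9, 4, 1]))
```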
When calculating the median of a group of numbers, you can use an algorithm very similar to quicksort. You partition around a number, and all the bigger ones go to one side, and all the smaller ones go to the other side. Then you throw away one side and recurse into the side that still contains the median. This takes O(n^2) in the worst case, but is pretty fast (O(n) with a low constant) in the average case.
You can get guaranteed worst-case O(n) performance, with a constant of about 40. This is called the median of medians algorithm. In practice, you would never use this.
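A minimal sketch of that partition-and-recurse selection (quickselect) with a random pivot; the guaranteed-O(n) median-of-medians variant would only change how the pivot is chosen:

```python
# Quickselect: expected O(n), worst case O(n^2).
import random

def quickselect(a, k):
    """Return the k-th smallest element of a (0-indexed)."""
    pivot = random.choice(a)
    smaller = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    larger  = [x for x in a if x > pivot]
    if k < len(smaller):
        return quickselect(smaller, k)
    if k < len(smaller) + len(equal):
        return pivot
    return quickselect(larger, k - len(smaller) - len(equal))

data = [9, 1, 8, 2, 7, 3, 6]
print(quickselect(data, len(data) // 2))   # the median, 6
```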
Not quite on the mark, but backtracking-based regular expressions have an exponential worst case versus O(N) for DFA-based regular expressions, yet backtracking-based regular expressions are almost always used rather than DFA-based ones.
EDIT: (JFS)
The power that backreferences add comes at great cost: in the worst case, the best known implementations require exponential search algorithms.
This method (DFA) is really more efficient, but it also has important drawbacks:
- You have no control over the way the expression returns a match; it will always return the longest-leftmost match, no matter how you have crafted your expression
- There is no backtracking, so all repeating operators are greedy. Atomic grouping and possessive quantifiers are also meaningless
- Lookarounds are impossible
- Capturing and back-references are also impossible
- Regex pre-compilation takes longer and uses more memory
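A minimal sketch of the trade-off, assuming CPython's backtracking re module; the pattern and input are the classic pathological pair, and the exact timing varies by machine:

```python
# Catastrophic backtracking: this pattern takes exponential time to *fail*
# on this input (each extra 'a' roughly doubles the runtime), whereas a
# DFA-based engine would reject it in linear time.
import re
import time

text = "a" * 25 + "b"                  # no match: forces every possible split
start = time.perf_counter()
print(re.match(r"(a+)+$", text))       # None, after exponential backtracking
print(f"{time.perf_counter() - start:.2f} s")
```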
Monte Carlo integration was already suggested, but a more specific example is Monte Carlo pricing in finance. Here the method is much easier to code and can do more things than some others, BUT it is much slower than, say, finite difference methods.
It's not practical to do 20-dimensional finite difference algorithms, but a 20-dimensional pricing run is easy to set up.
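A minimal sketch of Monte Carlo pricing for a single European call under Black-Scholes assumptions (all parameter values are illustrative):

```python
# Monte Carlo price of a European call: simulate terminal prices under
# geometric Brownian motion, average the discounted payoff.
import math
import random

def mc_call_price(s0, strike, rate, vol, maturity, paths=100_000):
    payoff_sum = 0.0
    for _ in range(paths):
        z = random.gauss(0.0, 1.0)
        st = s0 * math.exp((rate - 0.5 * vol**2) * maturity
                           + vol * math.sqrt(maturity) * z)
        payoff_sum += max(st - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / paths

print(mc_call_price(s0=100, strike=105, rate=0.05, vol=0.2, maturity=1.0))
```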
If I understand the question, you are asking for algorithms that are theoretically better but practically worse in all situations. Therefore, one would not expect them to actually be used unless by mistake.
One possible example is universal memoization. Theoretically, all deterministic function calls should be memoized for all possible inputs. That way complex calculations could be replaced by simple table lookups. For a wide range of problems, this technique productively trades time for storage space. But suppose there were a central repository of the results of all possible inputs for all possible functions used by all of humanity's computers. The first time anyone anywhere did a calculation it would be the last time. All subsequent tries would result in a table lookup.
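On a single machine this is just ordinary memoization; a minimal sketch with the standard library (the Fibonacci function is only a stand-in for an expensive deterministic calculation):

```python
# functools.lru_cache stores the results of a deterministic function so that
# repeated calls become dictionary lookups. The "universal repository" above
# is this idea scaled past the point where it stops paying off.
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))   # instant; the naive recursion would take astronomically long
```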
But there are several reasons I can think of for not doing this:
The memory space required to store all results would likely be impossibly large. It seems likely the number of needed bits would exceed the number of particles in the universe. (But even the task of estimating that number is daunting.)
It would be difficult to construct an efficient algorithm for memoizing so huge a problem space.
The cost of communication with the central repository would likely exceed the benefit as the number of clients increases.
I'm sure you can think of other problems.
In fact, this sort of time/space trade-off is incredibly common in practice. Ideally, all data would be stored in L1 cache, but because of size limitations you always need to put some data on disk or (horrors!) tape. Advancing technology reduces some of the pain of these trade-offs, but as I suggested above there are limits.
In response to J.F. Sebastian's comment:
Suppose that instead of a universal memoization repository, we consider a factorial repository. And it won't hold the results for all possible inputs. Rather it will be limited to results from 1! to N!.
Now it's easy to see that any computer that did factorials would benefit from looking up the result rather than doing the calculation. Even for calculating (N+1)!, the lookup would be a huge win since that calculation would reduce to N!(N+1).
Now to make this "better" algorithm worse, we could either increase N or increase the number of computers using the repository.
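A minimal sketch of such a factorial table on one machine (the list layout is illustrative), exploiting exactly that (N+1)! = N!(N+1) step:

```python
# A table of 1! .. N! that callers consult instead of recomputing, extended
# incrementally: each new entry costs one multiplication of the last entry.
factorials = [1]                   # factorials[k] holds (k+1)!

def factorial(n):
    while len(factorials) < n:     # extend the table on demand
        factorials.append(factorials[-1] * (len(factorials) + 1))
    return factorials[n - 1]

print(factorial(10))               # 3628800, looked up (or built once) from the table
```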
But I'm probably not understanding some subtlety of the question. The way I'm thinking of it, I keep coming up with examples that scale well until they don't.
The Coppersmith–Winograd algorithm for square matrix multiplication. Its time complexity is O(n^2.376) vs. O(n^3) for a naive multiplication algorithm, or O(n^2.807) for the Strassen algorithm.
From the Wikipedia article:
However, unlike the Strassen algorithm, it is not used in practice because it only provides an advantage for matrices so large that they cannot be processed by modern hardware (Robinson 2005).
Iterative Deepening
When compared to a trivial depth-first search augmented with alpha-beta pruning, an iterative deepening search used in conjunction with a poor (or non-existent) branch ordering heuristic would result in many more nodes being scanned. However, when a good branch ordering heuristic is used, a significant portion of the tree is eliminated due to the enhanced effect of the alpha-beta pruning. A second advantage, unrelated to time or space complexity, is that a guess of the solution over the problem domain is established early, and that guess gets refined as the search progresses. It is this second advantage that makes it so appealing in many problem domains.
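A minimal sketch of plain iterative deepening depth-first search, without the alpha-beta layer (the toy graph and goal are illustrative):

```python
# Iterative deepening DFS: re-scans the shallow levels on every iteration
# (the "worse" part), but finds the shallowest solution with only O(depth)
# memory instead of BFS's O(breadth^depth).
def depth_limited(graph, node, goal, limit):
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in graph.get(node, []):
        path = depth_limited(graph, child, goal, limit - 1)
        if path is not None:
            return [node] + path
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    for depth in range(max_depth + 1):
        path = depth_limited(graph, start, goal, depth)
        if path is not None:
            return path
    return None

tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"], "E": ["G"]}
print(iterative_deepening(tree, "A", "G"))   # ['A', 'B', 'E', 'G']
```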
The Spaghetti sort is better than any other sorting algorithm in that it is O(n) to set up, O(1) to execute and O(n) to extract the sorted data. It accomplishes all of this in O(n) space complexity. (Overall performance: O(n) in time and space both.) Yet, for some strange (obvious) reason, nobody uses it for anything at all, preferring the far inferior O(n log n) algorithms and their ilk.