views: 788 · answers: 11

Hello,

I recently heard about ternary search, in which we divide a sorted array into 3 parts and compare. Each step takes two comparisons but reduces the search space to n/3. Why don't people use this much?

+10  A: 

A ternary search will still give you the same asymptotic complexity O(log N) search time, and adds complexity to the implementation.

The same argument applies to a quad search or any other higher order.

Akusete
@Nikita: log2 N = log3 N / log3 2 = constant * log3 N = O(log N). There is only ever a constant-factor difference between any two logarithmic orders of complexity.
Akusete
+1  A: 

Here's some random experimental evidence that I haven't vetted at all showing that it's slower than binary search.

Steven Schlansker
+13  A: 

Actually, people do use k-ary trees for arbitrary k.

This is, however, a tradeoff.

To find an element in a k-ary tree, you need around k*ln(N)/ln(k) operations (remember the change-of-base formula). The larger your k is, the more overall operations you need.

The logical extension of what you are saying is "why don't people use an N-ary tree for N data elements?". Which, of course, would be an array.
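As a rough illustration (my own sketch, using the answer's k·ln(N)/ln(k) estimate, not code from the answer), you can tabulate the cost for a few values of k:

```python
import math

def kary_cost(n, k):
    """Approximate comparison count for a k-ary search over n items,
    per the k * ln(N) / ln(k) estimate: log_k(n) levels, ~k comparisons each."""
    return k * math.log(n) / math.log(k)

# k / ln(k) is minimized near e, so k = 3 looks best by this crude count;
# larger k only gets worse:
for k in (2, 3, 4, 8, 16):
    print(k, round(kary_cost(10**6, k), 1))
```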

Borealid
B-trees are k-ary trees that sit between arrays and binary trees, and are commonly used; there's certainly a purpose for k-ary trees that are larger than order 3.
Dean J
+1  A: 

"Terinary" (ternary?) search is more efficient in the best case, which would involve searching for the first element (or perhaps the last, depending on which comparison you do first). For elements farther from the end you're checking first, while two comparisons would narrow the array by 2/3 each time, the same two comparisons with binary search would narrow the search space by 3/4.

Add to that, binary search is simpler: you just compare and take one half or the other, rather than compare, take the first third if less, else compare again, take the second third if less, else take the last third.
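To make that concrete, here is a minimal sketch of both (my own code, not from the answer); note how many more branches the ternary version needs per step:

```python
def binary_search(a, x):
    """Classic binary search: one split decision per halving step."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == x:
            return mid
        elif x < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1

def ternary_search(a, x):
    """Ternary search: two pivots per step, so up to two extra comparisons
    just to decide which third to keep."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if a[m1] == x:
            return m1
        if a[m2] == x:
            return m2
        if x < a[m1]:
            hi = m1 - 1
        elif x > a[m2]:
            lo = m2 + 1
        else:
            lo, hi = m1 + 1, m2 - 1
    return -1
```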

cHao
+6  A: 

Searching 1 billion (a US billion - 1,000,000,000) sorted items would take about 30 compares with binary search and about 19 'ternary compares' with a ternary search - not a huge advantage. And note that each 'ternary compare' might involve 2 actual comparisons.

Michael Burr
The fact that a 'k-ary compare' in fact consists in 'k' key comparisons is possibly the most important factor to be mentioned in answering this question. +1
Alceu Costa
+3  A: 

What makes you think Ternary search should be faster?

An approximate calculation:

Number of comparisons in ternary search = 2 * log(n)/log(3) ≈ 4.19 * log(n).
Number of comparisons in binary search = log(n)/log(2) ≈ 3.32 * log(n).

So it looks like ternary search is worse in the worst case.
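Those constants are easy to check (base-10 logs, as the answer implicitly uses; this snippet is mine, not the answerer's):

```python
import math

# Worst case: two comparisons per step, log3(n) steps, vs. one per step, log2(n) steps.
ternary_const = 2 / math.log10(3)
binary_const = 1 / math.log10(2)
print(round(ternary_const, 2))  # 4.19
print(round(binary_const, 2))   # 3.32
```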

Moron
@Moron: Would it be `2 * log(n)/log3`? I do not always need to make 2 comparisons at a level. For example - if I have `[..a..b..]`, and I do `x < a` is true, then there is no need to compare with `b`.
Lazer
@Lazer: He did say 'worst case'. Assuming a balanced tree and an even distribution of lookups (a shaky assumption), you would need about 1.67 comparisons per node on average: one third of the time the first comparison settles it, otherwise you need a second.
Akusete
+4  A: 

The only way a ternary search can be faster than a binary search is if a 3-way partition determination can be done for less than about 1.58 (i.e. log2 3) times the cost of a 2-way comparison. If the items are stored in a sorted array, the 3-way determination will on average be 1.66 times as expensive as a 2-way determination. If information is stored in a tree, however, where the cost to fetch a node is high relative to the cost of comparing, and where cache locality means the cost of fetching a pair of related data is not much worse than the cost of fetching a single datum, a ternary or n-way tree may improve efficiency greatly.

supercat
+2  A: 

Also, note that this sequence generalizes to linear search if we go on

Binary search
Ternary search
...
...
n-ary search ≡ linear search

So, in an n-ary search, we will have only one "compare", which might take up to n actual comparisons.
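Taken to that extreme, the n-ary "single compare" degenerates into a plain scan (a hypothetical sketch of my own):

```python
def nary_search(a, x):
    """n-ary search with k = n: every element is a pivot, so the one
    'n-way compare' is n equality tests -- i.e., linear search."""
    for i, v in enumerate(a):
        if v == x:
            return i
    return -1
```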

Lazer
Except there's some very happy middle ground between those two extremes. See my answer, maybe?
Dean J
@Dean J: Yes, you are right. If there were built-in support for an atomic ternary compare, ternary search would have been the choice. But there are a lot of limitations in implementing ternary logic at the hardware level itself (one reason why it isn't popular). I remember reading somewhere that `e` is theoretically the most economical base, though it is still a problem to realize in hardware. So, until we have machines that support ternary logic at the hardware level, binary is the way to go.
Lazer
@Lazer: The interesting one is that binary search is optimal for CPU cycles, but not optimal if the tree is large enough that some of it is paged to disk; in that case, a n-ary tree with a *huge* branching factor actually works much better.
Dean J
A: 

You may have heard ternary search being used in those riddles that involve weighing things on scales. Those scales can return 3 answers: left is lighter, both are the same, or left is heavier. So in a ternary search, it only takes 1 comparison. However, computers use boolean logic, which only has 2 answers. To do the ternary search, you'd actually have to do 2 comparisons instead of 1. I guess there are some cases where this is still faster as earlier posters mentioned, but you can see that ternary search isn't always better, and it's more confusing and less natural to implement on a computer.

muddybruin
A: 

Theoretically, the minimum of k/ln(k) is achieved at e, and since 3 is closer to e than 2 is, ternary requires fewer comparisons. You can check that 3/ln(3) ≈ 2.73 while 2/ln(2) ≈ 2.89. The reason binary search could still be faster is that its code has fewer branches and will run faster on modern CPUs.
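A quick check of that claim (my own snippet, under the answer's cost model of roughly (k/ln k)·ln n comparisons):

```python
import math

# k / ln(k) is minimized at k = e ~ 2.718, and 3 beats 2 (and 4 ties 2):
for k in (2, 3, 4):
    print(k, round(k / math.log(k), 2))  # prints: 2 2.89 / 3 2.73 / 4 2.89
```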

Daniel Velkov
+2  A: 

Wow. The top voted answers miss the boat on this one, I think.

Your CPU doesn't support ternary logic as a single operation; it breaks ternary logic into several steps of binary logic. The fastest code for the CPU is binary logic. If chips that supported ternary logic as a single operation were common, you'd be right.

B-Trees can have multiple branches at each node; an order-3 B-tree is effectively ternary logic. Each step down the tree takes two comparisons instead of one, which will probably make it slower in CPU time.

B-Trees, however, are pretty common. If you assume that every node in the tree will be stored somewhere separately on disk, you're going to spend most of your time reading from disk... and the CPU won't be a bottleneck, but the disk will be. So you take a B-tree with 100,000 children per node, or whatever else will barely fit into one block of memory. B-trees with that kind of branching factor would rarely be more than three nodes high, and you'd only have three disk reads - three stops at a bottleneck - to search an enormous, enormous dataset.
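The disk-read arithmetic can be sketched like this (the 100,000-way branching factor is the answer's example; the rest is my own illustration):

```python
import math

def levels(n, branching):
    """Tree height needed to index n items -- roughly one disk read per
    level if each node fills one disk block."""
    return math.ceil(math.log(n) / math.log(branching))

print(levels(10**12, 2))        # binary tree: 40 reads
print(levels(10**12, 100_000))  # wide B-tree: 3 reads
```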

Reviewing:

  • Ternary trees aren't supported by hardware, so they run less quickly.
  • B-trees with orders much, much higher than 3 are common for disk optimization of large datasets; once you've gone past 2, it pays to go much higher than 3.
Dean J