views:

258

answers:

7

In binary search algorithm we have two comparisons:

if (key == a[mid]) then found;

else if (key < a[mid]) then binary_search(a[],left,mid-1);
      else binary_search(a[],mid+1,right);

Is there a way by which I can have only one comparison instead of the above two.

--

Thanks

Alok.Kr.

A: 

This is recursive algorithm. First comparison is stop criteria and second actual search, so you cannot remove them.

In first you asking whenever you have already found the element and in second in which part of the array to look for element. So you cannot make those decisions based only on one comparison.

Artem Barger
He can create a function that compares `key` and `mid[a]` and returns the sign of the comparison (an `int`). Comparing this `int` with fixed values (`-1`, `0` or `1`) will cost less (or at worst, as much) as comparing `key` and `mid[a]`, depending on what `key` and `mid[a]` are.
ereOn
+4  A: 

Yes. Just don't eliminate mid from the recursive call.

if ( left == right ) return NULL;
if ( left + 1 == right ) return key == a[left]? &a[left] : NULL;

mid = left + ( right - left / 2 );

if (key < a[mid]) return binary_search(a[],left,mid-1);
else return binary_search(a[],mid,right); // include `mid` in next round

You only need to eliminate half of the set with each recursion to achieve O(logN) performance. You're going above and beyond by eliminating half+1.

If you only use < during recursion, the algorithm will find the least element which is not less than key (but may be greater than key). Finish off by performing a single equality test.

Potatoswatter
Wouldn't you need another comparison (pre recursive call) to determine whether you're down to a single element and decide to perform the final equality test?
Damien_The_Unbeliever
@Damien: yep, I added that as you commented. But that test is not a *key* comparison, it's an index comparison, which presumably has simpler computational complexity.
Potatoswatter
+1  A: 

In assembler, you could:

cmp key,a[mid]
beq found
bge else

So if your compiler is really good at peephole optimizations, it might already do this for you.

Aaron Digulla
The compiler will probably do this because peepholes are something you can take for granted… anyway, the difference is insignificant. The question is important anyway, since it applies to cases where the comparison is an expensive function.
Potatoswatter
If the comparison is expensive, create a function which returns the sign of the comparison so you can examine the sign several times.
Aaron Digulla
That's not an option in C++, where the search is performed by a generic function which requires a `bool` typed functor.
Potatoswatter
If you want to optimize the performance here, you must widen the API. If you can't, then you can't improve the performance. Chose your poison, please :-)
Aaron Digulla
+8  A: 

See:

http://en.wikipedia.org/wiki/Binary_search_algorithm#Single_comparison_per_iteration

Taken from wiki:

   low = 0
   high = N
   while (low < high) {
       mid = low + ((high - low) / 2)
       if (A[mid] < value)
           low = mid + 1;
       else
            //can't be high = mid-1: here A[mid] >= value,
            //so high can't be < mid if A[mid] == value
            high = mid;
   }
   // high == low, using high or low depends on taste
   if ((low < N) && (A[low] == value))
       return low // found
   else
       return -1 // not found

Pros/cons from wiki: "This approach foregoes the possibility of early termination on discovery of a match, thus successful searches have log2(N) iterations instead of an expected log2(N) − 1 iterations. On the other hand, this implementation makes fewer comparisons: log2(N) is less than the expected number of comparisons for the two-test implementations of 1·5(log2(N) − 1), for N greater than eight."

lasseespeholt
I like that the linked article states pros/cons of this technique.
Markus Kull
Yes, I've included it to make the it explicit and clear, thanks...
lasseespeholt
A: 

First things first: do you need to optimize the program? Have you measured to know where you need to do it? Is it in this function?

For primitive types the second comparison is as fast an operation as it gets. The higher cost of the comparison is loading the element into the appropriate register, and that is needed for the first comparison. Once that comparison is executed, the value is already in a register and the second operation takes a single processor instruction plus the possible cost of the branch misprediction.

Assuming integral types, the cost in processor time of the algorithm is most probably dominated by the cost of the recursive calls if the compiler is not being able to perform tail-recursion optimization. If you really need to optimize this, try compiling with all the optimization flags on and analyze the assembler to identify whether the tail-recursion optimization is being applied. If not, manually convert the algorithm from recursive to iterative.

This will have two effects: obscure the code (avoid modifying a clean solution unless you really need to) and it avoid function calls.

If you are speaking of C++, and the type is complex and the overloaded comparison operators are expensive, the fastest boost in performance is implementing a compare method that will return a negative number for less-than, 0 for equal, and a positive number if greater-than. Then precompute the result before the comparisons and then perform integer only checks. That will remove the overall cost of the algorithm to a single processing of the real objects with the expensive comparison and set you back in the original assumption.

David Rodríguez - dribeas
A: 
for (step = 0; step < n; step <<= 1);

for (i = 0; step; step >>= 1)
    if (i + step < n && v[i + step] <= x)
        i += step;
Teodor Pripoae
A: 

Ok, this was a interview question in Adobe, and I was just trying to figure out how to do this.

Now I have got solution to it, so I,m posting

void binary_search (int a[] , int low ,  int high , int key )
{
    int mid = (low+high)/2;

    if (key == a[mid]) {
        printf ("Number Found\n");
        return;
    }

    else {
        int sign = Calc_sign (key-a[mid]);
        low = low*sign + (1-sign)*mid;
        high = mid*sign + (1-sign)*right;
        binary_search (a,low,high,key);
    }
}

int Calc_sign(int a) 
{
     return ( (a & 80000000) >> 31);
}

So in the code there will only be one comparison for checking if the keyvalue is eqaul to the mid element.

--

Thanks

Alok Kr.

Kumar Alok
A possible problem with such math-based comparisons (`key-a[mid]`) is that they could underflow. Plus it only works with numeric types.
UncleBens