Binary Searching

views:

254

answers:

+2 Q:

Hi So, I want to understand more about binary searching, cause I don't really understand. Binary search requires a precondition that an array is sorted. I got that right? It seems like a method should check this precondition and throw an exception if it is not met. But, why is checking the precondition a bad idea?

+7 A:

It's a bad idea because checking the data is sorted takes n steps. The whole search is about log(n) steps.
If you're going to check, you might as well do a linear search.

Kobi 2009-10-17 21:08:28

IIRC, the SO policy is a bit unclear (in the sense that there is no policy); however, I think we should try to not give the literal answer to a homework question, but perhaps reformulate the question to make the poster think in the right direction (after the poster admits it actually is homework).

Martin v. Löwis 2009-10-17 21:14:22

@Martin see http://meta.stackoverflow.com/questions/10811/homework-on-stackoverflow .. I'm fed up with this paranoic homework thing. The question doesn't ask for premade solutions, indeed it it's exactly what SO is made for; getting help for problems you can't solve alone.

svens 2009-10-17 21:34:41

Agree with svens - this is the type of answer that any lecturer or classmate would give anyways.

carl 2009-10-17 22:22:55

+1 A:

Binary search assumes that input data are sorted. So here you are correct.

Now in general checking if data are sorted some time. So executing this before every search would make searching really inefficient.

More details.

Assume that 'n' is amount of your data.

Binary searching needs O(log(n)) operation in worst case to find an element. Making sure that data are sorted requires O(n) operations.

So if we will check precondition every time for very big n we will start spending most of time on checking precondition than doing actual search.

And it is not so hard to say when you will see such an effect. I just calculated how much time you will spend on pre-cheking vs. actual searching

For 1 element you spend no time searching.
For 2 elements you spend 50% on searching.
For 5 elements you spend 46% on searching
For 20 elements you spend 22% on searching.
For 100 elements you spend 7% on searching.

And so on. In every case rest on time is spend on precondition check.

Michal Sznajder 2009-10-17 21:09:45

+3 A:

Yep, a binary search involves 0(log n) steps and verifying the whole sequence is sorted involves 0(n) steps. From my point of view it is good to verify it in DEBUG mode, not during RELEASE.

yves Baumes 2009-10-17 21:11:08

Complexity is O(log(n)), but needed steps is Floor(Log_2(n))+1

ralu 2009-10-17 22:55:47

+6 A:

The whole point of a binary search is that, since the data is already sorted, you can quickly locate the information you want.

Take the phone book, which is sorted by last name.

How do you find someone in the phone book? You open it up to a page which you assume will be close to what you want, and then start flipping pages.

But you don't flip one page each time, if you missed by a lot, you flip a bunch of pages, and then finally start flipping one at a time, until finally you start looking at a single page.

This is what binary search does. Since the data is sorted, it knows it can skip a lot and do another look, and it'll focus in on the information you want.

A binary search does 1 comparison for every doubled number of items. So a 1024 element collection would require around 10 comparisons, at the most, to find your information, or at least figure out that it's not there.

If you, before running the actual binary search, does a full run-through to check if the data is sorted, you might as well just do a scan for the information. A full run-through + the binary search will require N + log2 N operations, so for 1024 elements, it would require around 1034 comparisons, whereas a simple scan for the information will on average require half, which is 512.

So if you can't guarantee that the data is sorted, you should not use binary search, as it will be outperformed by a simple scan.

Edit: I will say this though, you could add a debug-only code step to verify this, to catch bugs in the code that is supposed to prepare the data for binary search, but know that, due to what I've written above, this will make the total running time a lot higher, so depending on what you want to do with this check, you might or might not want to add it. But it should not be present in release code.

Lasse V. Karlsen 2009-10-17 21:11:27

In addition to what everyone else said about running time (O(n) to check all the items, vs. O(log(n)) to run binary search.)

I think you are misunderstanding the idea of a pre-condition. Pre-conditions and post-conditions are a contract. If your precondition is true, and you run your algorithm, then your post condition will be true. If your precondition is false, then you make no guarantees on the post condition.

So basically, binary search says this: If the data you give me is already sorted, then I can tell you the position of a specific piece of data, or if it is not present, by performing approximately log(n) checks. If the data is not sorted, I make no guarantees about my answer.

The work that takes you from your pre-condition to your post-condition if your algorithm. In this case, binary search.

Rob Rolnick 2009-10-17 21:17:10

The original question presupposes that you are using a binary search on a collection of data. That is not always the case. Many times you are just trying to compute a number in some interval.

Suppose you are trying to calculate the optimal speed setting for a fan. For some reason you can't find a closed form expression, so you simulate the air flow at different speed settings.

Assuming that the fan can run at any speed from 0RPM to 5000RPM, you don't actually have to generate a list of possible speeds. You just find the average of the previous minimum and maximum at each step of the binary search.

Jørgen Fogh 2009-10-17 22:19:38

ansaurus

tags:

views:

answers:

Binary Searching

More details.

related questions