ansaurus

Question

Is quick sort followed by binary search faster than linear search?

Answer 1

+6 A:

No, because Quick Sort is slower than linear, and you add some more.

O(n*log(n)) + O(log(n)) = O(n*log(n)) > O(n)

Edit: Not precisely correct as pointed out, as worst case of Quick Sort is even slower with O(n^2). n*log(n) is best case and average case, which provides a lower bound is already slower than linear. The correct symbol might though be Θ instead of O.

Eiko 2010-07-04 20:02:09

Is it valid to use Big O with comparison operators like that?

Nabb 2010-07-04 20:16:28

Hmmm... Not sure I'd recommend that in an exam, but it's easy to read and understand.

Eiko 2010-07-04 20:23:06

@Nabb: Sure, but they don't act quite like ordinary numbers. Recall that big-O is in the asymptotic limit of large n, so `O(log n) << O(n) << O(n*log n)` and you can simple neglect non-leading terms.

dmckee 2010-07-04 20:25:33

@Nabb: It's fine as long as we're clear on what it means. It means that given two fixed functions f(n) and g(n) which are ϴ(n log n) and ϴ(n) respectively (not just O, which is an upper bound), then for *sufficiently large n*, f(n) > g(n). Of course, for smaller n it may be the other way around, and if the bounds are only upper bounds, then the function that is O(n log n) may actually be much smaller (in the case of sorting, we know it's not).

ShreevatsaR 2010-07-04 20:35:25

Quick sort is not `O(n*log(n))`, it's `O(n^2)`.

R.. 2010-07-05 05:39:43

Indeed, worst case is O(n^2). Best and average is Θ(n*log(n)). I edit to use Θ instead of O.

Eiko 2010-07-05 06:09:48

Answer 2

+20 A:

It depends how often you want to search after the sort - if only once, then a linear search will probably be faster. Of course, an even better bet is normally (but not always) to maintain things in sorted order using something like set or a map.

anon 2010-07-04 20:02:29

This is the best answer.

DeadMG 2010-07-04 20:04:48

Depends on the size of the container. For small containers, a linear search is faster than a binary search.

Thomas Matthews 2010-07-04 20:06:16

@Thomas It depends on lots of things.

anon 2010-07-04 20:09:05

+1 - everyone knows (or should know) that O(n*log(n)) > O(n)... the real question is _how many linear searches do you have to do before the benefit of sorting kicks in?_

D.Shawley 2010-07-04 20:09:56

How is a set sorted in any way?

Eiko 2010-07-04 20:21:03

@Eiko In the same way that any binary tree is sorted.

anon 2010-07-04 20:23:44

@Eiko: Most languages provide one or more implementations of set. They are either implemented as an ordered data structure (e.g. BST or Skiplist) or an unordered hashtable. Both options are equally applicable in this case IMO.

MAK 2010-07-04 20:30:16

@MAK - The question tags list C and C++. C has no standard library called "set". C++ has one, and it is an ordered container. A C++ unordered set is called "unordered_set" - and even that's TR1, which isn't strictly standard IIRC.

Steve314 2010-07-04 20:43:27

why "probably" ? a sort must at least read the whole list ...

6502 2010-07-04 21:28:18

@6502 Because it depends on the specific implementations of the sort, the binary chop search and the sequential search.

anon 2010-07-04 21:32:16

@Neil - hard to imagine a sane linear search thats slower than a sane sort. Even in the best special case, where the items are already sorted and the sort algorithm handles that well, a check is still needed for every item to confirm that they're already sorted.

Steve314 2010-07-04 21:47:08

very improbable: linear search starts from the end up, destroying sequentiality. sort wins by cache hits. though this still doesn't warrant the use of "probably" :)

aib 2010-07-04 23:22:59

@Steve314: I don't see C or C++ in the question tags.

MAK 2010-07-05 12:36:18

@MAK That's because it was edited by a nitwit - I've rolled it back. And when making comments like this you should always check the edit history first.

anon 2010-07-05 12:41:07

@MAK - either I'm finally going completely insane (previously, I was only sure of "mildly insane"), or the tags have been changed. FWIW, I don't remember the search, efficiency, speed or performance-comparison tags being there - only the algorithm tag and (as I said) C and C++. Maybe the author (or someone else) realised the implementation language isn't really relevant to an algorithm issue, and edited?

Steve314 2010-07-05 12:42:59

Aaaarrrggghhh!!! - everything keeps changing! It's worse than a requirements doc!

Steve314 2010-07-05 12:51:39

@steve314, @Neil Butterworth: Yes, I see that the tags did actually include c++ and c when I posted my first comment, so you are right.

MAK 2010-07-05 13:05:13

@aib - a linear search in an array is near perfect for cache. Possibly the most cache-friendly sort is the bubble-sort (yuck!), and quicksort is IIRC fairly good, but not as good as a linear search. Perhaps you're thinking of a linear search of a linked list? (if sort-then-binary-search is a viable option this seems unlikely, though the question doesn't specify the form of the input).

Steve314 2010-07-05 16:33:26

@Steve314 I was thinking of a linear search starting from the end going to the beginning of the array - it's still linear, right? :)

aib 2010-07-06 09:48:44

@aib - yes, and just as near-perfect for cache. The cache doesn't care whether you step forwards or backwards, as long as each step is normally within the same cache page. If that wasn't true, quicksort would be very bad for cache - each partition steps two pointers, one moving forwards through the array, the other backwards, both meeting at the pivot. Since forwards vs. backwards isn't an issue for the cache, the only issue is you typically have more relevant cache pages in use at any time than for linear search through an array.

Steve314 2010-07-06 18:46:27

Ah right, I was thinking more of a "read the following couple of bytes" cache - not in terms of pages. Still, it was a bit tongue-in-cheek; you cannot beat a linear search.

aib 2010-07-07 12:12:11

Answer 3

A:

It depends on the input. Linear search is usually faster for small number of elements.

frunsi 2010-07-04 20:02:43

A _single_ linear search is *always* faster than a single sort followed by a single binary search. However, there is a point at which a single sort followed by multiple binary searches will be faster than multiple linear searches.

D.Shawley 2010-07-04 20:12:30

Answer 4

+2 A:

Linear search is O(n), sorting is O(n log n), binary search is O(log n). Thus, if you expect to search more than log n times, sorting is better. (Of course this is just theoretical performance, on practice you're also interested in constants, esp. for not-too-large n)

Roman Cheplyaka 2010-07-04 20:05:44

Answer 5

+2 A:

Forgetting the binary search for a minute:

Quick sort best case: O(n log n)

Quick sort worst case: O(n^2)

Linear search worst case: n

So regardless of what you do after quick sort, you'd already have found the item if you'd done a linear search. However, this only holds for the first search. Say you had more than one search to do. If you retain the (Quick)sorted list, then all following searches have a worst case of O (log n), so over several searches (particularly on larger datasets), sorting and searching would be wise. As always with programming and software design in general, there is never a 'one size fits all', so thinking about what patterns and implementations are most appropriate to your particular domain in essential.

Noel M 2010-07-04 20:08:02

ansaurus

tags:

views:

answers:

Is quick sort followed by binary search faster than linear search?

related questions