tags:

views:

201

answers:

7

I am writing a document and was unsure of the following:

  1. I am going to compare two algorithms that can operate on the same structure, but where we can't say one will always be faster than the other (and I will define circumstances in which A is better than B and vice versa). For this I was going to use quicksort and bubble sort. Is this a good choice?

  2. Pick two algorithms that work on large datasets and explain why one is significantly better than the other. For this I was thinking of linear search and binary chop (binary search).

What are your opinions on the algorithms I have chosen to illustrate these points? Do they seem appropriate?

Thanks, everyone!

+1  A: 

No, because quicksort is demonstrably better than bubble sort in all but a very small set of circumstances involving extremely small datasets.

Quicksort is an O(n log n) algorithm. Bubble sort is an O(n^2) algorithm.

Pick one of the other O(n log n) sorts, like merge sort or heap sort.

Or compare bubble sort to selection sort or insertion sort, both O(n^2).
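If you want to see the difference in actual numbers, here is a rough Python sketch (my own illustration, not part of the original answer) that counts comparisons for an early-exit bubble sort versus a naive leftmost-pivot quicksort. Note it also shows the caveat discussed in the comments below: on already-sorted input the naive quicksort hits its worst case while bubble sort finishes in one pass.

```python
def bubble_sort_steps(a):
    """Early-exit bubble sort; returns the number of comparisons made."""
    a = list(a)
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:
            break  # already sorted: one O(n) pass and we're done
    return comparisons

def quicksort_steps(a):
    """Naive quicksort with leftmost pivot; returns the number of comparisons."""
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    comparisons = len(rest)  # each remaining element is compared to the pivot
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return comparisons + quicksort_steps(left) + quicksort_steps(right)

sorted_data = list(range(100))
print(bubble_sort_steps(sorted_data))   # 99: a single pass suffices
print(quicksort_steps(sorted_data))     # 4950: the O(n^2) worst case
```

On random data the picture reverses, which is exactly why this pair is a poor fit for "neither is always faster" unless you lean on the sorted-input corner case.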

cletus
Insertion sort is also O(n^2), but it is very fast compared to bubble sort.
Yin Zhu
And quicksort is O(n^2) on pre-ordered data if naively implemented
Michael Borgwardt
Oops, getting my sorts mixed up.
cletus
So maybe quicksort and merge sort for 1?
tommy
Would love to know why this got downvoted twice...
cletus
Quicksort is O(n log n) on average, but is O(n^2) worst case. Bubble sort is O(n^2) both on average and worst case.
Jesper
I downvoted it because of the false statement, "quicksort is demonstrably better than a bubble sort in all but a very few set of circumstances involving extremely small datasets". Every quicksort has an O(n^2) worst case, on any size of data, although it might be very difficult to construct, and O(n log n) best case. Every bubble sort has an O(n) best case, on any size of data. Then I saw your "oops" comment and removed the downvote. Now I see that although you're aware of the error, you haven't corrected it. So I'm thinking of putting my downvote back.
Steve Jessop
@Steve: thanks for the feedback, I appreciate that. As for worst/expected case, sure they're both O(n^2), but combinatorially speaking the worst case is *highly* unlikely even with small data sets. Put it this way: if you have 100 elements you have 100! orderings. How many of those will result in O(n^2) for quicksort?
cletus
Sure, and choosing the pivot at random (or median-of-several-random) means that even a malicious caller can't construct a worst-case data set. But the assignment is "two algorithms that can perform on the same structure but we can't say one will always be faster than the other". The existence of a case where bubblesort is better than quicksort is obvious (sorted and some near-sorted data), as is the existence of a case where quicksort is faster (almost anything else). Even on large data sets. The fact that bubblesort is worse almost always is true, but irrelevant to the question ;-)
Steve Jessop
Just FWIW, Bubblesort originally became known because it's provably the best sort algorithm possible for a specific class of machine. In fact, one of the first real computer science papers ever published proved exactly this.
Jerry Coffin
Specifically, one which didn't have constant-time access to the entities being sorted. Normally with comparison sorts and big-O, we're counting the number of comparisons, and assuming that accessing memory is negligible...
Steve Jessop
+2  A: 

1)

Comparing quicksort and bubble sort is probably not a good idea; bubble sort may not beat quicksort even on small cases.

At least try quicksort and insertion sort.

I would also suggest Prim's and Kruskal's minimum spanning tree algorithms, to show the strengths and weaknesses of the two algorithms on dense and sparse graphs.

2) comparing binary search and linear search is a good example here.
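To put numbers on (2), here is a small Python sketch (my own illustration, not part of the original answer) counting probes for each kind of search over a million sorted elements:

```python
def linear_search_steps(data, target):
    """Linear search: one comparison per element until the target is found."""
    for steps, value in enumerate(data, start=1):
        if value == target:
            return steps
    return len(data)

def binary_search_steps(data, target):
    """Binary chop on sorted data: halve the candidate range each step."""
    lo, hi, steps = 0, len(data), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if data[mid] < target:
            lo = mid + 1
        elif data[mid] > target:
            hi = mid
        else:
            return steps
    return steps

data = list(range(1_000_000))
print(linear_search_steps(data, 999_999))  # 1,000,000 probes
print(binary_search_steps(data, 999_999))  # at most 20 probes (2^20 > 10^6)
```

A million probes versus about twenty: the gap is so lopsided that, as the next answer notes, there is not much left to discuss.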

Yin Zhu
"bubblesort even may not beat quicksort on small cases". But if you choose an utterly naive bubblesort (is there any other kind?) and an utterly naive quicksort (which selects the leftmost element as the pivot), then there is a simple case where bubblesort is faster - already-sorted data. So I think they do satisfy the requirements of the assignment.
Steve Jessop
@Steve I was considering average case.
Yin Zhu
I don't think the assignment is about the average case (at least, part 1 of it isn't). It's about one not *always* being faster than the other.
Steve Jessop
+1  A: 

This depends hugely on the exact assignment. Both cases you have presented seem too obvious.

The difference between a linear search and a binary search is so large and obvious (on large datasets) that it does not require any discussion at all... unless this is a very basic course.

Marek
A: 

Study quicksort and bubble sort and find an answer - it might be that one is better than the other under certain circumstances.

Wikipedia has a lot of useful information on sorting algorithms, read and study it carefully.

The most important aspects to compare algorithms on are computational complexity and memory use; see also analysis of algorithms.

Jesper
A: 

The common way to compare algorithms is by using the Big-O-notation (Landau Notation) to describe the limiting behaviour of a function.

Just in (very) brief:

Assume you have a function that takes n elements (for a sorting algorithm: the size of your collection). Now we look at the algorithm and try to find out the maximum number of 'steps' (as a function of n) it needs to terminate.

Some algorithms terminate in constant time (a hashtable read), which is O(1). If you have an algorithm that has to 'look' at each element once, it is O(n) (linear); if you need to look at each element twice, it's O(2n), which is still linear and simplifies to O(n), since Big-O drops constant factors. The link I provided has more examples.
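A tiny Python sketch (my own, purely illustrative) of this step-counting idea, showing that one pass and two passes both grow linearly with n:

```python
def steps_one_pass(data):
    """Looks at each element once: n steps, i.e. O(n)."""
    steps = 0
    for _ in data:
        steps += 1
    return steps

def steps_two_passes(data):
    """Looks at each element twice: 2n steps, still O(n),
    because Big-O discards the constant factor 2."""
    return steps_one_pass(data) + steps_one_pass(data)

for n in (10, 100, 1000):
    data = list(range(n))
    # doubling n doubles both step counts: both functions are linear
    print(n, steps_one_pass(data), steps_two_passes(data))
```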

For common algorithms, the O-notations are well known. But it's not too difficult to analyze an existing algorithm yourself. (The real challenge is to invent an algorithm for a given problem that is faster than the ones already known ;))

Andreas_D
+1  A: 

Hi

Since this must be homework or something very similar, I think you would be best advised to choose your own algorithms to compare, as you already have. Your marks will depend much more on the quality of your analysis than on your selection of algorithms. There is already some good advice here on how to set about your task; I won't repeat or add to it.

Regards

Mark

High Performance Mark
A: 

For (1) I'd suggest comparing a tree map (O(log N) access) with a hashmap (O(1) access).

The basic "O" seems to clearly favor the hashmap, but then:

  • the tree is bounded by O(log N), whereas the worst case for the hashmap is O(N) - when everything lands in one bucket
  • the tree grows organically to fit any number of elements; the hashmap gets inefficient with a bad nItems / nBuckets ratio, in which case you have to rehash

There are a few other things you could figure out... :)
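To make the one-bucket worst case concrete, here is a toy Python sketch of my own (a minimal separate-chaining hash map; real hashmap implementations differ in detail) where a pathological hash function sends every key to the same bucket:

```python
class ChainedHashMap:
    """Toy hash map with separate chaining, to expose bucket behaviour."""

    def __init__(self, n_buckets=8, hash_fn=hash):
        self.buckets = [[] for _ in range(n_buckets)]
        self.hash_fn = hash_fn

    def put(self, key, value):
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        # lookup cost is proportional to the length of the probed bucket
        bucket = self.buckets[self.hash_fn(key) % len(self.buckets)]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)

    def longest_bucket(self):
        return max(len(b) for b in self.buckets)

good = ChainedHashMap()                        # uses Python's built-in hash
bad = ChainedHashMap(hash_fn=lambda k: 0)      # everything lands in bucket 0
for i in range(100):
    good.put(i, i)
    bad.put(i, i)
print(good.longest_bucket())  # short chains: lookups stay near O(1)
print(bad.longest_bucket())   # 100: every lookup degrades to O(N)
```

The tree map never has this failure mode, which is exactly the trade-off the bullet points above describe.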

peterchen