views:

782

answers:

4

Hello. I have written 2 implementation of bubble sort algorithm in C and Haskell. Haskell implementation:

module Main where
main = do
    contents <- readFile "./data"
    print "Data loaded. Sorting.."
    let newcontents = bubblesort contents
    writeFile "./data_new_ghc" newcontents
    print "Sorting done"
bubblesort list = sort list [] False
rev  = reverse          -- separated. To see
rev2 = reverse          --  who calls the routine
sort (x1:x2:xs) acc _
    | x1 > x2           = sort (x1:xs) (x2:acc) True
sort (x1:xs) acc flag   = sort xs (x1:acc) flag
sort [] acc True        = sort (rev acc) [] False
sort _ acc _            = rev2 acc

I've compared these two implementations having run both on file with size of 20 KiB.
C implementation took about a second, Haskell — about 1 min 10 sec.
I have also profiled the Haskell application:
Compile for profiling:

C:\Temp> ghc -prof -auto-all -O --make Main

Profile:

C:\Temp> Main.exe +RTS -p

and got these results. This is a pseudocode of the algorithm:

procedure bubbleSort( A : list of sortable items ) defined as:
    do
        swapped := false
        for each i in 0 to length(A) - 2 inclusive do:  
            if A[i] > A[i+1] then  
                swap( A[i], A[i+1] )  
                swapped := true  
            end if  
        end for  
    while swapped  
end procedure  

I wonder if it's possible to make Haskell implementation work faster without changing the algorithm (there are actually a few tricks to make the algorithm work faster, but neither implementations have these optimizations).

+8  A: 

It's probably because bubble sort is an algorithm for arrays, but you're using a linked list: swapping two items in an array (which is what C uses) is O(1) time and requires no extraneous space, but swapping two items in a linked list (which is what Haskell uses) is O(n) time and O(n) space (and this is heap space, not stack space). However, I'm having a little trouble following your code (are you absolutely sure it's the same algorithm?), and it's possible your accumulator deals with the swap's time complexity. However, even if it does, you're allocating a new accumulator list; this will definitely still allocate extra space, and I think this may very well still be one of the reasons for Haskell's worse performance.

Also, why do you have rev and rev2, rather than just using reverse in both places? If it's because you wanted to profile them separately, then you should instead use the SCC ("Set Cost Center") pragma, described in chapter 5 of the GHC User's Guide: sort ({-# SCC "rev" #-} reverse acc) [] False and {-# SCC "rev2" #-} reverse acc. Each cost center is profiled separately; -auto-all is just implicitly adding cost centers around each function. If you have these functions for some other reason, you still probably shouldn't (why do you?), but my explanation is irrelevant :)

Antal S-Z
Your assumptions as to profiling (`rev` and `rev2`) are correct. And you possibly right about extra memory allocation (total alloc = 696,719,076 bytes — info from Main.prof) — this is kind of huge amount for 5 KiB of initial data. So I think there should be a way to optimize elements transposition.
Yasir Arsanukaev
Not on a pure, immutable, linked list. Really, you're not comparing the same algorithm in some sense; swapping behaves sufficiently differently in the two cases that it would make more sense to use an array. And since bubblesort requires destructively modifying the list, you probably want an impure array (possibly Data.Vector, from http://bit.ly/ahCrDd , although I've never used it). This would get you the fairest algorithm-to-algorithm comparison, since the data structures are the same. Different data structures require different algorithms (e.g., quicksort is awful on a linked list).
Antal S-Z
I see. Thanks for the answer but I'm unable to vote so far :]
Yasir Arsanukaev
+6  A: 

Since bubble sort has quite poor memory locality properties, I think it's likely that your implementations are memory bandwidth-bound, though I have not done any testing.

The native Haskell String = [Char] is extremely convenient, but not suitable when performance is the main priority. I forget the exact numbers, but I'm sure a String uses at least 16 bytes per character on a 32-bit system, and roughly double that on a 64-bit system. By contrast I assume your C program uses 1 byte per character. So, one can expect a very approximately 16-fold slowdown from this alone.

You're also emulating a mutable array by means of a pair of linked lists (this is usually called a "zipper"). This is reasonably efficient as long as you are making local modifications around a moving focus, which is true for the most part in bubble sort. However, at the end of each pass, you need to move the focus back to the start (this is rev). This probably accounts for another factor of 2 in the amount of work done in the algorithm.

So, I don't think your benchmarks are very surprising.

If you want to write a fast bubblesort in Haskell, the way to go would be to implement your pseudocode directly in the ST monad using a mutable unboxed array. I don't think you will see a bubblesort in an entirely pure language with a much better constant factor any time soon (though I would love to be proven wrong of course).

Reid Barton
It would be an interesting experiment to modify the Haskell implementation to use a specialized list type strict in the list elements (`data CharList = Empty | Cons !Char Charlist`).
Reid Barton
+4  A: 

You might express the algorithm more directly in Haskell, without the reverse calls:

bubblesort :: (Ord a) => [a] -> [a]
bubblesort []      = []
bubblesort (x0:xs) = case bubble x0 xs of
   (xs', True)  -> bubblesort xs'
   (xs', False) -> xs'
   where bubble x1 (x2:xs) | x1 <= x2  = merge x1 False $ bubble x2 xs
                           | otherwise = merge x2 True  $ bubble x1 xs
         bubble x1 []                  = ([x1], False)
         merge x s (xs, s') = (x:xs, s || s')

Here, the local bubble function does the job of bubbling a value through, and the merge function handles both rebuilding the new, bubbled list, and the swap flag. The case expression in bubblesort is a direct expression in Haskell of "bubble through the list once, and if we swapped anything, do it again".

This version clocks in at about 35% faster than your original.

P.S.: Code and driver Main files can be found here: http://bitbucket.org/mtnviewmark/haskell-playground/src/tip/cafe/

MtnViewMark
I believe the use of vectors (http://haskell.org/haskellwiki/Numeric_Haskell:_A_Vector_Tutorial) could increase the performance to the extent which is close to one achieved by C. But I think it can also break the concept of functional programming since we'd end up with mutable structures. But yes, I'd prefer your version over mine: it works a bit faster.
Yasir Arsanukaev
Yasir: If the semantics you're composing together are simple and precise (and do need need mutation to be described), then the implementation is free to mutate things and it doesn't break the concept of functional programming.GHC converts your pure programs into non-pure programs that mutate all over the place...
Peaker
+2  A: 

I notice that you passed -O instead of -O2. You might get some speed up from -O2.

As others mention, this is not the same algorithm in C and Haskell because the data structures differ--Haskell uses an immutable linked list of Unicode characters. What is your purpose?

  1. The fastest sort of any kind in many languages?
  2. The fastest bubble sort in many languages?
  3. The fastest idiomatic bubble sort in many languages?

I guess (1) is not the case since you are using bubble sort in C as well. If (2), then you should be using STUArray of Word8s, which will give you an algorithm close to the C version (though probably uglier and slower). If (3), I am sort of confused, because bubble sort only really looks idiomatic in BASIC (the old kind, where you capitalise the entire word, not like VisualBasic with only two capitals). You should expect elegance and performance to degrade in direct proportion with the language's similarity to BASIC.

Whatever the case, it would be interesting to see the performance of a immutable linked-list recursive bubble sort of Unicode strings in C.

Nathan Sanders
You all chaps are right here. Possibly Haskell, Erlang or any other functional language are not good at solving this kind of problems which are naturally solved using mutable structures. I think the only elegant and efficient solution here (and probably for most of other problems of this kind) would be to use C for sorting via FFI or some other interface.
Yasir Arsanukaev
If you are willing to give up a little elegance and efficiency for convenience, the mutable data structures in Haskell are not that bad. They just don't look as pretty as the rest of the language.
Nathan Sanders