views:

131

answers:

1

I am reading a blogpost at: http://flyingfrogblog.blogspot.com/2009/07/ocaml-vs-f-burrows-wheeler.html

a simple implementation for Burrow Wheeler compressing algorithm:

# compare two strings str[i..end,0..i-1] and str[j..end,0..j-1]
let cmp (str: _ array) i j =
  let rec cmp i j =
    if i=str.Length then 1 else
      if j=str.Length then -1 else
        let c = compare str.[i] str.[j] in
        if c<>0 then c else
          cmp (i+1) (j+1)
  cmp i j
# sort n strings
let bwt (str: byte array) =
  let n = str.Length
  let a = Array.init n (fun i -> i)
  Array.sortInPlaceWith (cmp str) a
  Array.init n (fun i -> str.[(a.[i] + n - 1) % n])

This implementation seems quite efficient, but actually slow because the sorting Array.sortInPlaceWith (cmp str) a uses a closure function (cmp str), and calls it too many times (O(n log n) on average)!

By making both sorting algorithm inline and the comparison function inline, the speed is fast.

My question is that does inline function mean that the seeming-closure call is not a closure anymore?

Another thing I am thinking is function pointers in C. When we use qsort:

void qsort ( void * base, size_t num, size_t size, int ( * comparator ) ( const void *, const void * ) );

we need to pass in the pointer of a comparison function. It seems that in C's case, the speed does not suck much.

Thanks!

+4  A: 
we need to pass in the pointer of a comparison function. It seems that in C's case, the speed does not suck much. 

It does if you compare it to a C++-implementation of std::sort.

You can think of the C++-version as the inlined code mentioned above. Through the use of templates, you don't need runtime indirection to invoke the function pointer but the compiler can directly insert and optimize the given comparison predicate at compile-time.

In case of your above F#-code, the first implementation will require the compiler to generate a closure object that is invoked through indirection at runtime whereas the inlined version won't need indirection since it's implementation is known at compile-time. (But since .NET's JIT-compiler can even do such optimizations at runtime, I'd never thought the difference would be that big)

Dario
Thanks. I have quite a lot of experience with qsort and std::sort. sort is faster, but only a little. Jon Harrop said his inline version is about 100 times faster. Maybe that's the older version of F# compiler, which is still evolving.
Yin Zhu