The following two Haskell programs for computing the n'th term of the Fibonacci sequence have greatly different performance characteristics:

fib1 n =
  case n of
    0 -> 1
    1 -> 1
    x -> (fib1 (x-1)) + (fib1 (x-2))

fib2 n = fibArr !! n where
  fibArr = 1:1:[a + b | (a, b) <- zip fibArr (tail fibArr)]

They are very close to mathematically identical, but fib2 uses the list notation to memoize its intermediate results, while fib1 has explicit recursion. Despite the potential for the intermediate results to be cached in fib1, the execution time gets to be a problem even for fib1 25, suggesting that the recursive steps are always evaluated. Does referential transparency contribute anything to Haskell's performance? How can I know ahead of time if it will or won't?

This is just an example of the sort of thing I'm worried about. I'd like to hear any thoughts about overcoming the difficulty inherent in reasoning about the performance of a lazily-executed, functional programming language.


Summary: I'm accepting 3lectrologos's answer, because the point that you don't reason so much about the language's performance as about your compiler's optimizations seems to be extremely important in Haskell - more so than in any other language I'm familiar with. I'm inclined to say that the importance of the compiler is the factor that differentiates reasoning about performance in lazy, functional languages from reasoning about the performance of any other type.


Addendum: Anyone happening on this question may want to look at the slides from Johan Tibell's talk about high performance Haskell.

+9  A: 

In your particular Fibonacci example, it's not very hard to see why the second one should run faster (although you haven't specified what f2 is).

It's mainly an algorithmic issue:

  • fib1 implements the purely recursive algorithm and (as far as I know) Haskell has no mechanism for "implicit memoization".
  • fib2 uses explicit memoization (using the fibArr list to store previously computed values).

In general, it's much harder to make performance assumptions for a lazy language like Haskell than for an eager one. Nevertheless, if you understand the underlying mechanisms (especially for laziness) and gather some experience, you will be able to make some "predictions" about performance.

Referential transparency increases (potentially) performance in (at least) two ways:

  • First, you (as a programmer) can be sure that two calls to the same function with the same arguments will always return the same result, so you can exploit this in various places to gain performance.
  • Second (and more importantly), the Haskell compiler can be sure of the above fact, and this enables many optimizations that can't be applied in impure languages (if you've ever written a compiler or have any experience with compiler optimizations, you are probably aware of how important this is); a small sketch follows this list.
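
As a rough illustration of that second point (my own sketch, not from the answer; `expensive` is a made-up stand-in for any costly pure function), purity is what licenses rewriting two identical calls into one shared computation:

-- Illustrative sketch, not from the original answer.
expensive :: Int -> Int
expensive x = sum [1 .. x * 100000]   -- stands in for any costly pure function

-- Written naively, the work may be done twice:
naive :: Int -> Int
naive x = expensive x + expensive x

-- Same result, computed once and shared. GHC may perform this rewrite itself
-- (common-subexpression elimination), precisely because referential
-- transparency guarantees that both calls agree.
shared :: Int -> Int
shared x = let r = expensive x in r + r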

If you want to read more about the reasoning behind the design choices (laziness, pureness) of Haskell, I'd suggest reading this.

3lectrologos
the f2 thing was a bug - I entered the question from a different computer than the one I have Haskell installed on, allowing room for the error to creep in.
Aidan Cully
It's posts like these that make me want to read that Dragon book to understand exactly what all the fuss about Haskell's compiler is all about.
wheaties
I'm starting to work through the History document you posted. I understand the point that understanding the underlying mechanisms of the language is important in predicting performance... That's the question, though - what can be assumed about those underlying mechanisms? Do specifications exist? Or do we assume that these things can change? For example, it's easy to imagine a Haskell implementation that doesn't memoize anything. Would that be a valid implementation? Or can I assume that lists are memoized in all cases?
Aidan Cully
I imagine this is covered in the books, but it occurs to me that Haskell, or any purely functional language, *could* have implicit memoization. The hard part, as a compiler writer, would be knowing when to use it.
Dan
The Haskell specifications refer only to mechanisms that are implementation independent (e.g. referential transparency). Of course a compiler could use that to implement an implicit memoization mechanism, but this wouldn't be specified in a Haskell spec. One should read the specific compiler spec to really learn about the implementation details.
3lectrologos
In a compiled language like Haskell, it's very difficult to use dynamic optimization techniques (always memoizing results is clearly not an option), because you usually don't have enough information at compile time (it's also one of the reasons behind the popularity of virtual machines, which use dynamic profiling).
3lectrologos
So, basically, don't bother trying to predict Haskell's performance, beyond the worst possible case. Rather, concentrate on predicting the behavior of your chosen compiler.
Aidan Cully
Haskell may be "special" because of the wide variety of optimizations and pessimizations available to the compiler, but this lesson is true of all languages.
ephemient
+4  A: 

Reasoning about performance is generally hard in Haskell and lazy languages in general, although not impossible. Some techniques are covered in Chris Okasaki's Purely Functional Data Structures (also available online in a previous version).

Another way to ensure performance is to fix the evaluation order, either using annotations or continuation passing style. That way you get to control when things are evaluated.

In your example you might calculate the numbers "bottom up" and pass the previous two numbers along to each iteration:

fib n = fib_iter(1,1,n)
    where
      fib_iter(a,b,0) = a
      fib_iter(a,b,1) = a
      fib_iter(a,b,n) = fib_iter(a+b,a,n-1)

This results in a linear time algorithm.
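
Since the answer also mentions annotations as a way to fix the evaluation order, here is a sketch (my addition, not part of the original answer) of the same bottom-up loop with the accumulator forced via seq at each step, so no chain of suspended additions builds up for large n:

-- Illustrative sketch, not the answerer's code: same recursion, but the new
-- accumulator is forced with `seq` before the tail call.
fibStrict n = go 1 1 n
    where
      go a _ 0 = a
      go a _ 1 = a
      go a b k = let a' = a + b
                 in a' `seq` go a' a (k - 1)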

Whenever you have a dynamic programming algorithm where each result relies on the N previous results, you can use this technique. Otherwise you might have to use an array or something completely different.

Jørgen Fogh
`fib_iter(1,1,n)` -> `fib_iter 1 1 n` (I guess it would work with extra tupling, but it would be pointless and definitely nonidiomatic.)
Michał Marczyk
I had been intending to read Okasaki's book. I didn't know that a) it was his thesis, and b) that was on-line. It certainly looks promising ("the notion of amortization also provides the first practical techniques for analyzing the time requirements of non-trivial lazy programs"). Thanks!
Aidan Cully
+2  A: 

Your implementation of fib2 uses memoization, but each time you call fib2 it rebuilds the "whole" result. Turn on ghci time and size profiling:

Prelude> :set +s

If it were doing memoisation "between" calls, the subsequent calls would be faster and use no memory. Call fib2 20000 twice and see for yourself.

By comparison, a more idiomatic version, where you define the exact mathematical identity:

-- the infinite list of all fibs numbers.
fibs = 1 : 1 : zipWith (+) fibs (tail fibs)

memoFib n = fibs !! n

actually does use memoisation, explicitly as you can see. If you run memoFib 20000 twice, you'll see the time and space taken the first time, while the second call is instantaneous and takes no memory. No magic and no implicit memoisation, as a comment might have hinted at.

Now about your original question: optimizing and reasoning about performance in Haskell...

I wouldn't call myself an expert in Haskell; I have only been using it for 3 years, 2 of which at my workplace, but I did have to optimize and come to understand how to reason somewhat about its performance.

As mentioned in other posts, laziness is your friend and can help you gain performance; however, YOU have to be in control of what is lazily evaluated and what is strictly evaluated.

Check this comparison of foldl vs foldr

foldl actually stores "how" to compute the value, i.e. it is lazy. In some cases you save time and space by being lazy, as with the "infinite" fibs: the infinite "fibs" doesn't generate all of the numbers, but knows how to. When you know you will need a value, you might as well just get it, "strictly" speaking... That's where strictness annotations are useful, to give you back control.
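
As a minimal sketch of that foldl point (my addition, not part of the original answer): the lazy foldl suspends every addition as a thunk before anything is evaluated, while the strict foldl' from Data.List forces the accumulator at each step:

import Data.List (foldl')

-- Illustrative sketch, not from the original answer.
lazySum :: Integer
lazySum = foldl (+) 0 [1..1000000]    -- builds a chain of ~1,000,000 suspended additions first

strictSum :: Integer
strictSum = foldl' (+) 0 [1..1000000] -- forces the accumulator at each step; constant space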

I recall reading many times that in Lisp you have to "minimize" consing.

Understanding what is strictly evaluated and how to force it is important, but so is understanding how much "trashing" you do to the memory. Remember that Haskell is immutable, which means that updating a "variable" actually creates a copy with the modification. Prepending with (:) is vastly more efficient than appending with (++), because (:) does not copy memory, contrary to (++). Whenever a big atomic block is updated (even for a single char), the whole block needs to be copied to represent the "updated" version. The way you structure data and update it can have a big impact on performance. The ghc profiler is your friend and will help you spot these problems. Sure, the garbage collector is fast, but not having it do anything at all is faster!
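
As a rough illustration of the (:) versus (++) point (my own sketch, not from the answer), building a list by repeatedly appending copies the accumulated prefix on every step (quadratic work overall), while prepending and reversing once allocates a single new cell per element (linear work):

-- Illustrative sketch, not from the original answer.
buildSlow :: Int -> [Int]
buildSlow n = go [] [1..n]
    where go acc []     = acc
          go acc (x:xs) = go (acc ++ [x]) xs   -- (++) copies acc on every step

buildFast :: Int -> [Int]
buildFast n = reverse (go [] [1..n])
    where go acc []     = acc
          go acc (x:xs) = go (x : acc) xs      -- (:) allocates one new cell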

Cheers

JFT
+1  A: 

Since allocation is a major cost in any functional language, an important part of understanding performance is to understand when objects are allocated, how long they live, when they die, and when they are reclaimed. To get this information you need a heap profiler. It's an essential tool, and luckily GHC ships with a good one.
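
A rough sketch of how to produce a heap profile with GHC (my addition, not part of the answer; exact flag names vary a little between GHC versions):

$ ghc -prof -fprof-auto -rtsopts Main.hs
$ ./Main +RTS -hc -RTS    # heap profile broken down by cost centre, written to Main.hp
$ hp2ps -c Main.hp        # render the .hp file to a PostScript graph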

For more information, read Colin Runciman's papers.

Norman Ramsey
+1  A: 

Aside from the memoization issue, fib1 also uses non-tail-call recursion. Tail-call recursion can be refactored automatically into a simple goto and performs very well, but the recursion in fib1 cannot be optimized in this way, because you need the stack frame from each instance of fib1 in order to calculate the result. If you rewrote fib1 to pass a running total as an argument, thus allowing a tail call instead of needing to keep the stack frame for the final addition, the performance would improve immensely. But not as much as the memoized example, of course :)
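
A sketch of the rewrite described above (my addition, not the answerer's code; it mirrors the accumulator approach from Jørgen Fogh's answer):

-- Illustrative sketch: running totals are passed as arguments, so the
-- recursive call is a tail call and no stack frame is kept for an addition.
fib1' n = go 1 1 n
    where
      go a _ 0 = a
      go _ b 1 = b
      go a b k = go b (a + b) (k - 1)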

Justin Smith