views:

215

answers:

5

I wanted to test foldl vs foldr. From what I've seen you should use foldl over foldr when ever you can due to tail reccursion optimization.

This makes sense. However, after running this test I am confused:

foldr (takes 0.057s when using time command):

a::a -> [a] -> [a]
a x = ([x] ++ )

main = putStrLn(show ( sum (foldr a [] [0.. 100000])))

foldl (takes 0.089s when using time command):

b::[b] -> b -> [b]
b xs = ( ++ xs). (\y->[y])

main = putStrLn(show ( sum (foldl b [] [0.. 100000])))

It's clear that this example is trivial, but I am confused as to why foldr is beating foldl. Shouldn't this be a clear case where foldl wins?

+1  A: 

Well, let me rewrite your functions in a way that difference should be obvious -

a :: a -> [a] -> [a]
a = (:)

b :: [b] -> b -> [b]
b = flip (:)

You see that b is more complex than a. If you want to be precise a needs one reduction step for value to be calculated, but b needs two. That makes the time difference you are measuring, in second example twice as much reductions must be performed.

//edit: But time complexity is the same, so I wouldn't bother about it much.

Matajon
I tried changing it to `a = flip $ flip (:)` and that didn't change the execution time noticeably, so I don't think that flipping the arguments to accommodate foldl is the problem.
Harold L
+9  A: 

Welcome to the world of lazy evaluation.

When you think about it in terms of strict evaluation, foldl looks "good" and foldr looks "bad" because foldl is tail recursive, but foldr would have to build a tower in the stack so it can process the last item first.

However, lazy evaluation turns the tables. Take, for example, the definition of the map function:

map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

This wouldn't be too good if Haskell used strict evaluation, since it would have to compute the tail first, then prepend the item (for all items in the list). The only way to do it efficiently would be to build the elements in reverse, it seems.

However, thanks to Haskell's lazy evaluation, this map function is actually efficient. Lists in Haskell can be thought of as generators, and this map function generates its first item by applying f to the first item of the input list. When it needs a second item, it just does the same thing again (without using extra space).

It turns out that map can be described in terms of foldr:

map f xs = foldr (\x ys -> f x : ys) [] xs

It's hard to tell by looking at it, but lazy evaluation kicks in because foldr can give f its first argument right away:

foldr f z []     = z
foldr f z (x:xs) = f x (foldr f z xs)

Because the f defined by map can return the first item of the result list using solely the first parameter, the fold can operate lazily in constant space.

Now, lazy evaluation does bite back. For instance, try running sum [1..1000000]. It yields a stack overflow. Why should it? It should just evaluate from left to right, right?

Let's look at how Haskell evaluates it:

foldl f z []     = z
foldl f z (x:xs) = foldl f (f z x) xs

sum = foldl (+) 0

sum [1..1000000] = foldl (+) 0
                 = foldl (+) ((+) 0 1) [2..1000000]
                 = foldl (+) ((+) ((+) 0 1) 2) [3..1000000]
                 = foldl (+) ((+) ((+) ((+) 0 1) 2) 3) [4..1000000]
                   ...

Haskell is too lazy to go ahead and perform the additions, so this yields a stack overflow. Fortunately, there is a special function in Data.List called foldl' that operates strictly. foldl' (+) 0 [1..1000000] will not stack overflow.

Note: I tried replacing foldl with foldl' in your test, but it actually made it run slower.

Joey Adams
A: 

Neither foldl nor foldr is tail optimized. It is only foldl'.

But in your case using ++ with foldl' is not good idea because successive evaluation of ++ will cause traversing growing accumulator again and again.

Hynek -Pichi- Vychodil
A: 

For a, the [0.. 100000] list needs to be expanded right away so that foldr can start with the last element. Then as it folds things together, the intermediate results are

[100000]
[99999, 100000]
[99998, 99999, 100000]
...
[0.. 100000] -- i.e., the original list

Because nobody is allowed to change this list value (Haskell is a pure functional language), the compiler is free to reuse the value. The intermediate values, like [99999, 100000] can even be simply pointers into the expanded [0.. 100000] list instead of separate lists.

For b, look at the intermediate values:

[0]
[0, 1]
[0, 1, 2]
...
[0, 1, ..., 99999]
[0.. 100000]

Each of those intermediate lists can't be reused, because if you change the end of the list then you've changed any other values that point to it. So you're creating a bunch of extra lists that take time to build in memory. So in this case you spend a lot more time allocating and filling in these lists that are intermediate values.

Since you're just making a copy of the list, a runs faster because it starts by expanding the full list and then just keeps moving a pointer from the back of the list to the front.

Harold L
+5  A: 

EDIT: Upon looking at this problem again, I think all current explanations are somewhat insufficient so I've written a longer explanation.

The difference is in how foldl and foldr apply their reduction function. Looking at the foldr case, we can expand it as

foldr (\x -> [x] ++ ) [] [0..10000]
[0] ++ foldr a [] [1..10000]
[0] ++ ([1] ++ foldr a [] [2..10000])
...

This list is processed by sum, which consumes it as follows:

sum = foldl' (+) 0
foldl' (+) 0 ([0] ++ ([1] ++ ... ++ [10000]))
foldl' (+) 0 (0 : [1] ++ ... ++ [10000])     -- get head of list from '++' definition
foldl' (+) 0 ([1] ++ [2] ++ ... ++ [10000])  -- add accumulator and head of list
foldl' (+) 0 (1 : [2] ++ ... ++ [10000])
foldl' (+) 1 ([2] ++ ... ++ [10000])
...

I've left out the details of the list concatenation, but this is how the reduction proceeds. The important part is that everything gets processed in order to minimize list traversals. The foldr only traverses the list once, the concatenations don't require continuous list traversals, and sum finally consumes the list in one pass. Critically, the head of the list is available from foldr immediately to sum, so sum can begin working immediately and values can be gc'd as they are generated. With fusion frameworks such as vector, even the intermediate lists will likely be fused away.

Contrast this to the foldl function:

b xs = ( ++xs) . (\y->[y])
foldl b [] [0..10000]
foldl b ( [0] ++ [] ) [1..10000]
foldl b ( [1] ++ ([0] ++ []) ) [2..10000]
foldl b ( [2] ++ ([1] ++ ([0] ++ [])) ) [3..10000]
...

Note that now the head of the list isn't available until foldl has finished. This means that the entire list must be constructed in memory before sum can begin to work. This is much less efficient overall. Running the two versions with +RTS -s shows miserable garbage collection performance from the foldl version.

This is also a case where foldl' will not help. The added strictness of foldl' doesn't change the way the intermediate list is created. The head of the list remains unavailable until foldl' has finished, so the result will still be slower than with foldr.

I use the following rule to determine the best choice of fold

  • For folds that are a reduction, use foldl' (e.g. this will be the only/final traversal)
  • Otherwise use foldr.
  • Don't use foldl.

In most cases foldr is the best fold function because the traversal direction is optimal for lazy evaluation of lists. It's also the only one capable of processing infinite lists. The extra strictness of foldl' can make it faster in some cases, but this is dependent on how you'll use that structure and how lazy it is.

John
Unfortunately, `sum` uses foldl, not foldl' (unless they fixed that recently).
Joey Adams
Wishful thinking on my part. The argument still stands; you just need to replace the accumulator with a giant thunk.
John