views:

2438

answers:

5

I wrote this snippet of code and I assume len is tail-recursive, but a stack overflow still occurs. What is wrong?

myLength :: [a] -> Integer

myLength xs = len xs 0
    where len [] l = l
          len (x:xs) l = len xs (l+1)

main = print $ myLength [1..10000000]
+9  A: 

Seems like laziness causes len to build thunk:

len [1..100000] 0
-> len [2..100000] (0+1)
-> len [3..100000] (0+1+1)

and so on. You must force len to reduce l every time:

len (x:xs) l = l `seq` len xs (l+1)

For more information, look http://haskell.org/haskellwiki/Stack_overflow.

mattiast
I can't find what `seq` do.
Hynek -Pichi- Vychodil
Heh, I found it, it force l to evaluate so in next recursion thunk (l+1) is evaluated before next len apply. It's not so much easy to read and understand.
Hynek -Pichi- Vychodil
In fact that page links to a page which answers the question exactly: http://haskell.org/haskellwiki/Performance/Accumulating_parameter .
slim
+22  A: 

Remember that Haskell is lazy. Your computation (l+1) will not occur until it's absolutely necessary.

The 'easy' fix is to use '$!' to force evaluation:

myLength :: [a] -> Integer
myLength xs = len xs 0
where len [] l = l
      len (x:xs) l = len xs $! (l+1)

      main = print $ myLength [1..10000000]
eelco
A: 

Just so you know, there's a much easier way to write this function:

myLength xs = foldl step 0 xs where step acc x = acc + 1

Alex

Alex Fort
myLength = foldl (+) 0 is also the same, although it takes more sophisticated reasoning to prove it. The optimizer will make it efficient, so there's no need to explicitly tail recurse.
luqui
You are not right: *Main> foldl (+) 0 [1..1000000]*** Exception: stack overflow
Hynek -Pichi- Vychodil
And wrong result also, you sum list instead *Main> foldl (+) 0 [1..10]=> 55
Hynek -Pichi- Vychodil
That can be fixed using foldl', which is strict version of foldl.
mattiast
+3  A: 

The foldl carries the same problem; it builds a thunk. You can use foldl' from Data.List to avoid that problem:

import Data.List
myLength = foldl' (const.succ) 0

The only difference between foldl and foldl' is the strict accumulation, so foldl' solves the problem in the same way as the seq and $! examples above. (const.succ) here works the same as (\a b -> a+1), though succ has a less restrictive type.

+1  A: 

eelco.lempsink.nl answers the question you asked. Here's a demonstration of Yann's answer:

module Main
    where

import Data.List
import System.Environment (getArgs)

main = do
  n <- getArgs >>= readIO.head
  putStrLn $ "Length of an array from 1 to " ++ show n
               ++ ": " ++ show (myLength [1..n])

myLength :: [a] -> Int
myLength = foldl' (const . succ) 0

foldl' goes through the list from left to right each time adding 1 to an accumulator which starts at 0.

Here's an example of running the program:


C:\haskell>ghc --make Test.hs -O2 -fforce-recomp
[1 of 1] Compiling Main             ( Test.hs, Test.o )
Linking Test.exe ...

C:\haskell>Test.exe 10000000 +RTS -sstderr
Test.exe 10000000 +RTS -sstderr

Length of an array from 1 to 10000000: 10000000
     401,572,536 bytes allocated in the heap
          18,048 bytes copied during GC
           2,352 bytes maximum residency (1 sample(s))
          13,764 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:   765 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     1 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.27s  (  0.28s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.27s  (  0.28s elapsed)

  %GC time       0.0%  (0.7% elapsed)

  Alloc rate    1,514,219,539 bytes per MUT second

  Productivity 100.0% of total user, 93.7% of total elapsed


C:\haskell>
Michael Steele