Say we define
naivemean l = sum l / fromIntegral (length l)
It has a couple of serious limitations. First, the definition does not cover lists of Int
:
*Main> naivemean ([1] :: [Int])
<interactive>:1:0:
No instance for (Fractional Int)
arising from a use of `naivemean' at <interactive>:1:0-21
Possible fix: add an instance declaration for (Fractional Int)
In the expression: naivemean ([1] :: [Int])
In the definition of `it': it = naivemean ([1] :: [Int])
Second, it blows the stack for large lists:
*Main> naivemean [1..1000000]
*** Exception: stack overflow
Also, it makes two passes, one for sum
and one for length
, over the list when a single pass will do.
We can correct all three problems with
{-# LANGUAGE BangPatterns #-}
mean :: (Real a, Fractional b) => [a] -> b
mean = go (toRational 0) 0
where
go !s !l [] = fromRational s / fromIntegral l
go s l (x:xs) = go (s+.x) (l+1) xs
s +. x = s + toRational x
Bang patterns force strict evaluation of the marked parameters. Without the bangs, the code above will also blow the stack when given a long list, but for a different reason: lazy evaluation of l
for example generates a long unevaluated chain of the form
0 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + ...
In this case, lazy evaluation is creating more work in the form of allocating all the suspended computations than it would be to simply increment the counter on every iteration.
As a beginner, fromRational
and toRational
are probably confusing too. Consider the type of the division function:
*Main> :t (/)
(/) :: (Fractional a) => a -> a -> a
That means division is defined for any two values of the same type that is an instance of Fractional. Int
is not one of those types:
*Main> (1::Int) / (2::Int)
<interactive>:1:0:
No instance for (Fractional Int)
arising from a use of `/' at <interactive>:1:0-18
Possible fix: add an instance declaration for (Fractional Int)
In the expression: (1 :: Int) / (2 :: Int)
In the definition of `it': it = (1 :: Int) / (2 :: Int)
One definition of mean
ought to cover all of [Int]
, [Float]
, and [Double]
but without the rational bits (and without the type annotation), the type inferred for mean
is
*Main> :t mean
mean :: [Double] -> Double
because of the division by the length of the list.
It turns out that Int
, Float
, and Double
are instances of the typeclass Real
,and any Real
may be converted to Rational
*Main> :t toRational
toRational :: (Real a) => a -> Rational
and Rational
may be converted to Fractional
:
*Main> :t fromRational
fromRational :: (Fractional a) => Rational -> a
Finally, for large lists, there's also a chance that we could overflow a machine double, but Rational
gives us arbitrary precision.
If you have a background in languages such as C or Java that promote types automatically to handle these cases, you'll find this particular inflexibility of Haskell's type system to be confusing and frustrating. I'm afraid you just have to learn to deal with it.
Having done all that, we can now
*Main> mean ([1..1000000] :: [Int])
500000.5
or
*Main> mean ([1..1000000] :: [Double])
500000.5