views:

825

answers:

3

I'm trying to understand how Haskell list comprehensions work "under the hood" in regards to pattern matching. The following ghci output illustrates my point:

Prelude> let myList = [Just 1, Just 2, Nothing, Just 3]
Prelude> let xs = [x | Just x <- myList]
Prelude> xs
[1,2,3]
Prelude>

As you can see, it is able to skip the "Nothing" and select only the "Just" values. I understand that List is a monad, defined as (source from Real World Haskell, ch. 14):

instance Monad [] where
    return x = [x]
    xs >>= f = concat (map f xs)
    xs >> f = concat (map (\_ -> f) xs)
    fail _ = []

Therefore, a list comprehension basically builds a singleton list for every element selected in the list comprehension and concatenates them. If a pattern match fails at some step, the result of the "fail" function is used instead. In other words, the "Just x" pattern doesn't match so [] is used as a placeholder until 'concat' is called. That explains why the "Nothing" appears to be skipped.

What I don't understand is, how does Haskell know to call the "fail" function? Is it "compiler magic", or functionality that you can write yourself in Haskell? Is it possible to write the following "select" function to work the same way as a list comprehension?

select :: (a -> b) -> [a] -> [b]
select (Just x -> x) myList       -- how to prevent the lambda from raising an error?
[1,2,3]
+1  A: 

I don't think the list comprehension syntax has much to do with the fact that List ([]), or Maybe for that matter, happens to be an instance of the Monad type class.

List comprehensions are indeed compiler magic or syntax sugar, but that's possible because the compiler knows the structure of the [] data type.

Here's what the list comprehension is compiled to: (Well, I think, I didn't actually check it against the GHC)

xs = let f = \xs -> case xs of
                      Just x -> [x]
                      _      -> []
     in concatMap f myList

As you can see, the compiler doesn't have to call the fail function, it can simply inline a empty list, because it knows what a list is.


Interestingly, this fact that the list comprehensions syntax 'skips' pattern match failures is used in some libraries to do generic programming. See the example in the Uniplate library.


Edit: Oh, and to answer your question, you can't call your select function with the lambda you gave it. It will indeed fail on a pattern match failure if you call it with an Nothing value.

You could pass it the f function from the code above, but than select would have the type:

select :: (a -> [b]) -> [a] -> [b]

which is perfectly fine, you can use the concatMap function internally :-)

Also, that new select now has the type of the monadic bind operator for lists (with its arguments flipped):

(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concatMap f xs -- 'or as you said: concat (map f xs)
Tom Lokhorst
Good explanation, thanks. What I read at http://en.wikibooks.org/wiki/Haskell/Pattern_matching about list comprehensions must be wrong then (it is a wiki, so I'm not surprised).
Cybis
Sorry I unaccepted your answer, but I think Porges's is more accurate because it explains why List moands behave so much like list comprehensions.
Cybis
+5  A: 

The rule for desugaring a list comprehension requires an expression of the form [ e | p <- l ] (where e is an expression, p a pattern, and l a list expression) behave like

let ok p = [e]
    ok _ = []
in concatMap ok l

Previous versions of Haskell had monad comprehensions, which were removed from the language because they were hard to read and redundant with the do-notation. (List comprehensions are redundant, too, but they aren't so hard to read.) I think desugaring [ e | p <- l ] as a monad (or, to be precise, as a monad with zero) would yield something like

let ok p = return e
    ok _ = mzero
in l >>= ok

where mzero is from the MonadPlus class. This is very close to

do { p <- l; return e }

which desugars to

let ok p = return e
    ok _ = fail "..."
in l >>= ok

When we take the List Monad, we have

return e = [e]
mzero = fail _ = []
(>>=) = flip concatMap

I.e., the 3 approaches (list comprehensions, monad comprehensions, do expressions) are equivalent for lists.

Chris Conway
'e' isn't necessarily of type List. Wouldn't it be "let ok p = [e]"? That would work. Regardless, if what you say is correct, then http://en.wikibooks.org/wiki/Haskell/Pattern_matching must be wrong.
Cybis
You're right, I made a mistake in the translation.
Chris Conway
The Report says, "List comprehensions satisfy these identities"... not "you must translate them this way". They just have to work as though you have :)
Porges
That's true, Porges. But "X desugars to an expression which is observationally equivalent to Y" is a little much... ;-)
Chris Conway
+11  A: 

While implemenatations of Haskell might not do it directly like this internally, it is helpful to think about it this way :)

[x | Just x <- myList]

... becomes:

do
    Just x <- myList
    return x

... which is:

myList >>= \(Just x) -> return x

As to your question:

What I don't understand is, how does Haskell know to call the "fail" function?

In do-notation, if a pattern binding fails (i.e. the Just x), then the fail method is called. For the above example, it would look something like this:

myList >>= \temp -> case temp of
    (Just x) -> return x
    _        -> fail "..."

So, every time you have a pattern-match in a monadic context that may fail, Haskell inserts a call to fail. Try it out with IO:

main = do
    (1,x) <- return (0,2)
    print x -- x would be 2, but the pattern match fails
Porges
Thank you! This response matches what I've been reading and explains the pattern-match behavior for types other than list. It hadn't occurred to me that the compiler might be using case statements like this to determine when to call fail. This totally makes sense.
Cybis