views:

434

answers:

5

Is it recommended to always have exhaustive pattern matches in Haskell, even for "impossible" cases?

For example, in the following code, I am pattern matching on the "accumulator" of a foldr. I am in complete control of the contents of the accumulator, because I create it (it is not passed to me as input, but rather built within my function). Therefore, I know certain patterns should never match it. If I strive to never get the "Pattern match(es) are non-exhaustive" error, then I would place a pattern match for it that simply error's with the message "This pattern should never happen." Much like an assert in C#. I can't think of anything else to do there.

What practice would you recommend in this situation and why?

Here's the code:

gb_groupBy p input = foldr step [] input
   where
      step item acc = case acc of
           []                           -> [[item]]
           ((x:xs):ys)                  -> if p x item
                                           then (item:x:xs):ys
                                           else [item]:acc

The pattern not matched (as reported by the interpreter) is:

Warning: Pattern match(es) are non-exhaustive In a case alternative: Patterns not matched: [] : _

+7  A: 

This is probably more a matter of style than anything else. Personally, I would put in a

_ -> error "Impossible! Empty list in step"

if only to silence the warning :)

bdonlan
I agree with this approach. It helps indicate that you intentionally did not handle the other cases because you think it's impossible, rather than you just forgetting or not realizing.
newacct
It's always fun when your program suddenly terminates with this descriptive message: *** Exception: The impossible happened!
Tom Lokhorst
Yeah, reminds me of when I was very new to programming, and I'd have comments like "You should never reach this line". :)
Charlie Flowers
+3  A: 

You can resolve the warning in this special case by doing this:

gb_groupBy p input = foldr step [] input
   where
     step item acc = case acc of
        []                           -> [[item]]
        (xs:xss)                      -> if p (head xs) item
                                         then  (item:xs):xss
                                         else [item]:acc

The pattern matching is then complete, and the "impossible" condition of an empty list at the head of the accumulator would cause a runtime error but no warning.

Another way of looking at the more general problem of incomplete pattern matchings is to see them as a "code smell", i.e. an indication that we're trying to solve a problem in a suboptimal, or non-Haskellish, way, and try to rewrite our functions.

Implementing groupBy with a foldr makes it impossible to apply it to an infinite list, which is a design goal that the Haskell List functions try to achieve wherever semantically reasonable. Consider

take 5 $ groupBy (==) someFunctionDerivingAnInfiniteList

If the first 5 groups w.r.t. equality are finite, lazy evaluation will terminate. This is something you can't do in a strictly evaluated language. Even if you don't work with infinite lists, writing functions like this will yield better performance on long lists, or avoid the stack overflow that occurs when evaluating expressions like

take 5 $ gb_groupBy (==) [1..1000000]

In List.hs, groupBy is implemented like this:

groupBy      :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []    =  []
groupBy eq (x:xs)   =  (x:ys) : groupBy eq zs
                           where (ys,zs) = span (eq x) xs

This enables the interpreter/compiler to evaluate only the parts of the computation necessary for the result. span yields a pair of lists, where the first consists of (consecutive) elements from the head of the list all satisfying a predicate, and the second is the rest of the list. It's also implemented to work on infinite lists.

Christoph
+2  A: 

To follow up on my earlier comment, I realised that there is a way to acknowledge the missing case but still get a useful error with file/line number. It's not ideal as it'll only appear in unoptimized builds, though (see here).

...
[]:xs -> assert False (error "unreachable because I know everything")
Ganesh Sittampalam
Nice. Thanks for the follow-up.
Charlie Flowers
+4  A: 

I find exhaustiveness checking on case patterns indispensible. I try never to use _ in a case at top level, because _ matches everything, and by using it you vitiate the value of exhaustiveness checking. This is less important with lists but critical important with user-defined algebraic data types, because I want to be able to add a new constructor and have the compiler barf on all the missing cases. For this reason I always compile with -Werror turned on, so there is no way I can leave out a case.

As observed, your code can be extended with this case

[] : _ -> error "this can't happen"

Internally, GHC has a panic function, which unlike error will give source coordinates, but I looked at the implementation and couldn't make head or tail of it.

Norman Ramsey
I really like that as a guideline. When programming in C#, I've often wished for something like "exhaustiveness checking". For example, if I had an enum, and a case statement to handle each enum member, I wanted to be able to tell the compiler "make sure I've covered all the cases". I had no idea that such a concept would be part of the foundation of Haskell and other functional langs.
Charlie Flowers
Oh yes, I always feel guilty when using enums and switch. But its so much more concise than creating a whole class hierarchy and splitting up one serious method into many tiny virtual methods.
SealedSun
Yeah. I'm a big believer in polymorhism, but even aside from that, there are cases where you want to "fork" on a value by breaking it down into subcategories (less than 3, 3 or higher but odd, and 3 or higher but even). If you express it incorrectly, you have "boundary" errors. But the compiler could easily catch that for you if it knew you were trying to cover all the possible ranges. And that's one of the things pattern matching in a lot of the functional languages does.
Charlie Flowers
Wow, Norman, it's almost a year later and only now am I getting your pun. I guess the first time I read it I didn't make head or tail of it :)
Charlie Flowers
A: 

The type system is your friend, and the warning is letting you know your function has cracks. The very best approach is to go for a cleaner, more elegant fit between types.

Consider ghc's definition of groupBy:

groupBy                 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []           =  []
groupBy eq (x:xs)       =  (x:ys) : groupBy eq zs
                           where (ys,zs) = span (eq x) xs
Greg Bacon