views:

271

answers:

5

Let's consider a data type with many constructors:

data T = Alpha Int | Beta Int | Gamma Int Int | Delta Int

I want to write a function to check if two values are produced with the same constructor:

sameK (Alpha _) (Alpha _) = True
sameK (Beta _) (Beta _) = True
sameK (Gamma _ _) (Gamma _ _) = True
sameK _ _ = False

Maintaining sameK is not much fun, it cannot be checked for correctness easily. For example, when new constructors are added to T, it's easy to forget to update sameK. I omitted one line to give an example:

-- it’s easy to forget:
-- sameK (Delta _) (Delta _) = True

The question is how to avoid boilerplate in sameK? Or how to make sure it checks for all T constructors?


The workaround I found is to use separate data types for each of the constructors, deriving Data.Typeable, and declaring a common type class, but I don't like this solution, because it is much less readable and otherwise just a simple algebraic type works for me:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Typeable

class Tlike t where
  value :: t -> t
  value = id

data Alpha = Alpha Int deriving Typeable
data Beta = Beta Int deriving Typeable
data Gamma = Gamma Int Int deriving Typeable
data Delta = Delta Int deriving Typeable

instance Tlike Alpha
instance Tlike Beta
instance Tlike Gamma
instance Tlike Delta

sameK :: (Tlike t, Typeable t, Tlike t', Typeable t') => t -> t' -> Bool
sameK a b = typeOf a == typeOf b
+1  A: 

In some cases, "Scrap Your Boilerplate" library will help.

http://www.haskell.org/haskellwiki/Scrap_your_boilerplate

SK-logic
Thank you, I'll look into this approach.
jetxee
I assume I can start with these papers: http://research.microsoft.com/en-us/um/people/simonpj/papers/hmap/
jetxee
+11  A: 

Another possible way:

sameK x y = f x == f y
  where f (Alpha _)   = 0
        f (Beta _)    = 1
        f (Gamma _ _) = 2
        -- runtime error when Delta value encountered

A runtime error is not ideal, but better than silently giving the wrong answer.

Dave Hinton
I like it. GHC with `-Wall` kindly reports non-exaustive pattern matches in the definition of `f', so runtime error may be prevented. Thank you for the idea.
jetxee
+8  A: 

You'll need to use a generics library like Scrap Your Boilerplate or uniplate to do this in general.

If you don't want to be so heavy-handed, you can use Dave Hinton's solution, together with the empty record shortcut:

...
where f (Alpha {}) = 0
      f (Beta {}) = 1
      f (Gamma {}) = 2

So you don't have to know how many args each constructor has. But it obviously still leaves something to be desired.

luqui
Nice trick with `{}`. I didn't know about it. Thank you.
jetxee
Record braces bind stronger than anything else, so technically you don't need the parentheses. `f Alpha {} = 0` will work fine, although I'm not sure how readable that is, it sort of looks like `f` takes two arguments. I sometimes dabble with `f Alpha{} = 0`...
Tom Lokhorst
+5  A: 

Look at the Data.Data module, the toConstr function in particular. Along with {-# LANGUAGE DeriveDataTypeable #-} that will get you a 1-line solution which works for any type which is an instance of Data.Data. You don't need to figure out all of SYB!

If, for some reason (stuck with Hugs?), that is not an option, then here is a very ugly and very slow hack. It works only if your datatype is Showable (e.g. by using deriving (Show) - which means no function types inside, for example).

constrT :: T -> String
constrT = head . words . show
sameK x y = constrT x == constrT y

constrT gets the string representation of the outermost constructor of a T value by showing it, chopping it up into words and then getting the first. I give an explicit type signature so you're not tempted to use it on other types (and to evade the monomorphism restriction).

Some notable disadvantages:

  • This breaks horribly when your type has infix constructors (such as data T2 = Eta Int | T2 :^: T2)
  • If some of your constructors have a shared prefix, this is going to get slower, as a larger part of the strings has to be compared.
  • Doesn't work on types with a custom show, such as many library types.

That said, it is Haskell 98... but that's about the only nice thing I can say about it!

yatima2975
+1 for `toConstr`
Reid Barton
Wow! It's just magic. Thanks a lot!
jetxee
A: 

You can definitely use the generics to eliminate the boilerplate. Your code is a textbook example why I (and many others never use the _ wildcard at top level). While it is tedious to write out all the cases, it is less tedious than dealing with the bugs.

In this happy example I would not only use Dave Hinton's solution but would slap an INLINE pragma on the auxiliary function f.

Norman Ramsey
Yes, that's why I am asking. Thank you for advice.
jetxee