views:

245

answers:

3

In Haskell, is it possible to write a function with a signature that can accept two different (although similar) data types, and operate differently depending on what type is passed in?

An example might make my question clearer. If I have a function named myFunction, and two types named MyTypeA and MyTypeB, can I define myFunction so that it can only accept data of type MyTypeA or MyTypeB as its first parameter?

type MyTypeA = (Int, Int, Char, Char)
type MyTypeB = ([Int], [Char])

myFunction :: MyTypeA_or_MyTypeB -> Char
myFunction constrainedToTypeA = something
myFunction constrainedToTypeB = somethingElse

In an OOP language, you could write what I'm trying to achieve like so:

public abstract class ConstrainedType {
}

public class MyTypeA extends ConstrainedType {
    ...various members...
}

public class MyTypeB extends ConstrainedType {
    ...various members...
}

...

public Char myFunction(ConstrainedType a) {
    if (a TypeOf MyTypeA) {
        return doStuffA();
    }
    else if (a TypeOf MyTypeB) {
        return doStuffB();
    }
}

I've been reading about algebraic data types and I think I need to define a Haskell type, but I'm not sure how to go about defining it so that it can store one type or another, and also how I use it in my own functions.

+3  A: 

It sounds like you might want to read up on typeclasses.

Jack Kelly
+21  A: 

Yes, you are correct, you are looking for algebraic data types. There is a great tutorial on them at Learn You a Haskell.

For the record, the concept of an abstract class from OOP actually has three different translations into Haskell, and ADTs are just one. Here is a quick overview of the techniques.

Algebraic Data Types

Algebraic data types encode the pattern of an abstract class whose subclasses are known, and where functions check which particular instance the object is a member of by down-casting.

abstract class IntBox { }

class Empty : IntBox { }

class Full : IntBox {
    int inside;
    Full(int inside) { this.inside = inside; }
}

int Get(IntBox a) {
    if (a is Empty) { return 0; }
    if (a is Full)  { return ((Full)a).inside; }
    error("IntBox not of expected type");
}

Translates into:

data IntBox = Empty | Full Int

get :: IntBox -> Int
get Empty = 0
get (Full x) = x

Record of functions

This style does not allow down-casting, so the Get function above would not be expressible in this style. So here is something completely different.

abstract class Animal { 
    abstract string CatchPhrase();
    virtual void Speak() { print(CatchPhrase()); }
}

class Cat : Animal {
    override string CatchPhrase() { return "Meow"; }
}

class Dog : Animal {
    override string CatchPhrase() { return "Woof"; }
    override void Speak() { print("Rowwrlrw"); }
}

Its translation in Haskell doesn't map types into types. Animal is the only type, and Dog and Cat are squashed away into their constructor functions:

data Animal = Animal {
    catchPhrase :: String,
    speak       :: IO ()
}

protoAnimal :: Animal
protoAnimal = Animal {
    speak = putStrLn (catchPhrase protoAnimal)
}

cat :: Animal
cat = protoAnimal { catchPhrase = "Meow" }

dog :: Animal
dog = protoAnimal { catchPhrase = "Woof", speak = putStrLn "Rowwrlrw" }

There are a few different permutations of this basic concept. The invariant is that the abstract type is a record type where the methods are the fields of the record.

EDIT: There is a good discussion in the comments on some of the subtleties of this approach, including a bug in the above code.

Typeclasses

This is my least favorite encoding of OO ideas. It is comfortable to OO programmers because it uses familiar words and maps types to types. But the record of functions approach above tends to be easier to work with when things get complicated.

I'll encode the Animal example again:

class Animal a where
    catchPhrase :: a -> String
    speak       :: a -> IO ()

    speak a = putStrLn (catchPhrase a)

data Cat = Cat 
instance Animal Cat where
    catchPhrase Cat = "Meow"

data Dog = Dog
instance Animal Dog where
    catchPhrase Dog = "Woof"
    speak Dog = putStrLn "Rowwrlrw"

This looks nice, doesn't it? The difficulty comes when you realize that even though it looks like OO, it doesn't really work like OO. You might want to have a list of Animals, but the best you can do right now is Animal a => [a], a list of homogeneous animals, eg. a list of only Cats or only Dogs. Then you need to make this wrapper type:

data AnyAnimal = forall a. Animal a => AnyAnimal a
instance Animal AnyAnimal where
    catchPhrase (AnyAnimal a) = catchPhrase a
    speak (AnyAnimal a) = speak a

And then [AnyAnimal] is what you want for your list of animals. However, it turns out that AnyAnimal exposes exactly the same information about itself as the Animal record in the second example, we've just gone about it in a roundabout way. Thus why I don't consider typeclasses to be a very good encoding of OO.

And thus concludes this week's edition of Way Too Much Information!

luqui
+1, very good explanation with nice examples
Landei
Your record of functions example contains a bug that conveniently illustrates why this encoding of OO is also somewhat awkward. If I run `speak cat` from that example, I get an exception: `Missing field in record construction Main.catchPhrase`. OO languages and typeclasses tie the recursive knot for you, but this must be done manually for "methods" held in records. So we should have `speak :: Animal -> IO ()`, with `putStrLn . catchPhrase` as its prototypical definition and `speak cat cat` as the unfortunate invocation syntax.
dhaffey
@dhaffey, Oh! Good observation, and it's obvious now that I see it. I guess non-typeclass thing to do is use the open fixpoint pattern. So `animalClass :: Animal -> Animal`, and then definitions begin `dog this =`, and we have `instantiate = fix`.
luqui
An excellent answer, very informative! Algebraic data types do seem to be what I need for the program I'm writing and seem to work quite well.
ultrafez
@dhaffey: Another option is to make the constructor recursive and use a specialized fixpoint operator to tie the knot at time of construction. A type alias `type New t = t -> t` is handy too. So: `protoAnimal :: New Animal`, `protoAnimal self = Animal {speak = putStrLn (catchPhrase self)}`, and `dog self = (protoAnimal self) { catchPhrase = "Woof", speak = putStrLn "Rowwrlrw" }`. With `new init = self where self = init self`, the 'knot' would be properly tied by `fido = new dog`, and the methods behave as expected. I believe this can all be generalized to 'monadic' constructors as well.
mokus
@myself: Using a `Monad` instance for `(->) t` (available in `Control.Monad.Instances`), the `cat` and `dog` constructors can be made to look a little more conventional as well: `dog = do self <- protoAnimal; return self { catchPhrase = "Woof", speak = putStrLn "Rowwrlrw" }`
mokus
A: 

typeclasses in haskell are what interfaces in java are. more than that: you can define multi-classes and default implementations and define those interface(class)-objecttype(data/newtype/type)-relations(instance) independant of each other.

class MyTypeA_or_MyTypeB mytype where
    myFunction :: mytype -> Char

type MyTypeA = (Int, Int, Char, Char)
instance MyTypeA_or_MyTypeB MyTypeA where
    myFunction constrainedToTypeA = something

type MyTypeB = ([Int], [Char])
instance MyTypeA_or_MyTypeB MyTypeB where
    myFunction constrainedToTypeB = somethingElse

f :: (MyTypeA_or_MyTypeB mytype) => (mytype,mytype) -> String
f (x,y) = myFunction x : myFunction y : ""

[edit]

a complete other way could be to use the Either type. you find it in the Prelude and in Data.Either:

data Either a b = Left a | Right b

myFunction :: Either MyTypeA MyTypeB -> Foo
myFunction (Left myA) = bar
myFunction (Right myB) = baz

of course, you could define your own type, too. but for that, it would be good to know what you really want.

comonad
This is masturbation. If all you know about your argument is that it is a MyTypeA_or_MyTypeB, then all you know about it is that it can be converted to a Char. Why not just have your function take a Char instead?
luqui
@luqui: because it is just an example to show how to use it, nothing more, nothing less. how do you relate m***tion to it?! are you really the same person as above? interfaces in OOP equal single-parameter type classes without any higher order types; an example should explain enough.
comonad
As luqui said, as much as it's an interesting example it doesn't allow me to differentiate between the two original types in the function.
ultrafez