views:

372

answers:

5

Hi, I am a C++ / Java programmer and the main paradigm I happen to use in everyday programming is OOP. In some thread I read a comment that Type classes are more intuitive in nature than OOP. Can someone explain the concept of type classes in simple words so that an OOP guy like me can understand it?

+3  A: 

A type class can be compared to the concept of 'implementing' an interface. If some data type in Haskell implements the "Show" interface, it can be used with all functions that expect a "Show" object.

xtofl
... a bit like the lost 'concepts' in C++...
xtofl
+9  A: 

Well, the short version is: Type classes are what Haskell uses for ad-hoc polymorphism.

...but that probably didn't clarify anything for you.

Polymorphism should be a familiar concept to people from an OOP background. The key point here, however, is the difference between parametric and ad-hoc polymorphism.

Parametric polymorphism means functions that operate on a structural type that itself is parameterized by other types, such as a list of values. Parametric polymorphism is pretty much the norm everywhere in Haskell; C# and Java call it "generics". Basically, a generic function does the same thing to a specific structure, no matter what the type parameters are.

Ad-hoc polymorphism, on the other hand, means a collection of distinct functions, doing different (but conceptually related) things depending on types. Unlike parametric polymorphism, ad-hoc polymorphic functions need to be specified individually for each possible type they can be used with. Ad-hoc polymorphism is thus a generalized term for a variety of features found in other languages, such as function overloading in C/C++ or class-based dispatch polymorphism in OOP.

A major selling point of Haskell's type classes over other forms of ad-hoc polymorphism is greater flexibility due to allowing polymorphism anywhere in the type signature. For instance, most languages won't distinguish overloaded functions based on return type; type classes can.

Interfaces as found in many OOP languages are somewhat similar to Haskell's type classes--you specify a group of function names/signatures that you want to treat in an ad-hoc polymorphic fashion, then explicitly describe how various types can be used with those functions. Haskell's type classes are used similarly, but with greater flexibility: you can write arbitrary type signatures for the type class functions, with the type variable used for instance selection appearing anywhere you like, not just as the type of an object that methods are being called on.

Some Haskell compilers--including the most popular, GHC--offer language extensions that make type classes even more powerful, such as multi-parameter type classes, which let you do ad-hoc polymorphic function dispatch based on multiple types (similar to what's called "multiple dispatch" in OOP).


To try to give you a bit of the flavor of it, here's some vaguely Java/C#-flavored pseudocode:

interface IApplicative<>
{
    IApplicative<T> Pure<T>(T item);
    IApplicative<U> Map<T, U>(Function<T, U> mapFunc, IApplicative<T> source);
    IApplicative<U> Apply<T, U>(IApplicative<Function<T, U>> apFunc, IApplicative<T> source);
}

interface IReducible<>
{
    U Reduce<T,U>(Function<T, U, U> reduceFunc, U seed, IReducible<T> source);
}

Note that we're, among other things, defining an interface over a generic type and defining a method where the interface type appears only as the return type, Pure. Not apparent is that every use of the interface name should mean the same type (i.e., no mixing different types that implement the interface), but I wasn't sure how to express that.

camccann
+6  A: 

In addition to what xtofl and camccann have already written in their excellent answers, a useful thing to notice when comparing Java's interfaces to Haskell's type classes is the following:

  1. Java interfaces are closed, meaning that the set of interfaces any given class implements is decided once and for all when and where it is defined;

  2. Haskell's type classes are open, meaning that any type (or group of types for multi-parameter type classes) can be made a member of any type class at any time, as long as suitable definitions can be provided for the functions defined by the type class.

This openness of type classes (and Clojure's protocols, which are very similar) is a very useful property; it is quite usual for a Haskell programmer to come up with a new abstraction and immediately apply it to a range of problems involving pre-existing types through clever use of type classes.

Michał Marczyk
To be fair, type classes being open has downsides as well. But you have to push the system pretty far (and maybe use some GHC extensions) to really run into them.
camccann
Said downsides can be felt readily whenever you make an orphan instance. Say good-bye to modularity. I avoid orphan instances like the plague.
luqui
+8  A: 

In C++/etc, "virtual methods" are dispatched according to the type of the this/self implicit argument. (The method is pointed to in a function table which the object implicitly points to)

Type classes work differently, and can do everything that "interfaces" can and more. Let's start with a simple example of something that interfaces can't do: Haskell's Read type-class.

ghci> -- this is a Haskell comment, like using "//" in C++
ghci> -- and ghci is an interactive Haskell shell
ghci> 3 + read "5" -- Haskell syntax is different, in C: 3 + read("5")
8
ghci> sum (read "[3, 5]") -- [3, 5] is a list containing 3 and 5
8
ghci> -- let us find out the type of "read"
ghci> :t read
read :: (Read a) => String -> a

read's type is (Read a) => String -> a, which means for every type that implements the Read type-class, read can convert a String to that type. This is dispatch based on return type, impossible with "interfaces".

This can't be done in C++ et al's approach, where the function table is retrieved from the object - here, you don't even have the relevant object until after read returns it so how could you call it?

A key implementation difference from interfaces that allows this to happen, is that the function table isn't pointed to inside the object, it is passed separately by the compiler to the called functions.

Additionally, in C++/etc when one defines a class they are also responsible on implementing their interfaces. This means that you can't just invent a new interface and make Int or std::vector implement it.

In Haskell you can, and it isn't by "monkey patching" like in Ruby. Haskell has a good name-spacing scheme that means that two type classes can both have a function of the same name and a type can still implement both.

This allows Haskell to have many simple classes like Eq (types that support equality checking), Show (types that can be printed to a String), Read (types that can be parsed from a String), Monoid (types that have a concatenation operation and an empty element) and many more, and allows for even the primitive types like Int to implement the appropriate type-classes.

With the richness of type-classes people tend to program to more general types and then have more reusable functions and since they also have less freedom when the types are general they may even produce less bugs!

tldr: type-classes == awesome

yairchu
Well, I don't think your `sum (read "[3, 5]")` is a great example since although it works, `sum (read "[3.1, 5.2]")` fails with an error (even though of course `sum [3.1, 5.2]` succeeds). I suspect your example only works due to type defaulting rules, which are handy but frankly a bit of a hack.
j_random_hacker
+12  A: 

First, I am always very suspicious of claims that this or that program structure is more intuitive. Programming is counter-intuitive and always will be because people naturally think in terms of specific cases rather than general rules. Changing this requires training and practice, otherwise known as "learning to program".

Moving on to the meat of the question, the key difference between OO classes and Haskell typeclasses is that in OO a class (even an interface class) is both a type and a template for new types (descendants). In Haskell a typeclass is only a template for new types. More precisely, a typeclass describes a set of types that share a common interface, but it is not itself a type.

So the typeclass "Num" describes numeric types with addition, subtraction and multiplication operators. The "Integer" type is an instance of "Num", meaning that Integer is a member of the set of types that implement those operators.

So I can write a summation function with this type:

sum :: Num a => [a] -> a

The bit to to the left of the "=>" operator says that "sum" will work for any type "a" that is an instance of Num. The bit to the right says it takes a list of values of type "a" and returns a single value of type "a" as a result. So you could use it to sum a list of Integers or a list of Doubles or a list of Complex, because they are all instances of "Num". The implementation of "sum" will use the "+" operator of course, which is why you need the "Num" type constraint.

However you cannot write this:

sum :: [Num] -> Num

because "Num" is not a type.

This distinction between type and typeclass is why we don't talk about inheritance and descendants of types in Haskell. There is a sort of inheritance for typeclasses: you can declare one typeclass as a descendant of another. The descendant here describes a subset of the types described by parent.

An important consequence of all this is that you can't have heterogenous lists in Haskell. In the "sum" example you can pass it a list of integers or a list of doubles, but you cannot mix doubles and integers in the same list. This looks like a tricky restriction; how would you implement the old "cars and lorries are both types of vehicle" example? There are several answers depending on the problem you are actually trying to solve, but the general principle is that you do your indirection explicitly using first-class functions rather than implicitly using virtual functions.

Paul Johnson
Superclasses from contexts are a pretty sorry excuse for "inheritance", though, since they require you to manually write instances for all the parent classes. Not to mention bothersome issues like "why isn't Functor a superclass of Monad"...
camccann
Having come from an OO world before learning Haskell, I've come to think of inheritance as a pretty sorry excuse for first class functions.
Paul Johnson
Functor is a superclass of Monad. You are just using the wrong implementation of Monad.
jrockway