Having briefly looked at Haskell recently I wondered whether anybody could give a brief, succinct, practical explanation as to what a monad essentially is? I have found most explanations I've come across to be fairly inaccessible and lacking in practical detail, so could somebody here help me?
A monad is a construct used to encapsulate computations that involve changing state. It is most often encountered in languages that otherwise do not allow you to have modifiable state (e.g., Haskell).
An example would be for file IO.
You would be able to use a monad for file IO to isolate the stateful nature to just the code that uses the monad. The code inside the monad can effectively ignore the changing state of the world outside the monad - this makes it a lot easier to reason about the overall effect of your program.
A monad is a datatype that has two operations: `>>=` (aka `bind`) and `return` (aka `unit`). `return` takes an arbitrary value and creates an instance of the monad with it. `>>=` takes an instance of the monad and maps a function over it. (You can see already that a monad is a strange kind of datatype, since in most programming languages you couldn't write a function that takes an arbitrary value and creates a type from it. Monads use a kind of parametric polymorphism.)
In Haskell notation, the monad interface is written

```haskell
class Monad m where
  return :: a -> m a
  (>>=)  :: forall a b . m a -> (a -> m b) -> m b
```
These operations are supposed to obey certain "laws", but that's not terrifically important: the "laws" just codify the way sensible implementations of the operations ought to behave (basically, that `>>=` and `return` ought to agree about how values get transformed into monad instances and that `>>=` is associative).
Monads are not just about state and IO: they abstract a common pattern of computation that includes working with state, IO, exceptions, and non-determinism. Probably the simplest monads to understand are lists and option types:
```haskell
instance Monad [] where
  []     >>= k = []
  (x:xs) >>= k = k x ++ (xs >>= k)
  return x     = [x]

instance Monad Maybe where
  Just x  >>= k = k x
  Nothing >>= k = Nothing
  return x      = Just x
```
where `[]` and `:` are the list constructors, `++` is the concatenation operator, and `Just` and `Nothing` are the `Maybe` constructors. Both of these monads encapsulate common and useful patterns of computation on their respective data types (note that neither has anything to do with side effects or IO).
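As a rough sketch of how those two instances behave in practice (the helper names below are mine, not from the answer):

```haskell
-- Using the list monad: >>= maps the function over each element
-- and concatenates the resulting lists.
pairsWithDouble :: [Int] -> [Int]
pairsWithDouble xs = xs >>= \x -> [x, x * 2]

-- Using the Maybe monad: Nothing short-circuits a chain of >>=.
halveEven :: Int -> Maybe Int
halveEven n = if even n then Just (n `div` 2) else Nothing

-- pairsWithDouble [1,2]       == [1,2,2,4]
-- (halveEven 8 >>= halveEven) == Just 2
-- (halveEven 6 >>= halveEven) == Nothing
```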
You really have to play around writing some non-trivial Haskell code to appreciate what monads are about and why they are useful.
This excellent video with Brian Beckman explains monads 'in terms you already know', and Brian assures you that you don't have to be scared of monads because of the way they look, because they are easy. I found his approach very educational and a good introduction to monads. Check it out.
If I've understood correctly, IEnumerable is derived from monads. I wonder if that might be an interesting angle of approach for those of us from the C# world?
For what it's worth, here are some links to tutorials that helped me (and no, I still haven't understood what monads are).
Actually, contrary to the common understanding of monads, they have nothing to do with state. Monads are simply a way of wrapping things that provides methods to do operations on the wrapped stuff without unwrapping it.
For example, you can create a type to wrap another one, in Haskell:
```haskell
data Wrapped a = Wrap a
```
To wrap stuff we define

```haskell
return :: a -> Wrapped a
return x = Wrap x
```
To perform operations without unwrapping, say you have a function `f :: a -> b`; then you can do this to lift that function to act on wrapped values:

```haskell
fmap :: (a -> b) -> (Wrapped a -> Wrapped b)
fmap f (Wrap x) = Wrap (f x)
```
That's about all there is to understand. However, it turns out that there is a more general function to do this lifting, which is `bind`:

```haskell
bind :: (a -> Wrapped b) -> (Wrapped a -> Wrapped b)
bind f (Wrap x) = f x
```
`bind` can do a bit more than `fmap`, but not vice versa. Actually, `fmap` can be defined only in terms of `bind` and `return`. So, when defining a monad, you give its type (here it was `Wrapped a`) and then say how its `return` and `bind` operations work.
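To make that last claim concrete, here is a sketch of `fmap` defined purely in terms of `bind` and `return` for the `Wrapped` type above (I rename `return` to `wrap` and `fmap` to `fmapW` to avoid clashing with the Prelude):

```haskell
data Wrapped a = Wrap a deriving (Show, Eq)

-- The answer's `return`, renamed to avoid clashing with the Prelude.
wrap :: a -> Wrapped a
wrap = Wrap

bind :: (a -> Wrapped b) -> Wrapped a -> Wrapped b
bind f (Wrap x) = f x

-- fmap falls out of bind and wrap: lift f into the wrapped world,
-- then bind it.
fmapW :: (a -> b) -> Wrapped a -> Wrapped b
fmapW f = bind (wrap . f)

-- fmapW (+1) (Wrap 41) == Wrap 42
```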
The cool thing is that this turns out to be such a general pattern that it pops up all over the place, encapsulating state in a pure way is only one of them.
For a good article on how monads can be used to introduce functional dependencies and thus control order of evaluation, like it is used in Haskell's IO monad, check out IO Inside.
As for understanding monads, don't worry too much about it. Read whatever you find interesting about them and don't worry if you don't understand right away. Then just diving into a language like Haskell is the way to go. Monads are one of those things where understanding trickles into your brain by practice; one day you just suddenly realize you understand them.
[Disclaimer: I am still trying to fully grok monads. The following is just what I have understood so far. If it’s wrong, hopefully someone knowledgeable will call me on the carpet.]
Arnar wrote:

> Monads are simply a way of wrapping things that provides methods to do operations on the wrapped stuff without unwrapping it.
That’s precisely it. The idea goes like this:
1. You take some kind of value and wrap it with some additional information. Just like the value is of a certain kind (e.g. an integer or a string), so the additional information is of a certain kind. F.ex., that extra information might be a `Maybe` or an `IO`.

2. Then you have some operators that allow you to operate on the wrapped data while carrying along that additional information. These operators use the additional information to decide how to change the behaviour of the operation on the wrapped value.
F.ex., a `Maybe Int` can be a `Just Int` or `Nothing`. Now, if you add a `Maybe Int` to a `Maybe Int`, the operator will check to see if they are both `Just Int`s inside, and if so, will unwrap the `Int`s, pass them to the addition operator, re-wrap the resulting `Int` into a new `Just Int` (which is a valid `Maybe Int`), and thus return a `Maybe Int`. But if one of them was a `Nothing` inside, this operator will just immediately return `Nothing`, which again is a valid `Maybe Int`. That way, you can pretend that your `Maybe Int`s are just normal numbers and perform regular math on them. If you were to get a `Nothing`, your equations will still produce the right result – without you having to litter checks for `Nothing` everywhere.
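The addition just described can be sketched directly with `>>=` (the function name `addMaybe` is mine):

```haskell
addMaybe :: Maybe Int -> Maybe Int -> Maybe Int
addMaybe mx my =
  mx >>= \x ->    -- unwrap the first Int, or short-circuit on Nothing
  my >>= \y ->    -- unwrap the second Int, or short-circuit on Nothing
  return (x + y)  -- re-wrap the sum as a Just

-- addMaybe (Just 2) (Just 3) == Just 5
-- addMaybe (Just 2) Nothing  == Nothing
```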
But the example is just what happens for `Maybe`. If the extra information was an `IO`, then that special operator defined for `IO`s would be called instead, and it could do something totally different before performing the addition. (OK, adding two `IO Int`s together is probably nonsensical – I'm not sure yet.) (Also, if you paid attention to the `Maybe` example, you will have noticed that "wrapping a value with extra stuff" is not always correct. But it's hard to be exact, correct and precise without being inscrutable.)
Basically, “monad” roughly means “pattern”. But instead of a book full of informally explained and specifically named Patterns, you now have a language construct – syntax and all – that allows you to declare new patterns as things in your program. (The imprecision here is all the patterns have to follow a particular form, so a monad is not quite as generic as a pattern. But I think that’s the closest term that most people know and understand.)
And that is why people find monads so confusing: because they are such a generic concept. To ask what makes something a monad is as vague as asking what makes something a pattern.
But think of the implications of having syntactic support in the language for the idea of a pattern: instead of having to read the Gang of Four book and memorise the construction of a particular pattern, you just write code that implements this pattern in an agnostic, generic way once and then you are done! You can then reuse this pattern, like Visitor or Strategy or Façade or whatever, just by decorating the operations in your code with it, without having to re-implement it over and over!
So that is why people who understand monads find them so useful: it’s not some ivory tower concept that intellectual snobs pride themselves on understanding (OK, that too of course, teehee), but actually makes code simpler.
You should first understand what a functor is. Before that, understand higher-order functions.
A higher-order function is simply a function that takes a function as an argument.
A functor is any type construction `T` for which there exists a higher-order function, call it `map`, that transforms a function of type `a -> b` (given any two types `a` and `b`) into a function `T a -> T b`. This `map` function must also obey the laws of identity and composition such that the following expressions return true for all `x`, `p`, and `q` (Haskell notation):
```haskell
map (\x -> x) x == x
map (p . q) x == map p (map q x)
```
For example, a type constructor called `List` is a functor if it comes equipped with a function of type `(a -> b) -> List a -> List b` which obeys the laws above. The only practical implementation is obvious. The resulting `List a -> List b` function iterates over the given list, calling the `(a -> b)` function for each element, and returns the list of the results.
A monad is essentially just a functor `T` with two extra methods, `join`, of type `T (T a) -> T a`, and `unit` (sometimes called `return`, `fork`, or `pure`) of type `a -> T a`. For lists in Haskell:

```haskell
join :: [[a]] -> [a]
pure :: a -> [a]
```
Why is that useful? Because you could, for example, `map` over a list with a function that returns a list. `join` takes the resulting list of lists and concatenates them. `List` is a monad because this is possible.

You can write a function that does `map`, then `join`. This function is called `bind`, or `flatMap`, or `(>>=)`, or `(=<<)`. This is normally how a monad instance is given in Haskell.
There are some additional properties that a monad must have, but this is basically all there is to it.
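The relationship between `map`, `join`, and `bind` for lists can be sketched like this (using the Prelude's `concat` as `join`; the `L` suffixes are mine to avoid name clashes):

```haskell
-- join for lists is concatenation of a list of lists.
joinL :: [[a]] -> [a]
joinL = concat

-- bind is just map followed by join.
bindL :: [a] -> (a -> [b]) -> [b]
bindL xs f = joinL (map f xs)

-- bindL [1,2,3] (\x -> [x, x*10]) == [1,10,2,20,3,30]
```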
First: The term monad is a bit vacuous if you are not a mathematician. An alternative term is computation builder, which is a bit more descriptive of what they are actually useful for.
You ask for practical examples:
Example 1: List comprehension:
```haskell
[x*2 | x <- [1..10], odd x]
```
This expression returns the doubles of all odd numbers in the range from 1 to 10. Very useful!
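The comprehension above can be read as a chain of operations in the list monad; a sketch of the standard monadic reading (not literally what the compiler emits):

```haskell
-- The comprehension [x*2 | x <- [1..10], odd x], desugared by hand:
-- draw each x from the list, keep it only if odd, then double it.
doubledOdds :: [Int]
doubledOdds = [1..10] >>= \x -> if odd x then [x * 2] else []

-- doubledOdds == [2,6,10,14,18]
```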
Example 2: Input/Output:
```haskell
do
  putStrLn "What is your name?"
  name <- getLine
  putStrLn ("Welcome, " ++ name ++ "!")
```
Both examples use monads, aka computation builders. The common theme is that the monad chains operations in some specific, useful way. In the list comprehension, the operations are chained such that if an operation returns a list, then the following operations are performed on every item in the list. The IO monad OTOH performs the operations sequentially, but passes a "hidden variable" along, which represents "the state of the world", which allows us to write IO code in a pure functional manner.
It turns out that the pattern of chaining operations is quite useful and is used for lots of different things in Haskell.
Another example is exceptions: using the `Error` monad, operations are chained so that they are performed sequentially, except that if an error is thrown, the rest of the chain is abandoned.
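That short-circuiting chain can be sketched with the `Either` monad, a modern stand-in for `Error` (the `safeDiv` and `calc` helpers are mine):

```haskell
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- Each step runs only if the previous one produced a Right;
-- a Left abandons the rest of the chain.
calc :: Int -> Int -> Int -> Either String Int
calc a b c = safeDiv a b >>= \x -> safeDiv x c

-- calc 100 5 2 == Right 10
-- calc 100 0 2 == Left "division by zero"
```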
Both the list-comprehension syntax and the do-notation are syntactic sugar for chaining operations using the `>>=` operator. A monad is basically just a type that supports the `>>=` operator.
Example 3: A parser
This is a very simple parser which parses either a quoted string or a number:
```haskell
parseExpr = parseString <|> parseNumber

parseString = do
  char '"'
  x <- many (noneOf "\"")
  char '"'
  return (StringValue x)

parseNumber = do
  num <- many1 digit
  return (NumberValue (read num))
```
The operations `char`, `digit`, etc. are pretty simple; they either match or don't match. The magic is the monad which manages the control flow: the operations are performed sequentially until a match fails, in which case the monad backtracks to the latest `<|>` and tries the next option. Again, a way of chaining operations with some additional, useful semantics.
Example 4: Asynchronous programming
The above examples are in Haskell, but it turns out F# also supports monads. This example is stolen from Don Syme:
```fsharp
let AsyncHttp(url:string) =
    async { let req = WebRequest.Create(url)
            let! rsp = req.GetResponseAsync()
            use stream = rsp.GetResponseStream()
            use reader = new System.IO.StreamReader(stream)
            return reader.ReadToEnd() }
```
This method fetches a web page. The punch line is the use of `GetResponseAsync` - it actually waits for the response on a separate thread, while the main thread returns from the function. The last three lines are executed on the spawned thread when the response has been received.

In most other languages you would have to explicitly create a separate function for the lines that handle the response. The `async` monad is able to "split" the block on its own and postpone the execution of the latter half. (The `async {}` syntax indicates that the control flow in the block is defined by the `async` monad.)
How they work
So how can a monad do all these fancy control-flow things? What actually happens in a do-block (or a computation expression, as they are called in F#) is that every operation (basically every line) is wrapped in a separate anonymous function. These functions are then combined using the `bind` operator (spelled `>>=` in Haskell). Since the `bind` operation combines functions, it can execute them as it sees fit: sequentially, multiple times, in reverse, discarding some, executing some on a separate thread when it feels like it, and so on.
As an example, this is the expanded version of the IO-code from example 2:
```haskell
putStrLn "What is your name?"
  >>= (\_ -> getLine)
  >>= (\name -> putStrLn ("Welcome, " ++ name ++ "!"))
```
This is uglier, but it's also more obvious what is actually going on. The `>>=` operator is the magic ingredient: it takes a value (on the left side) and combines it with a function (on the right side) to produce a new value. This new value is then taken by the next `>>=` operator and again combined with a function to produce a new value. `>>=` can be viewed as a mini-evaluator.
Note that `>>=` is overloaded for different types, so every monad has its own implementation of `>>=`. (All the operations in the chain have to be of the type of the same monad, though; otherwise the `>>=` operator won't work.)
The simplest possible implementation of `>>=` just takes the value on the left, applies it to the function on the right, and returns the result; but as said before, what makes the whole pattern useful is when there is something extra going on in the monad's implementation of `>>=`.
There is some additional cleverness in how the values are passed from one operation to the next, but this requires a deeper explanation of the Haskell type system.
Summing up
In Haskell terms, a monad is a parameterized type which is an instance of the `Monad` type class, which defines `>>=` along with a few other operators. In layman's terms, a monad is just a type for which the `>>=` operation is defined.
In itself `>>=` is just a cumbersome way of chaining functions, but with the presence of the do-notation, which hides the "plumbing", the monadic operations turn out to be a very nice and useful abstraction, useful in many places in the language, and useful for creating your own mini-languages in the language.
Why are monads hard?
For many Haskell learners, monads are an obstacle they hit like a brick wall. It's not that monads themselves are complex, but that the implementation relies on many other advanced Haskell features like parameterized types, type classes, and so on. The problem is that Haskell IO is based on monads, and IO is probably one of the first things you want to understand when learning a new language - after all, it's not much fun to create programs which don't produce any output. I have no immediate solution for this chicken-and-egg problem, except treating IO like "magic happens here" until you have enough experience with other parts of the language. Sorry.
I've been thinking of Monads in a different way, lately. I've been thinking of them as abstracting out execution order in a mathematical way, which makes new kinds of polymorphism possible.
If you're using an imperative language, and you write some expressions in order, the code ALWAYS runs exactly in that order.
And in the simple case, when you use a monad, it feels the same -- you define a list of expressions that happen in order. Except that, depending on which monad you use, your code might run in order (like in the IO monad), in parallel over several items at once (like in the List monad), it might halt partway through (like in the Maybe monad), it might pause partway through to be resumed later (like in a Resumption monad), it might rewind and start from the beginning (like in a Transaction monad), or it might rewind partway to try other options (like in a Logic monad).
And because monads are polymorphic, it's possible to run the same code in different monads, depending on your needs.
Plus, in some cases, it's possible to combine monads together (with monad transformers) to get multiple features at the same time.
In addition to the excellent answers above, let me offer you a link to the following article (by Patrick Thomson) which explains monads by relating the concept to the JavaScript library jQuery (and its way of using "method chaining" to manipulate the DOM): jQuery is a Monad
The jQuery documentation itself doesn't refer to the term "monad" but talks about the "builder pattern", which is probably more familiar. This doesn't change the fact that you have a proper monad there, maybe without even realizing it.
Monads are to control flow what abstract data types are to data.
In other words, many developers are comfortable with the idea of Sets, Lists, Dictionaries (or Hashes, or Maps), and Trees. Within those data types there are many special cases (for instance InsertionOrderPreservingIdentityHashMap).
However, when confronted with program "flow" many developers haven't been exposed to many more constructs than if, switch/case, do, while, goto (grr), and (maybe) closures.
So, a monad is simply a control flow construct. A better phrase to replace monad would be 'control type'.
As such, a monad has slots for control logic, or statements, or functions - the equivalent in data structures would be to say that some data structures allow you to add data, and remove it.
For example, the "if" monad:
if( clause ) then block

at its simplest has two slots - a clause and a block. The if monad is usually built to evaluate the result of the clause and, if not false, evaluate the block. Many developers are not introduced to monads when they learn 'if', and it just isn't necessary to understand monads to write effective logic.
Monads can become more complicated, in the same way that data structures can become more complicated, but there are many broad categories of monad that may have similar semantics, but differing implementations and syntax.
Of course, in the same way that data structures may be iterated over, or traversed, monads may be evaluated.
Compilers may or may not have support for user-defined monads. Haskell certainly does. Ioke has some similar capabilities, although the term monad is not used in the language.
If you can read ML syntax, a short, accessible explanation with practical, simple code is here.