Disclaimer: I can't promise the following is a good way to go about it, but working through it sounds fun. Let's take it for a spin, shall we?.
A few obligatory imports
First, let's toss some data types out there. I'm going to fill in some details and tweak things a bit, in order to define a simple "file system" that we can actually interact with.
type Path = String
type Response = Maybe String
type Contents = [String]
data Operation = Cd Path
| Ls
| MkDir Path
| Quit
deriving (Read, Show)
Next, we'll do something a bit edgy... strip out all the monads. What? This is madness! Perhaps, but sometimes all the hidden plumbing that >>=
provides hides things just a bit too much.
For the file system itself, we'll just store the current working directory and a map from paths to their children. We'll also need a handful of functions to interact with it.
data Filesystem = Filesystem { wd :: Path, files :: M.Map Path Contents }
deriving Show
newFS = Filesystem "/" (M.singleton "/" [])
isDirectory p fs = M.member p $ files fs
children p fs = fromMaybe [] . M.lookup p $ files fs
cd p fs = fs { wd = p }
create p fs = let newPath = wd fs ++ p ++ "/"
addPath = M.insert newPath [] . M.adjust (p:) (wd fs)
in (newPath, fs { files = addPath $ files fs })
Now for a monad-less version of the step
function. It needs to take an Operation
and a Filesystem
, and return a Response
and a (possibly modified) Filesystem
:
step :: Operation -> Filesystem -> (Response, Filesystem)
step (Cd d) fs = (Just "Ok\n", cd d fs)
step (MkDir d) fs = first (\d -> Just $ "Created " ++ d ++ "\n") $ create d fs
step Ls fs = let files = children (wd fs) fs
in (Just $ unlines files, fs)
step Quit fs = (Nothing, fs)
...hmm, that type signature already looks a lot like the guts of a State
monad. Oh well, just ignore it for now, and charge blindly onward.
Now, what we want is a function that will provide a general-purpose interface to a Filesystem
interpreter. Particularly, we want the interface to be at least somewhat self-contained so that whatever uses the interface doesn't have to step through manually, yet we want the interface to be sufficiently oblivious to the code using it that we can wire it up to the IO
monad, some other Monad
, or even no monad at all.
What this tells us primarily is that we'll need to interleave the external code with the interpreter in some fashion, rather than having either part be in control. Now, Haskell is a functional language, so that means that using lots of higher-order functions is good, right? Sounds plausible to me, so here's the strategy we'll use: If a function doesn't know what to do next, we'll hand it another function that we assume does. Repeat until everybody knows what's going on. A flawless plan, no?
The heart of it all is the step
function, so we'll start by just calling that.
interp1 :: Operation -> Filesystem -> (Response, Filesystem)
interp1 op fs = step op fs
...well, it's a start. I guess. But wait, where is the Operation
coming from? We need the external code to provide that, but we can't just ask for it without getting all mixed up with unsavory characters like IO
. So we get another function to do the dirty work for us:
interp2 :: ((Operation -> (Response, Filesystem)) -> t) -> Filesystem -> t
interp2 inp fs = inp (\op -> step op fs)
Of course, now all we have is some stupid t
that we don't even know what it is. We know it has to have a Response
and a Filesystem
in it somewhere, but we can't do anything with it, so we'll hand it back to another function, along with some instructions on how to proceed... which will of course involve passing in yet more functions. It's functions all the way down, you know.
interp3 :: ((Operation -> (Response, Filesystem)) -> a)
-> (a -> ((Response, Filesystem) -> b) -> c)
-> (Filesystem -> b)
-> (String -> Filesystem -> b)
-> Filesystem
-> c
interp3 inp check done out fs = check (inp (\op -> step op fs)) test
where test (Nothing, fs) = done fs
test (Just s, fs) = out s fs
...well that's pretty ugly. But don't worry, all is going according to plan. We can make a couple observations next:
- The type
a
only exists between inp
and check
, so in hindsight, we might as well combine them ahead of time and just pass the composed function to the interpreter.
- When we call
done
, it ought to mean exactly what it says on the tin. So the return type for done
should be the same as the whole interpreter, meaning b
and c
ought to be the same type.
Now, if done
ends the whole thing, what's out
? As the name none-too-subtly implies, it's providing output to the external code, but where does it go after that? It needs to loop back into the interpreter somehow, and we might note that our interpreter is not yet recursive. The way forward is clear--the interpreter, like Jormungand, thus seizes its own tail; looping back around indefinitely till the interpretation finishes (or until Ragnarök, whichever comes first).
interp4 :: ((Operation -> (Response, Filesystem))
-> ((Response, Filesystem) -> r) -> r)
-> (Filesystem -> r)
-> (String -> Filesystem -> (Filesystem -> r) -> r)
-> Filesystem
-> r
interp4 checkInp done out fs = checkInp (\op -> step op fs) test
where loop = interp4 checkInp done out
test (Nothing, fs) = done fs
test (Just s, fs) = out s fs loop
...oh, did I mention that it works now? No, seriously!
Here's some IO
code to use the interface:
ioIn f k = putStr "> " >> (k . f =<< readLn)
ioDone fs = putStrLn "Done" >> return fs
ioOut x fs k = putStr x >> k fs
ioInterp :: IO Filesystem
ioInterp = interp4 ioIn ioDone ioOut newFS
And here's code that runs a list of commands, producing a list of output strings:
scriptIn f k (x:xs) = k (f x) xs
scriptDone fs xs = ["Done\n"]
scriptOut r fs k xs = r : k fs xs
scriptInterp :: [Operation] -> [String]
scriptInterp = interp4 scriptIn scriptDone scriptOut newFS
Examples of running both in GHCi here, if just the code doesn't tickle your imagination sufficiently.
Well, that's that. Or is it? Frankly, that interpreter is code only a mother could love. Is there something that would tie it all together elegantly? Something to reveal the underlying structure of the code?
...okay, so it's pretty obvious where this leads. The overall design of functions tail-calling each other in circles looks an awful lot like continuation-passing style, and not once but twice in the interpreter's type signature can be found the characteristic pattern (foo -> r) -> r
, better known as the continuation monad.
Unfortunately, even after all that, continuations make my head hurt and I'm not sure how best to disentangle the very ad-hoc structure of the interpreter into a computation running in a MonadCont
.