views:

2156

answers:

3

I am always interested in learning new languages, a fact that keeps me on my toes and makes me (I believe) a better programmer. My attempts at conquering Haskell come and go - twice so far - and I decided it was time to try again. 3rd time's the charm, right?

Nope. I re-read my old notes... and get disappointed :-(

The problem that made me lose faith last time, was an easy one: permutations of integers. i.e. from a list of integers, to a list of lists - a list of their permutations:

[int] -> [[int]]

This is in fact a generic problem, so replacing 'int' above with 'a', would still apply.

From my notes:

I code it first on my own, I succeed. Hurrah!

I send my solution to a good friend of mine - Haskell guru, it usually helps to learn from gurus - and he sends me this, which I am told, "expresses the true power of the language, the use of generic facilities to code your needs". All for it, I recently drank the kool-aid, let's go:

permute :: [a] -> [[a]]
permute = foldr (concatMap.ins) [[]]
   where ins x []     = [[x]]
         ins x (y:ys) = (x:y:ys):[ y:res | res <- ins x ys]

Hmm. Let's break this down:

bash$ cat b.hs
ins x []     = [[x]]
ins x (y:ys) = (x:y:ys):[ y:res | res <- ins x ys]

bash$ ghci
Prelude> :load b.hs
[1 of 1] Compiling Main             ( b.hs, interpreted )
Ok, modules loaded: Main.

*Main> ins 1 [2,3]
[[1,2,3],[2,1,3],[2,3,1]]

OK, so far, so good. Took me a minute to understand the second line of "ins", but OK: It places the 1st arg in all possible positions in the list. Cool.

Now, to understand the foldr and concatMap. in "Real world Haskell", the DOT was explained...

(f . g) x

...as just another syntax for...

f (g x) 

And in the code the guru sent, DOT was used from a foldr, with the "ins" function as the fold "collapse":

*Main> let g=concatMap . ins
*Main> g 1 [[2,3]]
[[1,2,3],[2,1,3],[2,3,1]]

OK, since I want to understand how the DOT is used by the guru, I try the equivalent expression according to the DOT definition, (f . g) x = f (g x) ...

*Main> concatMap (ins 1 [[2,3]])

<interactive>:1:11:
     Couldn't match expected type `a -> [b]'
            against inferred type `[[[t]]]'
     In the first argument of `concatMap', namely `(ins 1 [[2, 3]])'
     In the expression: concatMap (ins 1 [[2, 3]])
     In the definition of `it': it = concatMap (ins 1 [[2, 3]])

What!?! Why? OK, I check concatMap's signature, and find that it needs a lambda and a list, but that's just a human thinking; how does GHC cope? According to the definition of DOT above...

(f.g)x = f(g x), 

...what I did was valid, replace-wise:

(concatMap . ins) x y = concatMap (ins x y)

Scratching head...

*Main> concatMap (ins 1) [[2,3]]
[[1,2,3],[2,1,3],[2,3,1]]

So... The DOT explanation was apparently too simplistic... DOT must be somehow clever enough to understand that we in fact wanted "ins" to get curri-ed away and "eat" the first argument - thus becoming a function that only wants to operate on [t] (and "intersperse" them with '1' in all possible positions).

But where was this specified? How did GHC knew to do this, when we invoked:

*Main> (concatMap . ins) 1 [[2,3]]
[[1,2,3],[2,1,3],[2,3,1]]

Did the "ins" signature somehow conveyed this... "eat my first argument" policy?

*Main> :info ins
ins :: t -> [t] -> [[t]]        -- Defined at b.hs:1:0-2

I don't see nothing special - "ins" is a function that takes a 't', a list of 't', and proceeds to create a list with all "interspersals". Nothing about "eat your first argument and curry it away".

So there... I am baffled. I understand (after an hour of looking at the code!) what goes on, but... God almighty... Perhaps GHC makes attempts to see how many arguments it can "peel off"?

  let's try with no argument "curried" into "ins",
  oh gosh, boom, 
  let's try with one argument "curried" into "ins",
  yep, works,
  that must be it, proceed)

Again - yikes...

And since I am always comparing the languages I am learning with what I already know, how would "ins" look in Python?

a=[2,3]
print [a[:x]+[1]+a[x:] for x in xrange(len(a)+1)]

[[1, 2, 3], [2, 1, 3], [2, 3, 1]]

Be honest, now... which is simpler?

I mean, I know I am a newbie in Haskell, but I feel like an idiot... Looking at 4 lines of code for an hour, and ending up assuming that the compiler... tries various interpretations until it finds something that "clicks"?

To quote from Lethal weapon, "I am too old for this shit"...

+64  A: 
(f . g) x = f (g x)

This is true. You concluded from that that

(f . g) x y = f (g x y)

must also be true, but that is not the case. In fact, the following is true:

(f . g) x y = f (g x) y

which is not the same.

Why is this true? Well (f . g) x y is the same as ((f . g) x) y and since we know that (f . g) x = f (g x) we can reduce that to (f (g x)) y, which is again the same as f (g x) y.

So (concatMap . ins) 1 [[2,3]] is equivalent to concatMap (ins 1) [[2,3]]. There is no magic going on here.

Another way to approach this is via the types:

. has the type (b -> c) -> (a -> b) -> a -> c, concatMap has the type (x -> [y]) -> [x] -> [y], ins has the type t -> [t] -> [[t]]. So if we use concatMap as the b -> c argument and ins as the a -> b argument, then a becomes t, b becomes [t] -> [[t]] and c becomes [[t]] -> [[t]] (with x = [t] and y = [t]).

So the type of concatMap . ins is t -> [[t]] -> [[t]], which means a function taking a whatever and a list of lists (of whatevers) and returning a list of lists (of the same type).

sepp2k
Oh how I wish the guru had responded the way you did. I asked him, of course - and he confirmed that Haskell indeed tried combinations to see what works! ... OK, you win, here I go for the 3rd time, diving in... (next time, btw, I'll ask Stack Overflow :-)
ttsiodras
Haskell doesn't "try combinations", it mechanically follows the definition. What is the definition? You can use hoogle and haddock to find the source really fast: `(.) f g x = f (g x)`. So yes, just ONE argument.
TomMD
@TomMD: Me a newbie: what do you mean, about Hoogle, exactly? I found it, typed the expression in, an error came up. Elaborate?
ttsiodras
@tts: If you enter `.` into hoogle you get a list of functions named `.`. If you click the first result, you'll get to the docs for the standard (prelude) `.` function. If you click the "source" link there, you'll see the definition of `.` in the standard library, which is `(.) f g x = f (g x)` (which is just another way of writing `(f . g) x = f (g x)`).
sepp2k
@ttsiodras: As sepp2k said, search for [(.)](http://haskell.org/hoogle/?hoogle=%28.%29), click on the [search hit](http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Prelude.html#v:.), click on where the haddock documentation says [source](http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/src/GHC-Base.html#.).
TomMD
@ttsiodras: I have to say, either there was some sort of miscommunication when you asked him, or your friend isn't really much of a "Haskell guru"...
camccann
@sepp2k: one small error in your last paragraph: the type of concatMap is `t -> [[t]] -> [[t]]` . Otherwise, very nice explanation.
ja
@ja: Bah, of course. Fixed it.
sepp2k
+9  A: 

I'd like to add my two cents. The question and answer make it sound like . is some magical operator that does strange things with re-arranging function calls. That's not the case. . is just function composition. Here's an implementation in Python:

def dot(f, g):
    def result(arg):
        return f(g(arg))
    return result

It just creates a new function which applies g to an argument, applies f to the result, and returns the result of applying f.

So (concatMap . ins) 1 [[2, 3]] is saying: create a function, concatMap . ins, and apply it to the arguments 1 and [[2, 3]]. When you do concatMap (ins 1 [[2,3]]) you're instead saying, apply the function concatMap to the result of applying ins to 1 and [[2, 3]] - completely different, as you figured out by Haskell's horrendous error message.

UPDATE: To stress this even further. You said that (f . g) x was another syntax for f (g x). This is wrong! . is just a function, as functions can have non-alpha-numeric names (>><, .., etc., could also be function names).

Claudiu
I don't think this would work the same way in python do to lack of currying.
mathepic
you're right, I think.. i'm assuming one-arg functions for the composition. you will have to convert multi-arg funcs into funcs that take one argument and return other funcs to get the curry effect
Claudiu
For what it's worth, `..` is an invalid identifer in Haskell (special syntax).
trinithis
@trinithis: However, `(...)` is valid, as are `(--:)`, `(:::)`, `(<--)`, `(@.@)`, and various other names containing special syntax as a substring.
camccann
+1 for the operators-are-just-functions. It is so natural for me that operators are functions that I didn't realized that he thought that `.` was a special form.
gawi
+4  A: 

You're overthinking this problem. You can work it all out using simple equational reasoning. Let's try it from scratch:

permute = foldr (concatMap . ins) [[]]

This can be converted trivially to:

permute lst = foldr (concatMap . ins) [[]] lst

concatMap can be defined as:

concatMap f lst = concat (map f lst)

The way foldr works on a list is that (for instance):

-- let lst = [x, y, z]
foldr f init lst
= foldr f init [x, y, z]
= foldr f init (x : y : z : [])
= f x (f y (f z init))

So something like

permute [1, 2, 3]

becomes:

foldr (concatMap . ins) [[]] [1, 2, 3]
= (concatMap . ins) 1 
    ((concatMap . ins) 2
       ((concatMap . ins) 3 [[]]))

Let's work through the first expression:

(concatMap . ins) 3 [[]]
= (\x -> concatMap (ins x)) 3 [[]]  -- definition of (.)
= (concatMap (ins 3)) [[]]
= concatMap (ins 3) [[]]     -- parens are unnecessary
= concat (map (ins 3) [[]])  -- definition of concatMap

Now ins 3 [] == [3], so

map (ins 3) [[]] == (ins 3 []) : []  -- definition of map
= [3] : []
= [[3]]

So our original expression becomes:

foldr (concatMap . ins) [[]] [1, 2, 3]
= (concatMap . ins) 1 
    ((concatMap . ins) 2
       ((concatMap . ins) 3 [[]]))
= (concatMap . ins) 1 
    ((concatMap . ins) 2 [[3]]

Let's work through

(concatMap . ins) 2 [[3]]
= (\x -> concatMap (ins x)) 2 [[3]]
= (concatMap (ins 2)) [[3]]
= concatMap (ins 2) [[3]]     -- parens are unnecessary
= concat (map (ins 2) [[3]])  -- definition of concatMap
= concat (ins 2 [3] : [])
= concat ([[2, 3], [3, 2]] : [])
= concat [[[2, 3], [3, 2]]]
= [[2, 3], [3, 2]]

So our original expression becomes:

foldr (concatMap . ins) [[]] [1, 2, 3]
= (concatMap . ins) 1 [[2, 3], [3, 2]]
= (\x -> concatMap (ins x)) 1 [[2, 3], [3, 2]]
= concatMap (ins 1) [[2, 3], [3, 2]]
= concat (map (ins 1) [[2, 3], [3, 2]])
= concat [ins 1 [2, 3], ins 1 [3, 2]] -- definition of map
= concat [[[1, 2, 3], [2, 1, 3], [2, 3, 1]], 
          [[1, 3, 2], [3, 1, 2], [3, 2, 1]]]  -- defn of ins
= [[1, 2, 3], [2, 1, 3], [2, 3, 1], 
   [1, 3, 2], [3, 1, 2], [3, 2, 1]]

Nothing magical here. I think you may have been confused because it's easy to assume that concatMap = concat . map, but this is not the case. Similarly, it may seem like concatMap f = concat . (map f), but this isn't true either. Equational reasoning will show you why.

mvanier