+4  A: 

You are correct as to why it hangs--you've created a circular dependency that it can't resolve. Computing the current element requires a later element that can't be computed until the current one is computed, blah blah, around in circles we go.

As for why it doesn't produce an exception, try compiling it instead of running it in GHCi:

$ ghc --make Loop.hs
$ ./Loop.exe
Loop.exe: <<loop>>

I'm assuming this is a NonTermination exception. Pff, Halting Problem? Ha.

Not everything works the way you'd like or expect it to when done in GHCi. If something seems odd, try compiling a small example and see if it makes more sense that way. GHCi using different rules for type defaulting trips me up sometimes, for instance.

camccann
+1. Real men laugh in the face of provable undecidability.
Matt Ball
+8  A: 

"Doesn't exist yet" is not quite right. We try not to think about when values are exist — denotational programming is about eternal unchanging values and equations.

More concretely, this code is fine:

Prelude> let x = [(x !! 1) + 1, 3] in x
[4,3]

You might expect that x !! 1 doesn't exist yet. But here's how an implementation like GHC works.

When building the list f, it constructs an in-memory object (a "thunk") to represent the expression (x !! 1) + 1. No evaluation of x has occurred yet. It wraps a pointer to this thunk, plus a pointer to 3, into a linked list and hands it off to GHCi's implicit show.

Now, the show instance for lists has to show the elements one by one. show will demand ("force") the evaluation of the thunk for (x !! 1) + 1, which causes that code to be "entered". By the definition of (+), forcing (x !! 1) + 1 forces x !! 1, which in turn forces two things:

  • the spine of the list (its cons cells) through the second cell, and
  • the value of the second element (the "car" of the second cell, in Lisp terms), because (+) wants to operate on that value.

And the second value is present by now — it's 3. If it were another thunk we'd force that, and so on. See this blog post for an interesting view on self-referential containers.

Now, how does the compiled GHC code detect an infinite loop in your other example? When we enter a thunk, we need to remember to come back later and overwrite it with the final value. This is what's meant specifically by "lazy evaluation" as opposed to "non-strict semantics", and it prevents us from duplicating work.

Anyway, as an optimization when entering a thunk, GHC's runtime will first overwrite it with another object called a "blackhole". We'll come back later and overwrite the blackhole with the final value. But what happens if we enter a blackhole before doing so? That means that evaluating x requires first evaluating x, an unresolvable loop. So entering a blackhole throws an exception.

keegan
yeah i would expect the example you gave to work, even before reading the description. nice step-by-step, though! now without the black hole thing, lemme try to see why it runs into an infinite loop. so the result of `f = f !! 10 : f` when expanded would be a list where each element is equal to the value of the 11th element. At some point we have to compute the value of the 11th element. This requires the computation of the 11th element, which requires the computation of the 11th element, etc. etc. is that where the loop is getting stuck?
Claudiu
Yes, you've got it. :)
keegan
minor nitpick: in Lisp terms, the value of the second element in a cons-list would be the cadr wouldn't it? the car would be the tail I think.
mokus
@mokus I meant the car of that particular cons cell. Edited for clarity.
keegan
+23  A: 

In this case, a picture may tell a thousand words.

First, remember how cons (the (:) list constructor) works. It's a pair of two things: an element, and a reference to a list tail (which is either another cons, or []).

As you should know, when you say [1, 2, 3], it's just a shortcut for (1:(2:(3:[]))) or 1:2:3:[]. If you visualize each cons pair as a box with two slots, this expression look like:

┌───┬──┐  ┌───┬──┐  ┌───┬──┐  ┌────┐
│ 1 │ ─┼─>│ 2 │ ─┼─>│ 3 │ ─┼─>│ [] │
└───┴──┘  └───┴──┘  └───┴──┘  └────┘

One loop

When you say g = 4 : g, you're not really building an "infinite" list, you're building a circular list: g is defined as a cons whose tail reference simply points back to g itself:

┌──────────┐
│ ┌───┬──┐ │
└>│ 4 │ ─┼─┘
  └───┴──┘

This actually has nothing to do with laziness, and everything with self-reference: for example, you can do the same thing in (eager) Common Lisp using syntax like '#1=(4 . #1#) (where the #1 is like g).

Whether you say g !! 0, or g !! 1000000000000, g never grows: (!!) simply runs around the loop in-place, for as many times as you told it to, until it exhausts itself and returns the element, 4.

Two loops

When you say f = (f !! 10) : f, the same thing happens—except now, the element slot contains a different expression than 4:

┌──────────┐
│ ┌───┬──┐ │
└>│ ╷ │ ─┼─┘
  └─┼─┴──┘
    │
    │ ┌───────────┐
    └>│ (f !! 10) │
      └───────────┘

Crucially, this sub-expression also refers back to f, just like the tail does:

 ┌──────────┐
 │ ┌───┬──┐ │
┌┴>│ ╷ │ ─┼─┘
│  └─┼─┴──┘
│    │
│    │ ┌───────────┐
│    └>│ (f !! 10) │
│      └──┼────────┘
└─────────┘

So, when you ask for f !! n, (!!) will first run around the top loop n times, then return the element, as it did for g. However, instead of escaping the loop, (f !! 10) just re-enters it, and the process repeats itself: looping 10 times around the top, then once around the bottom, and back.

Piet Delport
I applaud your ASCII artistry
MatrixFrog
ohh heh i didn't realize it would be so easy if I thought of it in terms of pointers. and i didn't realize it was just circularity... so I guess the laziness comes in not trying to evaluate `f !! 10` right off the bat (only when I do `f !! 0` for example).
Claudiu
Claudiu: Right, that's where the laziness comes in. In that sense, saying `f = (f !! 10) : f` is for all intents and purposes the same as saying `f = undefined : f`.
Piet Delport