ansaurus

Question

Please confirm or correct my "English interpretation" of this Haskell code snippet

Answer 1

+8 A:

The "Cons" value constructor has a single type parameter.

Nope: you've already parametrized it when you declared data List a. One effective property of this is that if I have a Nil :: List Int, I can't interchange it with a Nil :: List Char.

If you use the "Cons" value constructor, there are 2 fields which must be provided. The first required field is an instance of List. The second required field is an instance of a.

You've got it swapped: the first required field is an instance of a, the second field is an instance of List.

This chapter of Real World Haskell may be of interest.

Thanks. That is the chapter I'm on right now. So ... when the code says "Cons a (List a)", I thought the "Cons a" part of that was stating that the Cons value constructor was parameterised. They haven't yet covered the syntax for parameterised types, so I guessed that the syntax must require re-stating "a" if you intend to use a. But you're saying that's not necessary? And therefore that's not what that "a" means?

Nope. Once we declare a parameter in our type, we get to reuse it otherwise to say "that type should be used there." It's a little bit like an a -> b -> a type signature: a is parameterizing the type, but then I have to use the same a as the return value.

OK, but this is confusing. It seems that the first "a" means "the first field is an instance of a",

Nope, that is not true. It just means that the data type parametrizes over some type a.

and it ALSO means "the first field has the same value as the value they passed in for a". In other words, it specifies type AND value.

No, that is also not true.

Here's an instructive example, the syntax of which you may or may not have seen before:

foo :: Num a => a -> a

This is a fairly standard signature for a function that takes a number and does something to it and gives you another number. What I actually mean by "a number" in Haskell-speak, though, is some arbitrary type "a" that implements the "Num" class.

Thus, this parses to the English:

Let a indicate a type implementing the Num typeclass, then the signature of this method is one parameter with the type a, and the return value of the type a

Something similar happens with data.

It also occurs to me that the instance of List in the specification of Cons is also confusing you: be really careful when parsing that: whereas Cons is specifying a constructor, which is basically a pattern that Haskell is going to wrap the data into, (List a) looks like a constructor but is actually simply a type, like Int or Double. a is a type, NOT a value in any sense of the term.

Edit: In response to the most recent edit.

I think a dissection is first called for. Then I'll deal with your questions point by point.

Haskell data constructors are a little weird, because you define the constructor signature, and you don't have to make any other scaffolding. Datatypes in Haskell don't have any notion of member variable. (Note: there's an alternate syntax that this way of thinking is more amenable to, but let's ignore that for now).

Another thing is that Haskell code is dense; its type signature are like that to. So expect to see the same symbol reused in different contexts. Type inferencing also plays a big role here.

So, back to your type:

data List a = Cons a (List a)
              | Nil

I chunk this into several pieces:

data List a

This defines the name of the type, and any parameterized types that it will have later on. Note that you will only see this show up in other type signatures.

Cons a (List a) |
Nil

This is the name of the data constructor. This IS NOT a type. We can, however, pattern match for it, ala:

foo :: List a -> Bool
foo Nil = True

Notice how List a is the type in the signature, and Nil is both the data constructor and the "thing" we pattern match for.

Cons a (List a)

These are the types of the values we slot into the constructor. Cons has two entries, one is of type a, and one is of type List a.

So my question is, why does "a" in the first field mean "the type of this field is 'a' and the value of this field is 'a'", while "a" in the second field means only "the type of this field is 'List of a'"?

Simple: don't think of it as us specifying the type; think of it has Haskell is inferencing the type out of it. So, for our intents and purposes, we're simply sticking a 0 in there, and a Nil in the second section. Then, Haskell looks at our code and thinks:

Hmm, I wonder what the type of Cons 0 Nil is
Well, Cons is a data constructor for List a. I wonder what the type of List a is
Well, a is used in the first parameter, so since the first parameter is an Int (another simplification; 0 is actually a weird thing that is typeclassed as Num), so that's mean a is a Num
Hey, well, that also means that the type of Nil is List Int, even though there's nothing there that would actually say that

(Note, that's not actually how it's implemented. Haskell can do a lot of weird things while inferencing types, which is partially why the error messages suck.)

Edward Z. Yang 2009-04-22 05:04:49

Thanks. That is the chapter I'm on right now. So ... when the code says "Cons a (List a)", I thought the "Cons a" part of that was stating that the Cons value constructor was parameterised. They haven't yet covered the syntax for parameterised types, so I guessed that the syntax must require re-stating "a" if you intend to use a. But you're saying that's not necessary? And therefore that's not what that "a" means?

Charlie Flowers 2009-04-22 05:15:17

See my edit. We restate a because we're using the parameter.

Edward Z. Yang 2009-04-22 05:22:06

OK, but this is confusing. It seems that the first "a" means "the first field is an instance of a", and it ALSO means "the first field has the same value as the value they passed in for a". In other words, it specifies type AND value. But the *second* "a" specifies only *type*. It says, "The second field must be of type list-of-a". And it doesn't say anything about value. Is that right? If so, what rule determines when a means value-and-type versus type-only?

Charlie Flowers 2009-04-22 05:29:39

See my edit. When we're specifying this a "data" declaration, we're specifying a sort of pattern that, when we call the data constructor latter, the data will "slot itself into". Part of this pattern is the types of the inner bits.

Edward Z. Yang 2009-04-22 06:16:42

OK, one final bit of confusion remains. It seems that a is taken as a value for the first field. See my edited question. Thanks.

Charlie Flowers 2009-04-22 17:14:05

See my response. Type inferencing is the answer!

Edward Z. Yang 2009-04-22 18:26:26

I get it now, thanks to pretty much a combination of many answers here. "a" and "List a" are NOTHING but types. And then when I type "Cons 0 Nil", I'm providing two EXPLICIT values. Those 2 values get assigned to the 2 fields, end of story. I understand how the type inferencing "carries over" from field one to field 2, because C# generics do the same thing (though I hear Haskell is better at it). Thanks.

Charlie Flowers 2009-04-23 02:12:04

Great! Now you should mark an answer on this question ;-)

Edward Z. Yang 2009-04-23 02:34:24

Answer 2

+2 A:

Cons a (List a)

The first field of a Cons is a value of type "a". The second is a value of type "List a", i.e. a List parameterized with the same type as the parameter of the current list.

newacct 2009-04-22 05:11:27

Thanks for the response. I now know that what you say is correct, but I have questions about why that is the case. Please see my comments on the other response (from Ambush Commander). Basically, it seems the first "a" specifies type AND value, but the second "a" specifies only type. Why is that?

Charlie Flowers 2009-04-22 05:52:26

No, first `a` specifies that first parameter to `Cons` constructor is of type `a` and second one is of type `List a`.

Hynek -Pichi- Vychodil 2009-04-22 08:30:14

OK, one final bit of confusion remains. It seems that a is taken as a value for the first field. See my edited question. Thanks.

Charlie Flowers 2009-04-22 17:14:45

Answer 3

+3 A:

5 is wrong, and I'd say 6 as follows to replace both:

Cons{1} a{2} (List a){3} is a constructor called Cons (part before {1}) for a value of type List a (the data List a part) that needs two values: one of type a (the part between {1} and {2}) and one of type List a(the part between {2} and {3}).

To help you with an apparent source of confusion: in Haskell you hardly ever have to give explicit type parameters - type inference infers the types from your values. So in a sense, yes, when you pass in a value to a function or constructor, you also specify a type, viz. the type of the passed-in value.

Kurt Schelfthout 2009-04-22 06:23:02

OK, one final bit of confusion remains. It seems that a is taken as a value for the first field. See my edited question. Thanks.

Charlie Flowers 2009-04-22 17:15:33

Answer 4

+2 A:

Yeah the data syntax is a little confusing, because it puns names and types and doesn't really make a syntactic distinction between them. In particular, in a constructor definition like:

Cons a (List a)

The first word is the name of the constructor; every other word is the name of some predeclared type. So both a and List a are already in scope (the a was brought in to scope by the a in "data List a", and you're saying that those are the types of the parameters. Their role might be better demonstrated by stating the same thing using record syntax:

Cons { head :: a, tail :: List a }

I.e. a value of type List Int, if it was constructed with the Cons constructor, has two fields: an Int and a List Int. If it was constructed with Nil, it has no fields.

luqui 2009-04-22 14:58:31

OK, one final bit of confusion remains. It seems that a is taken as a value for the first field. See my edited question. Thanks.

Charlie Flowers 2009-04-22 17:16:20

Answer 5

+2 A:

When I type "Cons 0 Nil", it uses the "Cons" value constructor to create an instance of List. From 0, it learns that the type parameter is "Integer". So far, no confusion.

However, it also determines that the value of the first field of the Cons is 0. Yet it determines nothing about the value of the second field ... it only determines that the second field has a type of "List Integer".

No, it determines that the second field's value is Nil. Given your definition, Nil is a value of the type List a. Therefore so is Cons 0 Nil. And in Cons 1 it the second field's value is it; i.e., Cons 0 Nil. This is precisely what the REPL shows: Cons 1 (Cons 0 Nil).

Alexey Romanov 2009-04-22 18:19:50

I am starting to see it, from your answer and the other ones here. The values must always be *explicitly* supplied (at least as far as I know :). The *types* are what can be inferred. So "a" and "List a" specify nothing but the *type*. And when I call it, I am giving it (explicitly) a value for field one and a value for field 2.

Charlie Flowers 2009-04-23 02:04:15

Answer 6

+3 A:

Analogies are usually lacking in all sorts of ways, but since you know C# I thought this might be helpful.

This is how I would describe the List a definition in C#, maybe this clears some things up (or more likely, confuses you even more).

class List<A>
{
}

class Nil<A> : List<A>
{
    public Nil() {}
}

class Cons<A> : List<A>
{
    public A Head;
    public List<A> Tail;

    public Cons(A head, List<A> tail)
    {
     this.Head = head;
     this.Tail = tail;
    }
}

As you can see;

the List type as a single type parameter (<A>),
the Nil constructor doesn't have any parameters,
and the Cons constructor has two parameters, a value of type A and a list of type List<A>

Now, in Haskell the Nil and Cons are just constructors for the List a data type, in C# they are also types in and of themselves, so that's where the analogy fails.

But I hope this gives you some intuitive sense of what the different A's represent.

(And please do comment on how this horrible comparison doesn't do justice to Haskell's data types.)

Tom Lokhorst 2009-04-22 20:49:34

Yes, this helps. It emphasizes the fact that "a" and "List a" are just TYPES. I come along later and provide EXPLICIT VALUES when I type "Cons 0 Nil". The first field becomes "0" because I told it to use that value, not because "a" was used in the definition of the value constructor. Your response plus several others have helped me get this.

Charlie Flowers 2009-04-23 02:06:34

Answer 7

+1 A:

I looked at your edited question.

When I create instances using the Cons value constructor, those instances "interpret" the first 'a' as meaning "place the value passed in here.

In "Cons a (List a)", both "a" and "List a" are types. I don't understand what "value" has to with it.

When I type "Cons 0 Nil", it uses the "Cons" value constructor to create an instance of List. From 0, it learns that the type parameter is "Integer". So far, no confusion.

However, it also determines that the value of the first field of the Cons is 0. Yet it determines nothing about the value of the second field ... it only determines that the second field has a type of "List Integer".

The value of the second field is Nil.

So my question is, why does "a" in the first field mean "the type of this field is 'a' and the value of this field is 'a'", while "a" in the second field means only "the type of this field is 'List of a'"?

"a" in the first field means "the type of this field is 'a'". "List a" in the second field means "the type of this field is 'List a'". In the case of "Cons 0 Nil" above, 'a' is inferred to be "Integer". So "Cons a (List a)" becomes "Cons Integer (List Integer)". The 0 is a value of type Integer. The Nil is a value of type "List Integer".

the value of this field is 'a'

I don't understand what you mean by this. 'a' is a type variable; what does it have to do with values?

newacct 2009-04-22 22:39:33

Thanks. Your response plus several others has helped me get it. In "Cons a (List a)", both "a" and "List a" are NOTHING but types. When I later type "Cons 0 Nil", it infers that a is Integer (close enough for now), and sets first field to zero and second field to Nil. I was missing the fact that, while it will infer types, it is NOT inferring values ... those are provided explicitly.

Charlie Flowers 2009-04-23 02:09:42

Answer 8

A:

Just to give you some extra "help", in case you are still watching this thread. Haskell has a couple of conventions that mess up others' ideas of how things ought to be done - in Haskell, a parameterized type is so commonly accepted, that it is usually thought of as being a type-level function. Likewise, the value constructors are thought of as "special" functions, that also allow pattern matching, in addition to their "take a value (or more) and produce a value as a result".

Another "funny" characteristic of Haskell is that it does not explicitly (or implicitly) evaluate the arguments to a function, even if that argument is in parentheses. Let me state that slightly differently: Haskell functions do not evaluate arguments in parentheses before other arguments. Arguments are put in parentheses for grouping purposes only, not to have them evaluated "first". Haskell assigns arguments to ("applies") a function at a higher precedence than any other operation - even higher than an implicit function application of one of its own arguments. This is why the Cons constructor has parens around the second argument, (List a) - to tell the compiler that Cons has two arguments, and not three. The parentheses are for grouping only, and not for precedence!

As somewhat of a side topic, be careful with the types in F#. Since F# has its roots in ML, its parameterized types have the parameters in front - int list, not (List Int) in back! Haskell does it the other way, because it's the same way that Haskell does functions - first the function, then arguments to the function. This encourages a common use pattern, and explains why Haskell types and value constructors are Capitalized - to remind you that you're dealing with a type/class-related thing.

Okay, I'm done; thanks for letting me put this huge Wall O' Text on your property...

BMeph 2010-07-09 23:03:06

ansaurus

tags:

views:

answers:

Please confirm or correct my "English interpretation" of this Haskell code snippet

related questions