views:

171

answers:

4

Is a statically-typed full Lisp variant possible? Does it even make sense for something like this to exist? I believe one of a Lisp language's virtues is the simplicity of its definition. Would static typing compromise this core principle?

+3  A: 

My answer, without a high degree of confidence is probably. If you look at a language like SML, for example, and compare it with Lisp, the functional core of each is almost identical. As a result, it doesn't seem like you would have much trouble applying some kind of static typing to the core of Lisp (function application and primitive values).

Your question does say full though, and where I see some of the problem coming in is the code-as-data approach. Types exist at a more abstract level than expressions. Lisp doesn't have this distinction - everything is "flat" in structure. If we consider some expression E : T (where T is some representation of its type), and then we consider this expression as being plain ol' data, then what exactly is the type of T here? Well, it's a kind! A kind is a higher, order type, so let's just go ahead and say something about that in our code:

E : T :: K

You might see where I'm going with this. I'm sure by separating out the type information from the code it would be possible to avoid this kind of self-referentiality of types, however that would make the types not very "lisp" in their flavour. There are probably many ways around this, though it's not obvious to me which would be the best one.

EDIT: Oh, so with a bit of googling, I found Qi, which appears to be very similar to Lisp except that it's statically typed. Perhaps it's a good place to start to see where they made changes to get the static typing in there.

Gian
+8  A: 

Yes, it's very possible, although a standard HM-style type system is usually the wrong choice for most idiomatic Lisp/Scheme code. See Typed Racket for a recent language that is a "Full Lisp" (more like Scheme, actually) with static typing.

Eli Barzilay
The problem here is, what is the type of the list that makes up the entire source code of a typed racket program?
Lajla
That would usually be `Sexpr`.
Eli Barzilay
+3  A: 

Dylan: http://visualization.dylan-user.org/diploma.pdf

Rainer Joswig
+1  A: 

If all you wanted was a statically typed language that looked like Lisp, you could do that rather easily, by defining an abstract syntax tree that represents your language and then mapping that AST to S-expressions. However, I don't think I would call the result Lisp.

If you want something that actually has Lisp-y characteristics besides the syntax, it's possible to do this with a statically typed language. However, there are many characteristics to Lisp that are difficult to get much useful static typing out of. To illustrate, let's take a look at the list structure itself, called the cons, which forms the primary building block of Lisp.

Calling the cons a list, though (1 2 3) looks like one, is a bit of a misnomer. For example, it's not at all comparable to a statically typed list, like C++'s std::list or Haskell's list. Those are single-dimensional linked lists where all the cells are of the same type. Lisp happily allows (1 "abc" #\d 'foo). Plus, even if you extend your static-typed lists to cover lists-of-lists, the type of these objects requires that every element of the list is a sublist. How would you represent ((1 2) 3 4) in them?

Lisp conses form a binary tree, with leaves (atoms) and branches (conses). Further, the leaves of such a tree may contain any atomic (non-cons) Lisp type at all! The flexibility of this structure is what makes Lisp so good at handling symbolic computation, ASTs, and transforming Lisp code itself!

So how would you model such a structure in a statically typed language? Let's try it in Haskell, which has an extremely powerful and precise static type system:

type Symbol = String
data Atom = ASymbol Symbol | AInt Int | AString String | Nil
data Cons = CCons Cons Cons 
            | CAtom Atom

Your first problem is going to be the scope of the Atom type. Clearly, we haven't picked an Atom type of sufficient flexibility to cover all the types of objects we want to sling around in conses. Instead of trying to extend the Atom data structure as listed above (which you can clearly see is brittle), let's say we had a magical type class Atomic that distinguished all the types we wanted to make atomic. Then we might try:

class Atomic a where ?????
data Atomic a => Cons a = CCons Cons Cons 
                          | CAtom a

But this won't work because it requires all atoms in the tree to be of the same type. We want them to be able to differ from leaf to leaf. A better approach requires the use of Haskell's existential quantifiers:

class Atomic a where ?????
data Cons = CCons Cons Cons 
            | forall a. Atomic a => CAtom a 

But now you come to the crux of the matter. What can you do with atoms in this kind of structure? What structure do they have in common that could be modeled with Atomic a? What level of type safety are you guaranteed with such a type? Note we haven't added any functions to our type class, and there's a good reason: the atoms share nothing in common in Lisp. Their supertype in Lisp is simply called t (i.e. top).

In order to use them, you'd have to come up with mechanisms to dynamically coerce the value of an atom to something you can actually use. And at that point, you've basically implemented a dynamically typed subsystem within your statically typed language! (One cannot help but note a possible corollary to Greenspun's Tenth Rule of Programming.)

Note that Haskell provides support for just such a dynamic subsystem with an Obj type, used in conjunnction with a Dynamic type and a Typeable class to replace our Atomic class, that allow arbitrary values to be stored with their types, and explicit coercion back from those types. That's the kind of system you'd need to use to work with Lisp cons structures in their full generality.

What you can also do is go the other way, and embed a statically typed subsystem within an essentially dynamically typed language. This allows you the benefit of static type checking for the parts of your program that can take advantage of more stringent type requirements. This seems to be the approach taken in CMUCL's limited form of precise type checking, for example.

Finally, there's the possibility of having two separate subsystems, dynamically and statically typed, that use contract-style programming to help navigate the transition between the two. That way the language can accommodate usages of Lisp where static type checking would be more of a hindrance than a help, as well as uses where static type checking would be advantageous. This is the approach taken by Typed Racket, as you will see from the comments that follow.

Owen S.
This answer suffers from a fundamental problem: you're assuming that static type systems *must* be HM-style. The basic concept that cannot be expressed there, and is an important feature of Lisp code, is subtyping. If you'll take a look at typed racket, you'll see that it can easily express any kind of list -- including things like `(Listof Integer)` and `(Listof Any)`. Obviously, you'd suspect the latter to be useless because you know nothing about the type, but in TR, you can later use `(if (integer? x) ...)` and the system will know that `x` is an Integer in the 1st branch.
Eli Barzilay
Oh, and it's a bad characterization of typed racket (which is different from unsound type systems that you'd find in some places). Typed Racket *is* a *statically typed* language, with no runtime overhead for typed code. Racket still allows writing some code in TR and some in the usual untyped language -- and for these cases contracts (dynamic checks) are used to guard typed code from the potentially misbehaving untyped code.
Eli Barzilay
@Eli Barzilay: Thanks, and I've got a 3-part response: 1. I think the argument applies to other statically typed systems as well, I merely chose an HM type system for illustration. Like you note, you'd have to do `(Listof Any)` to support an arbitrary Lisp cons tree, right? My general point is the more you try to create a structure to do Lisp-y stuff, the more generic and unhelpful the type you'll be forced to feed the type system.
Owen S.
@Eli Barzilay: 2. `integer?` is possible to code if you're working with `Typeable` objects: `case (cast a :: Int) of Just _ -> True; Nothing -> False`. So yes, Haskell supports this limited kind of subtyping. Usually I'd use type classes in Haskell to do the sorts of things that I might use subtyping for in C++ or Lisp, and if I were to attempt a Lisp using an HM type system I'd want to keep that approach. I think it would be workable. But I haven't tried it. :-)
Owen S.
@Eli Barzilay: 3. You're right, I didn't realize you can get from Typed Racket out to Racket. I think of Typed Racket and untyped Racket as parts of a whole, which I'm sure was your intent. :-) I'll try to clarify.
Owen S.
@Eli Barzilay: I lied, there are four parts: 4. It's interesting to me how industry-accepted C++ coding style has been moving gradually away from subtyping toward generics. The weakness is that the language doesn't provide help for declaring the interface a generic function is going to use, something type classes could certainly help with. Plus, C++0x may be adding type inference. Not HM, I suppose, but creeping in that direction?
Owen S.
Owen: (1) the main point is that you *need* subtypes to express the kind of code lispers write -- and you just can't have that with HM systems, so you're forced to custom types and constructors for each use, which makes the whole thing much more awkward to use. In typed racket using a system with subtypes was a corollary of an intentional design decision: that the result should be able to express the types of such code *without* changing the code or creating custom types.
Eli Barzilay
(2) Yes, `dynamic` types are getting to be popular in static languages as a kind of a workaround to get some of the benefits of dynamically typed languages, with the usual tradeoff of these values being wrapped in a way that makes the types identifiable. But here too typed racket is doing a very good job at making it convenient within the language -- the type checker uses occurrences of predicates to know more about the types. For example, see the typed example on the [racket page](http://racket-lang.org/) and see how `string?` "reduces" a list of strings and numbers to a list of strings.
Eli Barzilay
(3) If you view them as part of a whole then obviously you'd see it as a way to type parts of the code. But in practice, it is very possible (and often done in practice) to use only the typed language so for all practical purpose you are working in a statically typed environment. (As for (4), I don't know how C++ moves, but if it's more features that are hard to use then I'm sure they'll get there...)
Eli Barzilay