Type systems are often criticised for being too restrictive, that is, for limiting programming languages and preventing programmers from writing interesting programs.

Chris Smith claims:

We get assurance that the program is correct (in the properties checked by this type checker), but in turn we must reject some interesting programs.

and

Furthermore, there is an ironclad mathematical proof that a type checker of any interest at all is always conservative. Building a type checker that doesn't reject any correct programs isn't just difficult; it's impossible.

Could someone please outline what kinds of interesting programs these could be? Where is it proven that type checkers have to be conservative?

And more generally: what are the limits of type checking and type systems?

A: 

People have found that if you write your tests first in a truly TDD manner, your tests typically do a better job at verifying correctness than a strict type checking system would. So for gauging correctness, a strong type system doesn't really measure up.

Strong typing can often buy you some speed because compilers can easily use native types instead of having to do type checks at runtime. Case in point: if you're doing a lot of integer math, you'll find a strongly typed implementation will probably outrun a weakly typed one, since your figures can typically just be immediately used by the CPU and won't have to be verified at runtime.

"Interesting" programs? Sure. It's easier to write extensions on top of a dynamic language. Also, in distributed systems, it can be really convenient to have a weakly typed UI and not have to generate specific data transfer objects. For example, if you had a JavaScript front end and a C# backend, you could use LINQ on the C# end to generate anonymous classes and send them to your JavaScript via JSON. From the JS layer, you could just use those anonymous types as though they were first-class objects and not have to go through the pain of coding up all of your data contracts explicitly.
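The same convenience can be sketched in Python, which, like JavaScript, lets you consume JSON without declaring data-transfer classes first (the payload and its field names here are made up for illustration):

```python
import json

# hypothetical payload from some backend; no data-contract class is declared
payload = json.loads('{"name": "widget", "price": 9.99, "tags": ["a", "b"]}')

# the decoded dict is used directly, as though it were a first-class object
print(payload["name"], payload["price"], payload["tags"][0])
```

A statically typed consumer would typically need a class (or generated stub) mirroring this structure before it could touch any field.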

Dave Markle
+3  A: 

You can express everything in both static and dynamic languages. Proof: you can write a compiler for any language in any Turing-complete language. So whatever the language is, you can create a static language that does X.

What can be interesting in dynamic typing? With good enough duck typing you could interact with objects over the network without ever knowing their types, and pass their results (of a type unknown to you) as parameters to local functions which may actually do something useful.

The static answer to that problem would be to wrap everything in an "exportable interface" providing .call() and .provides?() working on textual names, but that would definitely be harder.
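In Python that kind of duck typing needs no wrapper at all; a minimal sketch (the class and method names are hypothetical, standing in for an object whose concrete type you never see):

```python
# Duck typing: any object exposing a .total() method works, regardless of
# its declared type -- no interface declaration is needed.
class LocalReport:
    def total(self):
        return 42

def summarise(result):
    # we rely only on the method existing, not on the object's type
    return "total: %d" % result.total()

print(summarise(LocalReport()))  # total: 42
```

An object received over the network could be passed to `summarise` just the same, as long as it quacks the right way.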

That's the most "limiting" case I know and it's really stretching things a bit (only really useful with mock objects?). As for theoretical limits, there are none - you just need some extra code to overcome the practical issues.

viraptor
Their claims seem to be more existential, because they say that a type checker rejects certain programs which dynamic typing allows. So dynamically typed programs are able to do something statically typed programs will never be able to, no matter what tricks you use.
ott
@ott: that's not possible. If you give an example of such a program A(x), I'll respond with a compiler B for that language, compiled to assembler (which is not a dynamic language). Now B(A,x) is a new program in a static language, doing exactly the same as your program A, but taking A as text input (data).
viraptor
+1  A: 

It's hard to find objective pragmatic comparisons of static versus dynamic typing issues because it's so often such a religious war. The little summary blurbs that you've quoted tend to be the same boilerplate disclaimer everyone makes that seems to be 'acceptable' to everyone these days.

As someone whose experience is mostly in statically-typed languages, I tried to understand some of the trade-offs in a blog series a while back. Lots of caveats, but you might check out the second half of this blog entry for some comparison that is suggestive of the answer to your question.

Here's one quote that suggests a window into the trade-offs:

In a sense, this tiny function captures the essence of the ongoing debate between static and dynamic typing. The programmer can create domain-specific static types to structure the program, convey intent, and rule out a class of errors and behaviors at compile-time, at the price of having to author that structure and mediate structural mismatches at module boundaries. Or the programmer can choose to compute most everything with just scalars and lists, in which case data flows easily everywhere, resulting in short code, but with the loss of compile-time checks and conveyance of intent.

and the running example shows a case where the statically-typed program disallows a useful/interesting behavior by its nature.

Brian
A: 

I think an eval function could be handy sometimes, but is never necessary (eval is uncommon in statically typed languages; see the explanation at the link).
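For comparison, Python's built-in eval evaluates source text at runtime; a minimal sketch:

```python
# eval turns a string of source code into a value at runtime; a static
# checker cannot know in advance what the string will contain.
expr = "1 + 2 * 3"
print(eval(expr))  # 7

# the expression can even be assembled from runtime data
op = "+"
print(eval("10 %s 5" % op))  # 15
```

This is exactly the kind of feature that resists complete compile-time analysis, since the code being typed doesn't exist until the program runs.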

Tamás Szelei
I don't see what eval() has to do with type systems. You can have statically and dynamically typed languages which have an eval() function.
ott
@ott: can you give me an example of eval in a statically typed language?
Tamás Szelei
@sztomi: Try ghci.
ott
@ott: can you be more specific? I looked up its documentation (http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html), but I was unable to find an eval function. The closest thing was an interactive prompt. I'm really interested in its inner workings if it does not compile the expression at runtime.
Tamás Szelei
@sztomi: I think it compiles the code at runtime, but I'm the wrong person to ask about this. Maybe you should ask on haskell-cafe.
ott
A: 

Here's a simple example (in Python, but completely unrelated to its typing issues):

# The company deals with rectangles. They have horizontal and vertical lengths
# that should not be mixed. Programmer A writes:

class Rectangle:
    def __init__(self, width, height):
        enforce_type('horizontal', width)
        enforce_type('vertical', height)
        #
        # A: Hehe! I'm so smart! With my patented type system if I try to
        #    write "width * width" I'll have a loud error so I'll know I'm wrong.  
        #
        area = width * height
        enforce_type('square_meters', area)
        ...

# Everyone's happy. The company shows off A's type system at a conference.

...

# Much later, the clients request ability to specify square by entering only 
# one length. Programmer B writes:

class Square(Rectangle):
    def __init__(self, width):
         Rectangle.__init__(self, width, width)
         # !!
         # Error happens! 'horizontal' is not 'vertical'!
         #
         # B: Dear Management, I would like to request a week-long leave, since
         #    I have to track down Programmer A and either talk him into changing
         #    his patented type system or BEAT HIM WITH MY LAPTOP until he agrees.
         #

It's very hard to create a type system that foresees every possible exception to the rules, especially if you're creating a base framework that will be used by people much later.

To answer your direct question: whose side are you on, Programmer A's or Programmer B's?

ilya n.
I think that's a bad example: both `width` and `height` should have type `meters`, just as `area` has type `square_meters`. Then static typing would prevent programmer C from calculating `10 foot * 3 meters`.
swegi
To be more specific, programmer A misused the static type system, just as programmer D could misuse a dynamically typed or untyped language to write `99 + "bottles of beer" / 3 * "people"`
swegi
This is exactly the tradeoff quoted in the question between being more **correct** or more **interesting**. Programmer A didn't really misuse the type system, since he was told by management that it's only possible to multiply vertical by horizontal.
ilya n.
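swegi's unit-based alternative could be sketched like this (hypothetical `Meters`/`SquareMeters` classes, checked at runtime here purely for illustration):

```python
class SquareMeters:
    def __init__(self, value):
        self.value = value

class Meters:
    def __init__(self, value):
        self.value = value

    def __mul__(self, other):
        # meters * meters -> square meters; anything else is a type error.
        # Both width and height are Meters, so width * width is legal too.
        if not isinstance(other, Meters):
            raise TypeError("can only multiply Meters by Meters")
        return SquareMeters(self.value * other.value)

area = Meters(3) * Meters(4)
print(area.value)  # 12
```

Under this scheme Programmer B's square works, while mixing units (say, a raw number for feet) still fails.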
+2  A: 

An interesting example is This Paper, which is probably the only apples-to-apples comparison ever done on static vs dynamic typing. They implemented Self (a language like Smalltalk, but with prototypes instead of classes) with both type inference (static) and type feedback (dynamic).

The most interesting result is that the type inference engine had a lot of trouble distinguishing machine integers from arbitrary-precision integers -- they auto-promote in vanilla Self, and hence could not be told apart by the type system, which meant that the compiler had to include code to promote to BigInt on every integer operation.

They ran up against a limit of their type system: It could not examine the actual value of integers.

I think there are no theoretical limits to type systems in general, but any given type checker implements a specific, limited type system, and there will be programs where it cannot determine what is going on. Since the Self type inferencer allowed sets of types, it simply compiled both paths. A type checker that required convergence on a single type would have to reject the program. (Though it would probably have special code for this case.)

Sean McMillan
A: 

I am not sure, but I believe the issues you are referring to arise in algebraic type systems like those of Haskell and ML. These languages try to do a very complete "static" analysis of the types before you even run the program.

One interesting annoyance with very strict static analysis in algebraic type systems is that it is very difficult to have a container that holds a mixture of objects of varying types.

For example, in most mainstream languages you can have a heterogeneous mixture of types in a list. An example in Python would be:

["a",1,"b",2]

To do this in a strict type system, you would have to make a wrapping unified type and then pattern-match over all the possible types.
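That workaround can be transliterated into Python with tagged wrapper classes standing in for an algebraic sum type (the `S`/`I` wrappers are hypothetical, chosen just for this sketch):

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class S:          # wraps a string
    value: str

@dataclass
class I:          # wraps an int
    value: int

Item = Union[S, I]

xs: List[Item] = [S("a"), I(1), S("b"), I(2)]

# consuming the list means matching on every wrapper variant
def show(item: Item) -> str:
    if isinstance(item, S):
        return item.value
    return str(item.value)

print([show(x) for x in xs])  # ['a', '1', 'b', '2']
```

Compare the ceremony here with the bare `["a",1,"b",2]` above: the wrapping buys static checkability at the cost of boilerplate at every construction and consumption site.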

But the really interesting thing that is missing from these languages is the powerful introspection you have in most modern languages (e.g. C#, Java, Python, Ruby, Lisp).

Consequently these languages allow you to do some powerful meta-programming and data binding that you cannot do with complete static analysis.
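A small Python sketch of the kind of introspection meant here (the class and method names are made up):

```python
class Handler:
    def on_click(self):
        return "clicked"
    def on_close(self):
        return "closed"

h = Handler()
# discover method names at runtime, then bind and invoke them -- the
# kind of reflection that complete static analysis makes difficult
hooks = sorted(name for name in dir(h) if name.startswith("on_"))
for name in hooks:
    print(name, "->", getattr(h, name)())
```

Data-binding frameworks lean on exactly this pattern: wiring names discovered at runtime to handlers, with no compile-time knowledge of the types involved.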

Adam Gent
A: 

The following is an excerpt from a correct program that would be rejected (unless you have an Any type at your disposal):

def signum(x):
    if x > 0:
        return 1
    else:
        return "non-positive"

def abs(x):
    if signum(x) == 1:
        return x
    else:
        return -x
Adrian Panasiuk
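Annotated with types, the escape hatch the answer mentions becomes explicit; a sketch using a union type (the annotations are mine, added for illustration, and a checker such as mypy would require something like them here):

```python
from typing import Union

def signum(x: int) -> Union[int, str]:
    # the two branches return different types, so without a union (or Any)
    # annotation a static checker must reject this correct program
    if x > 0:
        return 1
    return "non-positive"

def abs_(x: int) -> int:
    # named abs_ to avoid shadowing the built-in abs in this sketch
    return x if signum(x) == 1 else -x

print(abs_(-5))  # 5
```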
A: 

Spec# has a type system with non-null types, and it is statically safe. (http://research.microsoft.com/en-us/projects/specsharp/)

thequark
A: 

What are the limits of type checking and type systems?

I'm going to assume that "type systems" implies a statically typed programming language.

Static typing means that types are checked at compile time. The limitation here is that the compiler can only optimize based on information available at compile time. In a dynamically typed language, by contrast, the interpreter can analyze and optimize the program at runtime, which means it can optimize based on actual usage patterns.

StackedCrooked
Although there is profile-guided optimization, which partly takes care of the "optimization by usage pattern". The disadvantage is that the optimization process is not transparent to the user, and one cannot easily reoptimize it during runtime.
Lajnold
A: 

I think the first statement is technically wrong, though correct in practice.

Static typing is essentially the same as proving properties of programs, and by using a sufficiently powerful logic you could prove that all interesting programs are correct.

The problem is that for powerful logics type inference no longer works, and you have to provide the proofs as part of the program so that the type checker can do its job. A concrete example is a higher-order prover such as Coq: it uses an extremely powerful logic, but it is also extremely tedious to get anything done in it, because you have to provide proofs in great detail.

The kind of interesting programs that will give you the most trouble are interpreters, as their behavior depends entirely on their input. Essentially, you would need to reflectively prove the correctness of type checking for that input.

As for the second statement, he may be referring to Gödel's incompleteness theorem. It states that for any given proof system there are true statements of arithmetic (the logic of addition and multiplication of natural numbers) that cannot be proved within that system. Translated to static type systems: there would be a program that doesn't do anything bad, but the static type system couldn't prove that.

These counterexamples are constructed by referring back to the definition of the proof system, saying essentially "I cannot be proved" translated into arithmetic, which is not very interesting. IMHO a program constructed analogously wouldn't be interesting either.

starblue
A: 

If you look at Mitchell's book Concepts in Programming Languages, p. 134, you'll find some details on what is called the "conservativity of compile-time checking". The problem is that some "interesting" properties, such as the absence of out-of-bounds array accesses, cannot be checked statically, as that would require evaluating the program on every possible run. The standard undecidability result tells you that you'd have to solve the halting problem to check EVERY array access.
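A minimal sketch of why such checks get pushed to runtime (the `safe_get` helper is hypothetical):

```python
def safe_get(xs, i):
    # a static checker cannot in general prove 0 <= i < len(xs) for an
    # arbitrary runtime i, so the bounds check must happen at runtime
    if 0 <= i < len(xs):
        return xs[i]
    return None

print(safe_get([1, 2, 3], 1))   # 2
print(safe_get([1, 2, 3], 99))  # None
```

A checker that insisted on rejecting every program it couldn't prove bounds-safe would reject many programs that are in fact correct, which is the conservativity the question asks about.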

ShiDoiSi