The following statements represent my understanding of type systems (which suffers from too little hands-on experience outside the Java world); please correct any errors.

The static/dynamic distinction seems pretty clear-cut:

  • Statically typed languages assign each variable, field and parameter a type and the compiler prevents assignments between incompatible types. Examples: C, Java, Pascal.
  • Dynamically typed languages treat variables as generic bins that can hold anything you want - types are checked (if at all) only at runtime when you actually perform operations on the values, not when you assign them. Examples: Smalltalk, Python, JavaScript.
  • Type inference allows statically typed languages to look like (and have some of the advantages of) dynamically typed ones, by inferring types from the context so that you don't have to declare them most of the time - but unlike in dynamic languages, you cannot e.g. use a variable to hold a string initially and then assign an integer to it. Examples: Haskell, Scala
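To make the "generic bin" idea concrete, here is a minimal JavaScript sketch. The names `bin` and `callIt` are made up for illustration. It only shows the behaviour described above: the assignment is never checked, and the type is checked when an operation is actually performed.

```javascript
// A variable is a generic bin: the same name can hold values of different types.
let bin = "hello";
const first = typeof bin;   // "string"
bin = 42;                   // no complaint: assignment is never type-checked
const second = typeof bin;  // "number"

// The check happens only when an operation is actually performed on the value.
function callIt(value) {
  return value(); // throws a TypeError at runtime if value is not callable
}

let errorName = null;
try {
  callIt(bin);
} catch (e) {
  errorName = e.name;
}

console.log(first, second, errorName); // string number TypeError
```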

I am much less certain about the strong/weak distinction, and I suspect that it's not very clearly defined:

  • Strongly typed languages assign each runtime value a type and only allow operations to be performed that are defined for that type, otherwise there is an explicit type error.
  • Weakly typed languages don't have runtime type checks - if you try to perform an operation on a value that it does not support, the results are unpredictable. It may actually do something useful, but more likely you'll get corrupted data, a crash, or some undecipherable secondary error.
  • There seem to be at least two different kinds of weakly typed languages (or perhaps a continuum):
    • In C and assembler, values are basically buckets of bits, so anything is possible and if you get the compiler to dereference the first 4 bytes of a null-terminated string, you had better hope it leads somewhere that does not contain legal machine code.
    • PHP and JavaScript are also generally considered weakly typed, but do not consider values to be opaque bit buckets; they will, however, perform implicit type conversions.
  • But these implicit conversions seem to apply mainly to string/integer/float variables - does that really warrant the classification as weakly typed? Or are there other issues where these languages' type systems may obfuscate errors?
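The "bucket of bits" view can be sketched with JavaScript's typed arrays, used here only as a stand-in for raw C memory. This is an illustration, not actual C behaviour: it reads the bytes as an integer rather than dereferencing them as a pointer. The byte values and variable names are made up for the example.

```javascript
// Bytes of the C string "ABCD" followed by its terminating NUL.
const bytes = new Uint8Array([0x41, 0x42, 0x43, 0x44, 0x00]);

// Read them back as text: the interpretation the author intended.
const asText = String.fromCharCode(...bytes.subarray(0, 4));

// Reinterpret the first 4 bytes as a little-endian 32-bit unsigned integer.
// Nothing stops us: the buffer carries no type information of its own.
const view = new DataView(bytes.buffer);
const asInt = view.getUint32(0, true);

console.log(asText, "0x" + asInt.toString(16)); // ABCD 0x44434241
```

The same four bytes are a string under one reading and a number under another. That is a different thing from the implicit conversions of PHP and JavaScript, where each value keeps a runtime type and is converted according to language rules.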
A: 

Hmm, I don't know much more either, but I wanted to mention C++ and its implicit conversions (implicit constructors). This might also be an example of weak typing.

Gabriel Ščerbák
That's because it was built on the (weakly-typed) foundation of C.
T.E.D.
+1  A: 

Maybe this book can help. Be prepared for some math though. If I remember correctly, a "non-math" statement was: "Strongly typed: A language that I feel safe to program with".

zedoo
"Be prepared for some math" understatement of the week? And it's only Monday? While that's a great book, I'm not sure it's exactly what the asker is looking for B-) OTOH, your 2nd answer looks about right B-)
Brian Postow
+1  A: 

There seem to be at least two different kinds of weakly typed languages (or perhaps a continuum):

  • In C and assembler, values are basically buckets of bits, so anything is possible and if you get the compiler to dereference the first 4 bytes of a null-terminated string, you had better hope it leads somewhere that does not contain legal machine code.

I would disagree with this statement, at least in C. You can manipulate the type system in C in such a way that you can treat any given memory location as a bucket of bits, but a variable most definitely has a type and that type has specific properties. The fact that there are no runtime checks (unless you consider floating point exceptions or segmentation faults to be runtime checks) isn't really relevant. C can be considered "weakly typed" in the sense that the compiler will perform some implicit type conversion for you, but it doesn't go very far with it.

Nathan Parrish
The fact that there are no runtime checks is very much relevant, since the way I understand it, compile-time checks are only relevant to the static/dynamic distinction, whereas "no runtime checks" is the most extreme form of weak typing.
Michael Borgwardt
Only to a very small extent. For example in C (or at least "C classic" that I'm used to) there is no way to define a special integer type that isn't freely assignable to "int". A true strongly typed language gives you that.
T.E.D.
I suppose it's a matter of perspective. If we are talking about *values*, then I guess C is about as weakly-typed as can be, since values really are just a bunch of bits that could be reinterpreted as anything. But in terms of *variables*, the reason there are no run-time checks is because *all* the checking is done at compile time, and there is very little implicit conversion between anything but the most basic built-in datatypes.
Nathan Parrish
That's exactly the point where I get confused, and apparently there is no general consensus nor a clear definition: is weak typing characterized by implicit conversions, or by the absence of runtime checks?
Michael Borgwardt
I agree, it is confusing. And I certainly wouldn't argue that C is purely strongly-typed or weakly-typed - it obviously has different rules for built-in types and user-defined types. My original point was that if you are evaluating C's type system based on how the *data* is represented (i.e. the bits), you are looking at it wrong. You need to look at how *variables* are typed and the relationships between variable types.
Nathan Parrish
+2  A: 

This is a pretty accurate reflection of my own understanding of the static/dynamic, strong/weak typing discussion. In addition, you can consider these other languages:

In languages such as TCL and Bourne Shell, the "main" value type is the string. Numeric operators are available that implicitly coerce input values from string representation and result values to string representation. They can be considered examples of dynamic, weakly typed languages.
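As a rough sketch of that "everything is a string" model, written in JavaScript rather than TCL or shell: `addStrings` is a hypothetical helper, not a real TCL or shell command. It only mimics the coercion into and out of the string representation.

```javascript
// Hypothetical helper: every value is a string, as in TCL or the Bourne shell.
function addStrings(a, b) {
  const sum = Number(a) + Number(b); // coerce the inputs from their string form
  return String(sum);                // coerce the result back to a string
}

const result = addStrings("2", "40");
const joined = result + "!"; // the result is a string again, so this concatenates

console.log(result, typeof result, joined); // 42 string 42!
```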

Forth may be an example of a static, weakly typed language. The language performs no type checking of its own, and the main stack may interchangeably contain pointers, integers, strings (conventionally represented as two cells, start and length). Inconsistent use of operators can lead to either interesting, or unspecified behavior. Typical Forth implementations provide a separate stack for floating point numbers.

ddaa
Forth is really a typeless programming language.
mipadi
Forth is no more typeless than assembly. Bytes, integers and floats are different by structure, integers and pointers are different by destination (usually not by structure in modern hardware), and although you can TRY to perform any operation on any bunch of bits, you need to respect SOME type system to get useful behavior.
ddaa
That reminds me of the old Martin Richards joke that BCPL is a strongly typed language with one type.
Stephen C
A: 

I agree with the others who say "there doesn't seem to be a hard and fast definition here." My answer tends to be based on how much rope the language gives you WRT types. If you can pretty much fake anything you want, then it's weak. If it really doesn't let you get yourself into trouble, even if you want to, it's strong.

I really haven't seen too many languages that skirt this border, so I can't say that I've ever needed a better definition than that...

Brian Postow
+1  A: 

I consider strong/weak to be about implicit conversion, and a good example is the addition of a string and a number. In a strongly typed language the conversion won't happen (at least in all the languages I can think of) and you'll get an error. Weakly typed languages like VB (with Option Strict Off) and JavaScript will try to cast one of the operands to the other type.

In VB.Net with Option Strict Off:

    Dim A As String = "5"
    Dim B As Integer = 5
    Trace.WriteLine(A + B) 'returns 10

With Option Strict On (turning VB into a strongly typed language) you'll get a compiler error.

In JavaScript:

    var A = '5';
    var B = 5;
    alert(A + B);//returns 55

Some people will say that the results are not predictable but they actually do follow a set of rules.
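A few of those rules for the string/number case, as a small sketch (the variable names are only for illustration):

```javascript
// With +, if either operand is a string, the other one is converted to a string.
const concat = '5' + 5;        // "55"

// The other arithmetic operators convert both operands to numbers.
const difference = '5' - 5;    // 0
const product = '5' * '2';     // 10

// A string that does not parse as a number becomes NaN instead of raising an error.
const notANumber = 'abc' - 1;  // NaN

console.log(concat, difference, product, notANumber); // 55 0 10 NaN
```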

Chris Haas
Java is generally considered strongly typed, yet it has an overloaded + operator that will happily add a number to a string (but not the other way round).
Michael Borgwardt
@Michael: This is a good example of why the simple distinction strong <-> weak isn't helpful. There are too many variants in different languages.
sleske
@Michael Borgwardt and @sleske, overloaded operators are actually a great example of a strongly typed language trying to work with weakly typed code. If Java actually overloads its operator then there's a strongly typed method behind the scenes that accounts for this. In VB and JavaScript (AFAIK) they just try to implicitly cast whatever's given to them, which is what makes them weak.
Chris Haas
+7  A: 

I am much less certain about the strong/weak distinction, and I suspect that it's not very clearly defined.

You are right: it isn't.

This is what Benjamin C. Pierce, author of Types and Programming Languages and Advanced Topics in Types and Programming Languages, has to say:

I spent a few weeks... trying to sort out the terminology of "strongly typed," "statically typed," "safe," etc., and found it amazingly difficult.... The usage of these terms is so various as to render them almost useless.

Luca Cardelli, in his Typeful Programming article, defines strong typing as the absence of unchecked run-time type errors. Tony Hoare calls that exact same property "security". Other papers call it "type safety" or simply "safety".

Mark-Jason Dominus wrote a classic rant about this a couple of years ago on the comp.lang.perl.moderated newsgroup, in a discussion about whether or not Perl was strongly typed. In this rant he states that within just a few hours of research, he was able to find 8 different, sometimes contradictory definitions, mostly from respected sources like college textbooks or peer-reviewed papers. In particular, those texts contained examples that were meant to help the students distinguish between strongly and weakly typed languages, and according to those examples, C is strongly typed, C is weakly typed, C++ is strongly typed, C++ is weakly typed, Lisp is strongly typed, Lisp is weakly typed, Perl is strongly typed, Perl is weakly typed. (Does that clear up any confusion?)

The only definition that I have seen consistently applied is:

  • strongly typed: my programming language
  • weakly typed: your programming language
Jörg W Mittag
That sounds just like Benjamin. I disagree in only one regard: remove the "almost" :-) +1
Norman Ramsey
The 'only definition' is hilarious. :D
Arnis L.
It's actually not mine, but I can't remember where I saw it first, and even if I could, it wasn't the original source anyway. Let's chalk it up to folklore.
Jörg W Mittag
+3  A: 
Norman Ramsey