Null pointers have been described as the "billion dollar mistake". Some languages have reference types which can't be assigned the null value.

I wonder whether, in designing a new object-oriented language, the default behavior should be for references to reject null. A special version of the type could then be used to override this behavior. For example:

MyClass notNullable = new MyClass();
notNullable = null; // Error!
// a la C#, where "T?" means "Nullable<T>"
MyClass? nullable = new MyClass();
nullable = null; // Allowed

So my question is, is there any reason not to do this in a new programming language?

EDIT:

I wanted to add that a recent comment on my blog pointed out that non-nullable types have a particular problem when used in arrays. I also want to thank everyone for their useful insights. They are very helpful; sorry I could only choose one answer.
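To illustrate the array problem: a freshly allocated array of N references has nothing to put in its slots except null, so an array with a non-nullable element type has to be built from an initializer instead. Here is a minimal sketch in TypeScript, used only as a stand-in for the hypothetical language (it behaves this way when its strictNullChecks option is on). Widget, slots and makeWidgets are made-up names for illustration.

```typescript
// Hypothetical element type; stands in for MyClass from the question.
class Widget {
  constructor(public readonly id: number) {}
}

// Allocate first, fill later: the slots start out empty, so the honest
// element type is "Widget or null", not Widget.
const slots: (Widget | null)[] = [null, null, null];
slots[0] = new Widget(0);

// One way around it: demand an initializer for every index, so no slot
// can ever be observed in an unset state.
function makeWidgets(count: number, init: (index: number) => Widget): Widget[] {
  const result: Widget[] = [];
  for (let i = 0; i < count; i++) {
    result.push(init(i));
  }
  return result;
}

const widgets: Widget[] = makeWidgets(3, (i) => new Widget(i));
```

The cost is that "allocate now, fill in later" code has to either use the nullable element type or supply a filler value up front.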

+4  A: 

The main obstruction I see to non-nullable reference types by default is that some portion of the programming community prefers the create-set-use pattern:

x = new Foo()
x.Prop <- someInitValue
x.DoSomething()

to overloaded constructors:

x = new Foo(someInitValue)
x.DoSomething()

and this leaves the API designer in a bind with regard to the initial value of instance variables that might otherwise be null.

Of course, like 'null' itself, the create-set-use pattern creates lots of meaningless object states and prevents useful invariants, so being rid of it is really a blessing rather than a curse. However, it does affect API design in a way that many people will be unfamiliar with, so it's not something to do lightly.
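A rough sketch of the two styles, in TypeScript rather than the pseudocode above (TypeScript enforces the null checks when its strictNullChecks option is on). FooSettable and FooConstructed are made-up names; they only illustrate where the "not set yet" state ends up.

```typescript
// Create-set-use: the field has to admit "not set yet", so its type is
// nullable and every use has to deal with that state.
class FooSettable {
  prop: string | null = null;

  doSomething(): string {
    if (this.prop === null) {
      throw new Error("prop was never set");
    }
    return this.prop.toUpperCase();
  }
}

// Constructor initialization: the field is non-nullable from the moment
// the object exists, so "prop is always a string" is an invariant.
class FooConstructed {
  constructor(public prop: string) {}

  doSomething(): string {
    return this.prop.toUpperCase();
  }
}

const a = new FooSettable();
a.prop = "init";
const b = new FooConstructed("init");
```

With non-nullable references as the default, the first class is still expressible, but the author has to opt in to the nullable field explicitly.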

But overall, yes, if there is a great cataclysm that destroys all existing languages and compilers, one can only hope that when we rebuild we will not repeat this particular mistake. Nullability is the exception, not the rule!

Brian
A non-nullable x.Prop should be set to a meaningful default or empty value. The consumer of the API can then override it as before, but know from the type that they can't set it to null.
I think a non-nullable x.Prop should be guaranteed to be initialized before it is used. If you need an "uninitialized" state, then that's what null is for, and perhaps x.Prop should actually be nullable.
apollodude217
A: 

Null is only a problem because developers don't check that something is valid before using it. If people start to misuse the new nullable construct, it will not have solved any real problems.

What matters is that every variable that can be null is checked before it is used. The compiler could refuse to compile until you add the check, with an annotation available to bypass the check where that makes sense.

We put more and more logic into compilers to protect developers from themselves, which is scary and very sad, as we know what we should do, and yet sometimes skip steps.

So, your solution will also be subject to abuse, and we will be back to where we started, unfortunately.

UPDATE:

Based on some comments, here is the theme of my answers. I guess I should have been more explicit in my original answer:

Basically, if the goal is to limit the impact of null variables, then have the compiler throw an error whenever a variable is not checked for null, and if you want to assume that it will never be null, then require an annotation to skip the check. This way you give people the ability to assume, but you also make it easy to find all the places in the code that have the assumption, and in a code review it can be evaluated if the assumption is valid.

This helps protect the developer without limiting them, and makes it easy to see where a value is assumed not to be null.

I believe we need flexibility, and I would rather have the compilation take longer than have something negatively impact the runtime, and I think my solution would do what is desired.
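The check-or-annotate rule from the update can be sketched roughly as follows. TypeScript with its strictNullChecks option behaves much like this, so it is used here as an approximation, not as the exact design proposed. findUser, greet and greetKnown are made-up names.

```typescript
// Hypothetical lookup that may find nothing.
function findUser(id: number): string | null {
  return id === 1 ? "alice" : null;
}

// Checked use: after the null test, the compiler treats the value as non-null.
function greet(id: number): string {
  const user = findUser(id);
  if (user === null) {
    return "hello, stranger";
  }
  return "hello, " + user.toUpperCase();
}

// Explicit assumption: the trailing "!" tells the compiler to skip the check.
// It is easy to search for in a code review, but nothing verifies it at
// run time, so a wrong assumption still fails when the value is used.
function greetKnown(id: number): string {
  const user = findUser(id)!;
  return "hello, " + user.toUpperCase();
}
```

Searching the code base for the "!" marker then gives the list of assumptions to evaluate in a review.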

James Black
Your third paragraph boggles my mind. You want to avoid automated solutions to common problems because humans need the extra work to promote the ethic of not 'skipping steps'?
Brian
Yeah, I have to agree with Brian here. More work is put into compilers to make developers more productive. Pragmatism not idealism.
Jason
@Brian - I suggested that we could force the developer to state that some variable may be null, and won't be checked, as we assume it won't be null, but that is as far as I would go, as that is hard to abuse, and if it is abused, it is easy to write a tool that checks for all the non-checks and decide if they are valid, in a code review, for example.
James Black
@Jason - That is fine, if we put something in the compiler, we should make it easy to know what the intent was, so it can be checked. But, there are so many ways to put in security holes, developers need to be more cautious, otherwise we may as well ban dynamic sql queries and force all queries to use prepared statements.
James Black
@James Black: The compiler is an implementation detail.
Jason
@barkmadley - I find dynamic SQL to be useful, but only when I am using nothing from the user in the query, especially if there are no parameters to pass in. The Haskell solution was interesting, btw.
James Black
http://blog.moertel.com/articles/2006/10/18/a-type-based-solution-to-the-strings-problem dynamic SQL bad...deleted my other comment
barkmadley
@Jason - Just don't put something into the language that can be abused as the null pointer is currently. That, to me, is the most pragmatic solution, as then someone can look to see where the developer is assuming that it can't be null, and verify that the assumption is valid.
James Black
@James Black: It's nearly impossible to prevent language abuse. There is no construct nor tool that can save bad programmers from themselves (short of hacking off their fingers). Code contracts, however, are a really beautiful idea.
Jason
@James Black: If the compiler keeps a value from being null, that is essentially using a non-nullable type, except devs can still change it to null later (assuming the reference is settable). I _want_ the compiler to make my life easier by guaranteeing against this so that the value _is not null_, not just "assumed not to be null".
apollodude217
@apollodude217 - If you use a string, for example, what is the value when it is not set? It is in an unknown state, as you have never explicitly set it, so, null is as good a value as any, otherwise you are assuming what it is. That is a problem found in some languages where pointers in debug mode may be set to zero, but in release are not set at all, so may point anywhere. So, if you don't want it to be in an unknown state, just set it when you declare it.
James Black
@James Black: You already proposed a compiler feature for ensuring that values are initialized to something before they are accessed, thus removing the need for null in these cases. These are the cases where nullability is a nuisance (imao). If the variable may represent nothing or be uninitialized, then it should be of a nullable type.
apollodude217
@apollodude217 - Any solution will have trade-offs between protecting the programmer from stupid mistakes and annoying the programmer. I tend to opt for letting the programmer create problems if he chooses, but try to ensure that in a non-intrusive way you try to protect the programmer.
James Black
+4  A: 

I like the OCaml way of dealing with the 'maybe null' issue. Whenever a value of type 'a might be unknown/undefined/uninitialized, it is wrapped in an 'a option type, which can be either None or Some x, where x is the actual non-nullable value. To access x you have to unwrap it with pattern matching. Here is a function that increments a nullable integer and returns 0 on None:

>>> let f = function  Some x -> x+1 | None->0 ;;
val f : int option -> int = <fun>

How it works:

>>> f (Some 5) ;;
- : int = 6
>>> f None ;;
- : int = 0

The matching mechanism sort of forces you to consider the None case. Here's what happens when you forget it:

 >>> let f = function  Some x -> x+1 ;;
 Characters 8-31:
 let f = function  Some x -> x+1 ;;
         ^^^^^^^^^^^^^^^^^^^^^^^
 Warning P: this pattern-matching is not exhaustive.
 Here is an example of a value that is not matched:
 None
 val f : int option -> int = <fun>

(This is just a warning, not an error. If you now pass None to the function, you'll get a Match_failure exception at run time.)

Variant types plus matching are a generic mechanism; the same check also catches things like matching a list with head :: tail only (forgetting the empty list case).

Rafał Dowgird
+1 F# supports option types as well, but unfortunately it still has to take into account nulls from the .NET Framework.
Nemanja Trifunovic
+1  A: 

Even better, disable null references. In rare cases when "nothing" is a valid value, there could be an object state that corresponds to it, but a reference would still point to that object, not have a zero value.
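One possible reading of this, sketched in TypeScript: "nothing" becomes one of two states of a real object, in the spirit of an option type, so a reference always points at something. Maybe, nothing, some and increment are made-up names, and this is only an illustration, not a claim about how any particular language implements it.

```typescript
// A tagged union: a value of type Maybe<T> is always a real object, and
// "nothing" is one of that object's two states.
type Maybe<T> = { kind: "some"; value: T } | { kind: "none" };

const nothing: Maybe<never> = { kind: "none" };

function some<T>(value: T): Maybe<T> {
  return { kind: "some", value: value };
}

// The caller has to look at the tag before it can reach the value.
function increment(opt: Maybe<number>): number {
  switch (opt.kind) {
    case "some":
      return opt.value + 1;
    case "none":
      return 0;
  }
}
```

Here increment(some(5)) evaluates to 6 and increment(nothing) to 0.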

Nemanja Trifunovic
Is this the Null Object pattern? http://en.wikipedia.org/wiki/Null_Object_pattern
apollodude217
I did not have the Null Object pattern in mind, but option types: http://en.wikipedia.org/wiki/Option_type
Nemanja Trifunovic
+1  A: 

As I understand it, Martin Odersky's rationale for including null in Scala is to make Java libraries easy to use (i.e. so all your APIs don't appear to have, e.g., "Object?" all over the place):

http://www.artima.com/scalazine/articles/goals_of_scala.html

Ideally, I think null should be included in the language as a feature, but non-nullable should be the default for all types. It would save lots of time and prevent errors.

apollodude217
A: 

No.

The state of uninitialized will exist in some fashion due to logical necessity; currently the denotation is null.

Perhaps a "valid but uninitialized" concept of an object can be designed in, but how is that significantly different? The semantics of "accessing an uninitialized object" still will exist.

A better route is to have a static (compile-time) check that you do not access an object that has not been assigned (I can't think of anything off the top of my head that would prevent that besides string evals).
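A small sketch of what such a check looks like in practice, using TypeScript as an example of a language that performs it (as far as I know it does so when strictNullChecks is on). describeFlag and label are made-up names.

```typescript
function describeFlag(flag: boolean): string {
  let label: string; // declared, but not assigned yet

  // Reading the variable at this point is rejected at compile time:
  // return label;   // "used before being assigned"

  if (flag) {
    label = "yes";
  } else {
    label = "no";
  }

  // Accepted: the variable has been assigned on every path that reaches here.
  return label;
}
```

Note that this only guarantees the variable was assigned before it is read; whether the assigned value may be null is a question for the variable's type.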

Paul Nathan
The point I was trying to make was that the need of an "uninitialized" or "sentry value" for a variable or argument is the exception rather than the rule. When it is needed, I see no reason not to use "null", and the nullable form (e.g. "MyType?").
cdiggins
"static-time checking that you do not access an object that is not assigned to" guarantees initialization, but does not guarantee against null (i.e. it can still be initialized or later changed to null). These are separate problems. What if you have a method parameter that should never be null? A static check to make sure that value is not null _is_ using a non-nullable type. Checking for null manually and dealing with NullReferenceException are both painful.
apollodude217
No, types *can* be used. They don't have to be...
Paul Nathan