We noticed that lots of bugs in our software developed in C# (or Java) cause a NullReferenceException.

Is there a reason why "null" has even been included in the language?

After all, if there were no "null", I would have no bug, right?

In other words, what feature in the language couldn't work without null?

+1  A: 

I can't speak to your specific issue, but it sounds like the problem isn't the existence of null. Null exists in databases, and you need some way to account for that at the application level. I don't think that's the only reason it exists in .NET, mind you, but I figure it's one of the reasons.

peacedog
-1. In .NET, a database `NULL` is usually represented as `DBNull.Value` instead of `null`.
stakx
+10  A: 

Null is an extremely powerful feature. What do you do if you have an absence of a value? It's null!

One school of thought is to never return null; another is to always do so. For example, some say you should return a valid but empty object.

I prefer null because, to me, it's a truer indication of what the value actually is. If I can't retrieve an entity from my persistence layer, I want null. I don't want some empty value. But that's me.

It is especially handy with primitives. For example, if I have true or false, but it's used on a security form where a permission can be Allow, Deny, or Not Set, I want that Not Set to be null, so I can use bool? (see the sketch below).

There's lots more I could go on about, but I will leave it there.
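To make the three-state idea concrete, here is a minimal sketch with a nullable bool (all names are invented for the example):

class PermissionDemo
{
    // null is the third state: no decision has been recorded yet.
    static string Describe(bool? permission)
    {
        if (permission == true) return "Allow";
        if (permission == false) return "Deny";
        return "Not set";
    }

    static void Main()
    {
        bool? fromForm = null; // nothing chosen yet
        System.Console.WriteLine(Describe(fromForm)); // prints "Not set"
    }
}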

mattlant
What you want in the Allow/Deny/NotSet situation is an enumeration, not something that includes null.
Mark Cidade
False, True, FileNotFound. :-)
Chris Jester-Young
Allow = True, Deny = False, Not Set = no value, hence null. That was one simple example. Yes/No/No Data, True/False/Undecided, etc. The list can go on and on; it's a perfect fit for a nullable bool rather than an enum. If I had Allow/Deny/IDontKnow/WhoCares/IAmHungry/NotSet, then yeah, I would use an enum.
mattlant
You'll find that null is possibly the worst way to build a 'no-information' element into data.
jlouis
jlouis - can you elaborate?
Richard Ev
In the example, what if there's a bug in the code and the programmer just forgot to assign a value to the Allow/Deny flag? In this case, the null value is a mistake; it doesn't mean 'NOT SET'.
Outlaw Programmer
Downvoted. Null is really 2 concepts/features: being able to represent invalid program states (useless imo), and being able to represent optionality (mandatory). As such it's not powerful, it's stupid.
bltxd
A: 

Null is an essential requirement of any OO language. Any object variable that hasn't been assigned an object reference has to be null.

David Arno
You really should learn more OO languages before making statements like that...
nikie
So nikie, would you care to explain to this ignorant old fool just what is wrong with my statement?
David Arno
@David: basically null references are not an *essential* *requirement* of any OO language. One solution: require initialization of everything upon declaration/construction and use the null object pattern for anything you need to have null/undefined/invalid/default/empty/whatever values.
Martinho Fernandes
+23  A: 

Nullity is a natural consequence of reference types. If you have a reference, it has to refer to some object - or be null. If you were to prohibit nullity, you would always have to make sure that every variable was initialized with some non-null expression - and even then you'd have issues if variables were read during the initialization phase.

How would you propose removing the concept of nullity?
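To sketch the initialization problem (the class and members here are invented for illustration): C# runs static initializers in textual order, so code executing during initialization can observe a field before it has been given a value - exactly the hole that null currently fills.

class InitOrder
{
    static readonly string First = Describe(); // runs before Second is assigned
    static readonly string Second = "ready";

    static string Describe()
    {
        // Second is still null here, despite having an initializer.
        return "Second is: " + (Second ?? "<no value yet>");
    }

    static void Main()
    {
        System.Console.WriteLine(First); // prints "Second is: <no value yet>"
    }
}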

Jon Skeet
It's not fundamental. You could require any ref type variable to be initialized, just as with (non-nullable) value types. It would be a hassle, though.
JacquesB
Suppose you require any ref type variable to be initialized. Suppose this is a static field, and you call a static method to initialize the field - what if that method looks at another field which hasn't been initialized yet? What value would it have?
Jon Skeet
In short: there are probably ways around it, but they would become increasingly labyrinthine as you looked at edge cases.
Jon Skeet
In pure functional programming, values can only be specified via function parameters and expression evaluations (including function calls), which all need to be specified before they're used and only exist for the scope they're defined in, unless captured in a closure.
Mark Cidade
@marxidad: Yup, so I can understand functional languages getting away without null. I don't think null could be removed from C# without radically changing the language though.
Jon Skeet
You could remove null from the subset of types capable of being handled by literals (albeit still with hassles), but realistically the prime C# candidate, string, would simply end up being the empty string rather than null; hardly a massive win. The *option* to do it would be nice, though.
ShuggyCoUk
_@Jon:_ As I currently see it, you could declare a constant for any reference type, `X.Empty` (à la `string.Empty` or `EventArgs.Empty`), which would take the place of null references for that type. So I think C# could have potentially lived without `null`.
stakx
@stakx: So what would happen if you tried to read from FileStream.Empty, for example?
Jon Skeet
_@Jon:_ Unless one can think of something better to do, possibly an exception would be thrown. I didn't so much mean to say that you'd *have* to do things differently without `null`, but instead that `null` *can* be replaced by a type-specific default value. That by itself isn't much of an improvement, but it demonstrates that `null` *could* be abolished. If you can then think of a more reasonable response than throwing an exception when reading from a hypothetical `FileStream.Empty`, *that* would be the real improvement!
stakx
@stakx: It feels like they're very *very* similar to me. You might as well just allow methods to be called with a null "this" and react appropriately... they're basically equivalent. (In fact, you *can* do that in IL - you can call an instance method non-virtually with a null "this".)
Jon Skeet
_@Jon,_ I remember you demonstrating the null `this` call (but I can't remember where exactly; +1 for that, anyway). Agreed that the two scenarios are almost identical. What `null` amounts to from this perspective is a convenience: you don't have to think twice about how you could handle the "no object" case more reasonably. `FileStream.Empty` *might* or might not provide the nudge to find better solutions.
stakx
@Jon, there are 2 concepts under the name "nullity": uninitialized and optional. This is what makes it a carried-over mistake from Algol.
bltxd
@Jon, uninitialized should not be needed; it's akin to requesting the ability to represent invalid program states.
bltxd
+3  A: 

"Null" is included in the language because we have value types and reference types. It's probably a side effect, but a good one I think. It gives us a lot of power over how we manage memory effectively.

Why do we have null? ...

Value types are stored on the "stack"; their value sits directly in that piece of memory (i.e. int x = 5 means that the memory location for that variable contains 5).

Reference types, on the other hand, have a "pointer" on the stack pointing to the actual value on the heap (i.e. string x = "hello" means that the memory block on the stack only contains an address pointing to the actual value on the heap).

A null value simply means that our reference on the stack does not point to any actual value on the heap - it's an empty pointer.

Hope I explained that well enough.
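A small sketch of that distinction (the values are arbitrary):

class StackHeapDemo
{
    static void Main()
    {
        int x = 5;           // value type: the 5 sits directly in x's storage
        string s = "hello";  // reference type: s holds a reference to a heap object
        string t = null;     // a reference that points at nothing

        System.Console.WriteLine(x);         // 5
        System.Console.WriteLine(s.Length);  // 5
        System.Console.WriteLine(t == null); // True; t.Length would throw
    }
}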

willem
Downvoted. See option types in languages of the ML family. In short, by making references non-nullable you don't lose any generality, you simply make your programs NullReferenceException free.
bltxd
+12  A: 

Removing null wouldn't solve much. You would need to have a default reference for most variables that is set on init. Instead of null-reference exceptions you would get unexpected behaviour because the variable would be pointing to the wrong object. At least null references fail fast instead of silently misbehaving.

You can look at the null-object pattern for a way to solve part of this problem.
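For reference, a minimal null-object sketch (ILogger and NullLogger are illustrative, not framework types):

interface ILogger { void Log(string message); }

class ConsoleLogger : ILogger
{
    public void Log(string message) { System.Console.WriteLine(message); }
}

// Callers receive this instead of null, so calling Log is always safe.
class NullLogger : ILogger
{
    public static readonly NullLogger Instance = new NullLogger();
    public void Log(string message) { /* intentionally does nothing */ }
}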

Mendelt
I call this nonsense. The introduction of a non-nullable reference type would let you choose which one you need. OCaml and Haskell have non-nullable references and yet are applicable to all problem domains and don't suffer from NullReferenceException at all...
bltxd
+65  A: 

Anders Hejlsberg, "C# father", just spoke about that point in his Computerworld interview:

For example, in the type system we do not have separation between value and reference types and nullability of types. This may sound a little wonky or a little technical, but in C# reference types can be null, such as strings, but value types cannot be null. It sure would be nice to have had non-nullable reference types, so you could declare that 'this string can never be null, and I want you, compiler, to check that I can never hit a null pointer here'.

50% of the bugs that people run into today, coding with C# in our platform, and the same is true of Java for that matter, are probably null reference exceptions. If we had had a stronger type system that would allow you to say that 'this parameter may never be null, and you, compiler, please check that at every call, by doing static analysis of the code', then we could have stamped out whole classes of bugs.

Cyrus Najmabadi, a former software design engineer on the C# team (now working at Google), discusses this subject on his blog (1st, 2nd, 3rd, 4th). It seems that the biggest hindrance to the adoption of non-nullable types is that the notation would disturb programmers' habits and existing code bases. Something like 70% of the references in C# programs would likely end up as non-nullable ones.

If you really want to have non-nullable reference types in C#, you should try Spec#, a C# extension that allows the use of "!" as a non-nullable marker.

static string AcceptNotNullObject(object! s)
{
    return s.ToString();
}
madgnome
Strangely enough, I blogged about a dirty hack to implement some of this just last night: http://msmvps.com/blogs/jon_skeet/archive/2008/10/06/non-nullable-reference-types.aspx It's really not a nice solution, with various gotchas, but interesting nonetheless (IMO!).
Jon Skeet
Actually all types should be implicitly non-nullable, even reference types. Nullable types (value types as well as reference types) should be explicitly marked with "?".
Michael Damatov
Small correction: Cyrus Najmabadi is no longer on the C# team. I didn't realise this until I got an internal email from him today, here at Google :)
Jon Skeet
I was wondering where he'd moved to :)
ShuggyCoUk
+14  A: 

Null in C# is mostly a carry-over from C++, which had pointers that didn't point to anything in memory (or rather, to address 0x00). In this interview, Anders Hejlsberg says that he would've liked to add non-nullable reference types to C#.

Null also has a legitimate place in a type system, however, as something akin to the bottom type (where object is the top type). In Lisp, the bottom type is NIL, and in Scala it is Nothing.

It would've been possible to design C# without any nulls, but then you'd have to come up with an acceptable solution for the usages that people usually have for null, such as uninitialized-value, not-found, default-value, undefined-value, and None<T> (a sketch of the latter follows below). There would probably have been less adoption among C++ and Java programmers if they had succeeded in that anyhow, at least until they saw that C# programs never had any null pointer exceptions.
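As a rough sketch, a None<T>-style option type could look like this in C# (not a built-in type; everything here is illustrative):

struct Option<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Option(T value) { this.value = value; this.hasValue = true; }

    public static Option<T> Some(T value) { return new Option<T>(value); }
    public static Option<T> None { get { return new Option<T>(); } }

    public bool HasValue { get { return hasValue; } }
    public T GetOrElse(T fallback) { return hasValue ? value : fallback; }
}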

Mark Cidade
Not to mention that all you'd really be doing is renaming null.
Telos
You don't have to replace the "null" keyword with another single keyword. You can constrain what "null" means and then add other keywords to mean different things. This can reduce null reference errors. Even JavaScript has "undefined" in addition to "null". VB6 had an "Empty", "Nothing" AND "Null"
Mark Cidade
Downvoted. You are confusing the 2 usages of null: "uninitialized" is useless (to quote jalf, "meaningless states should not be representable") and distinct from "optionality", which is required.
bltxd
A: 

If you create an object with an instance variable that is a reference to some object, what value would you suggest this variable has before you have assigned any object reference to it?

Mecki
Lots of options. For example, the compiler could enforce that variables can't be used before they have been explicitly initialized. Or the compiler could use a default (parameterless) constructor and issue an error if no such constructor is found (as in C++).
nikie
+1  A: 

There are situations in which null is a nice way to signify that a reference has not been initialized. This is important in some scenarios.

For instance:

MyResource resource = null; // initialize to null so the finally block can test it
try
{
  resource = new MyResource();
  //
  // Do some work
  //
}
finally
{
  if (resource != null)
    resource.Close();
}

In most cases this is accomplished with a using statement, but the pattern above is still widely used.
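For comparison, the same cleanup with a using statement (assuming MyResource implements IDisposable):

using (MyResource resource = new MyResource())
{
    // Do some work; Dispose is called even if an exception is thrown.
}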

With regards to your NullReferenceException, such errors are often easy to reduce by implementing a coding standard where all parameters are checked for validity. Depending on the nature of the project, I find that in most cases it's enough to check parameters on exposed members. If the parameters are not within the expected range, an ArgumentException of some kind is thrown, or an error result is returned, depending on the error handling pattern in use.

The parameter checking does not in itself remove bugs, but any bugs that occur are easier to locate and correct during the testing phase.
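A sketch of that guard-clause convention (the method and parameter are invented for the example):

public void Register(string userName)
{
    if (userName == null)
        throw new ArgumentNullException("userName");

    // From here on, userName is known to be non-null.
}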

As a note, Anders Hejlsberg has mentioned the lack of non-null enforcement as one of the biggest mistakes in the C# 1.0 spec and that including it now is "difficult".

If you still think that a statically enforced non-null reference type is of great importance, you could check out the Spec# language. It is an extension of C# where non-null references are part of the language. This ensures that a reference marked as non-null can never be assigned a null reference.

Kimoz
+1  A: 

Null as it is available in C#/C++/Java/Ruby is best seen as an oddity of some obscure past (Algol) that somehow survived to this day.

You use it in two ways:

  1. to declare references without initializing them (bad).
  2. to denote optionality (ok).

As you guessed, 1) is what causes us endless trouble in common imperative languages and should have been banned long ago; 2) is the truly essential feature.

There are languages out there that avoid 1) without preventing 2).

For example OCaml is such a language.

A simple function returning an ever-incrementing integer starting from 1:

let counter = ref 0;;  (* a mutable cell, initialized at declaration *)
let next_counter_value () = (counter := !counter + 1; !counter);;

And regarding optionality:

type distributed_computation_result = NotYetAvailable | Result of float;;
let print_result r = match r with
    | Result(f) -> Printf.printf "result is %f\n" f
    | NotYetAvailable -> Printf.printf "result not yet available\n";;
bltxd
1 is bad? Always, you say? What about in exception handling, when you want to see the value of a variable in the catch block? You pretty much need to declare it outside the try, and initialize it inside, if the initialization could cause the exception!
Telos
If your variable construction failed, its value is null.
bltxd
At least in Java, you have to declare references without initializing them: non-primitive fields of classes (initialized in constructor), objects that throw an exception at creation time that you need to use outside of the corresponding try/catch block, etc.
PhiLho
@blue: And if there were no "null" value, there would be no way to indicate failure, would there? Because without a null value, the variable would be garbage, and it's not always possible to distinguish a non-null garbage value from a good value.
Onorio Catenacci
@PhiLho: An object that threw an exception at construction time is never created. Of course, to use it afterwards you must use null, but you could also track its successful initialization with a boolean. In essence, your logic does not require null, but it makes use of it.
bltxd
@Onorio: Error handling is another matter and OCaml provides (fast) exceptions.
bltxd
You're oversimplifying. Sure, it could have thrown on initialization, OR on the next statement in which case the variable would have meaningful info. Unless you really want to have a try/catch for each line of code, you'll be declaring several uninitialized variables to consume in the catch.
Telos
@Telos: I would be oversimplifying if database bindings for OCaml weren't fully implemented and workable. Being skeptical is OK, but give it a try and you'll see for yourself. The language works differently, and it is no surprise that the code you'll have to write will also be different.
bltxd
@Telos: have a look at F# (Microsoft's port of OCaml to .NET) and more precisely at http://research.microsoft.com/fsharp/manual/spec2.aspx#_Toc207785595 This page details the reasons why they had to introduce null in F#: technical limitations in .NET and compatibility with existing APIs.
bltxd
@Telos: They had no choice, since they couldn't revamp their legacy APIs and internals. So what should we conclude?
bltxd
@Telos: That removing null is unworkable when living in a C world? All the existing OCaml bindings for C libraries prove otherwise.
bltxd
+1  A: 

If you're getting a NullReferenceException, perhaps you keep referring to objects which no longer exist. This is not an issue with 'null'; it's an issue with your code pointing to non-existent addresses.

Glitch
A: 

I'm surprised no one has talked about databases in their answer. Databases have nullable fields, and any language which will be receiving data from a DB needs to handle that. That means having a null value.

In fact, this is so important that even basic value types like int can be made nullable!

Also consider return values from functions: what if you wanted to have a function divide a couple of numbers and the denominator could be 0? The only "correct" answer in such a case would be null. (I know, in such a simple example an exception would likely be a better option... but there can be situations where all values are correct but valid data can produce an invalid or incalculable answer. Not sure an exception should be used in such cases...)
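As an illustration of the database case, mapping a nullable column to a nullable int might look like this (the reader and column index are assumed):

static int? ReadAge(System.Data.IDataReader reader)
{
    // Column 0 may hold SQL NULL; IsDBNull lets us map it to a C# null.
    return reader.IsDBNull(0) ? (int?)null : reader.GetInt32(0);
}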

Telos
Handling DB stuff means having null *values*. The big difference is that in C# you have null *references*. Your division example where you don't throw an exception already exists (IEEE 754 floating-point arithmetic), and it proves the "only" correct answer is not a null reference. IEEE 754 uses a null *value* (NaN - Not a Number).
Martinho Fernandes
+2  A: 

One response mentioned that there are nulls in databases. That's true, but they are very different from nulls in C#.

In C#, nulls are markers for a reference that doesn't refer to anything.

In databases, nulls are markers for value cells that don't contain a value. By value cells, I generally mean the intersection of a row and a column in a table, but the concept of value cells could be extended beyond tables.

The difference between the two seems trivial at first glance. But it's not.

Walter Mitty
+1 for pointing the difference between null *references* and *values* meaning something is null/invalid/default/empty/whatever.
Martinho Fernandes
+7  A: 

After all, if there were no "null", I would have no bug, right?

The answer is NO. The problem is not that C# allows null; the problem is that you have bugs which happen to manifest themselves with a NullReferenceException. As has been stated already, nulls have a purpose in the language: to indicate either an "empty" reference type or a non-value (empty/nothing/unknown).

pro3carp3
Removing null from the language would reduce the probability of certain classes of bugs in the same way that removing pointer types reduces the probability of certain classes of bugs. If you really needed null, you could use an unsafe block. Value types exist just fine without being nullable.
Mark Cidade
The point that pro3carp3 is trying to make is that the majority of null bugs are related to using uninitialized variables as if they were prepared for use. Non-nullable types really only solve the case of "I don't care what the value is; I just want to run this method", while usually you do care.
Guvante
Downvoted. Your premise is that you need nulls to represent uninitialized refs. This is true if you really insist on allowing uninitialized refs in your language. But the real question is whether this makes sense, and the answer is clearly no. This is akin to requiring the ability to represent invalid program states.
bltxd
+3  A: 

Nulls do not cause NullPointerExceptions...

Programmers cause NullPointerExceptions.

Without nulls we are back to using an actual arbitrary value to determine that the return value of a function or method was invalid. You still have to check for the returned -1 (or whatever); removing nulls will not magically solve laziness, but merely obfuscate it.
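For instance, the sentinel style looks like this (a minimal sketch; List<T>.IndexOf really does return -1 when the item is absent):

using System.Collections.Generic;

class SentinelDemo
{
    static void Main()
    {
        List<string> names = new List<string> { "alice" };
        int index = names.IndexOf("bob"); // -1 is the "not found" sentinel
        if (index == -1)                  // forget this check and the bug just moves
            System.Console.WriteLine("not found");
    }
}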

Newtopian
Downvoted. Try any language of the ML family and get enlightened.
bltxd
A: 

I propose:

  1. Ban Null
  2. Extend Booleans: True, False and FileNotFound
Sadly, I don't think that most people will get the joke.
Rob
A: 

Commonly, a NullReferenceException means that some method didn't like what it was handed and returned a null reference, which was later used without being checked.

That method could have thrown some more detailed exception instead of returning null, which complies with the fail-fast way of thinking.

Or the method might be returning null as a convenience to you, so that you can write an if instead of a try and avoid the "overhead" of an exception.
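A sketch contrasting the two conventions (Customer and the repository are invented for the example):

using System.Collections.Generic;

class Customer { public int Id; }

class Repository
{
    private readonly Dictionary<int, Customer> store = new Dictionary<int, Customer>();

    // Fail fast: throw a detailed exception for an unknown id.
    public Customer GetCustomer(int id)
    {
        Customer c;
        if (!store.TryGetValue(id, out c))
            throw new KeyNotFoundException("No customer with id " + id);
        return c;
    }

    // Convenience: return null so the caller can write a simple if.
    public Customer FindCustomer(int id)
    {
        Customer c;
        store.TryGetValue(id, out c);
        return c;
    }
}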

David B
A: 

Besides ALL of the reasons already mentioned, NULL is needed when you need a placeholder for a not-yet-created object. For example, if you have a circular reference between a pair of objects, then you need null, since you cannot instantiate both simultaneously.

class A {
  public B fieldb;
}

class B {
  public A fielda;
}

A a = new A();                // a.fieldb is null
B b = new B() { fielda = a }; // b.fielda isn't
a.fieldb = b;                 // now it isn't null anymore

Edit: You may be able to come up with a language that works without nulls, but it will definitely not be an object-oriented language. For example, Prolog doesn't have null values.

Santiago Palladino
Pick an existing OO language like C#. Now require initialization of everything upon declaration/construction and use the null object pattern for anything you need to have null/undefined/invalid/default/empty/whatever values.
Martinho Fernandes
+1  A: 

The question may be interpreted as "Is it better to have a default value for each reference type (like String.Empty) or null?". From this perspective I would prefer to have nulls, because:

  • I would not like to write a default constructor for each class I write.
  • I would not like unnecessary memory to be allocated for such default values.
  • Checking whether a reference is null is cheaper than a value comparison.
  • It is highly possible to have more bugs that are harder to detect, instead of NullReferenceExceptions. It is a good thing to have such an exception, which clearly indicates that I am doing (assuming) something wrong.
tafa
+10  A: 

Like many things in object-oriented programming, it all goes back to ALGOL (ALGOL W, specifically). Tony Hoare just called it his "billion-dollar mistake." If anything, that's an understatement.

Here is a really interesting thesis on how to make nullability not the default in Java. The parallels to C# are obvious.

Craig Stuntz
I wish I could upvote twice...
nikie