I've often wondered why languages with a null representing "no value" don't differentiate between the passive "I don't know what the value is" and the more assertive "There is no value."

There have been several cases where I'd have liked to differentiate between the two (especially when working with user input and databases).

I imagine the following, where we name the two states unknown and null:

var apple;

while (apple is unknown)
{
    askForApple();
}

if (apple is null)
{
    sulk();
}
else
{
    eatApple(apple);
}

Obviously, we can get away without it by manually storing the state somewhere else, but we can do that for nulls too.

So, if we can have one null, why can't we have two?

+2  A: 

In .NET languages, you can use nullable types, which address this problem for value types.

The problem remains, however, for reference types. Since there's no such thing as pointers in .NET (at least in 'safe' blocks), "object? o" won't compile.

Brann
Only value types can be nullable, so it is not a second null for reference types. It only introduces an "undefined" state for value types, which otherwise would always have a value.
Timbo
You're right. I updated my post accordingly. Thanks.
Brann
"Nullable types are instances of the System.Nullable<T> struct. A nullable type can represent the correct range of values for its underlying value type, plus an additional null value." So it's like a Haskell "Maybe", but for primitives only?
13ren
+33  A: 

In most programming languages null means "empty" or "undefined". "Unknown" on the other hand is something different. In essence "unknown" describes the state of the object. This state must have come from somewhere in your program.

Have a look at the Null Object pattern. It may help you with what you are trying to achieve.
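
For readers who have not met the pattern, here is a minimal Python sketch. The class names `Apple` and `NoApple` and the function `lunch` are made up for illustration and are not from any library. The idea is that "no apple" becomes an ordinary object that answers the same messages as a real one, so callers need no null check.

```python
class Apple:
    """A real apple."""

    def __init__(self, variety):
        self.variety = variety

    def eat(self):
        return f"ate a {self.variety} apple"


class NoApple:
    """Null object: stands in when there is definitely no apple."""

    def eat(self):
        # Same method as Apple, with harmless "do nothing" behaviour.
        return "nothing to eat"


def lunch(apple):
    # No None check: both classes understand eat().
    return apple.eat()
```

An "unknown" state could be modelled the same way, as a third class with its own behaviour, instead of a second language-level null.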

kgiannakakis
+10  A: 

JavaScript actually has both null and undefined (http://www.w3schools.com/jsref/jsref_undefined.asp), but many other languages don't.

tehvan
Indeed. Previous experience with Javascript is one of the things which has caused me to wonder.
A J Lane
If you have a dynamically typed language you need both - undefined = type unknown and nothing in it, null = reference type pointing to nothing. In a statically typed language like C# you always have a type, so undefined is not needed.
Keith
+3  A: 

Note null is an acceptable, yet known condition. An unknown state is a different thing IMO. My conversation with Dan in the comments section of the top post will clarify my position. Thanks Dan!

What you probably want to query is whether the object was initialized or not.

Actionscript has such a thing (null and undefined). With some restrictions however.

See documentation:

void data type

The void data type contains only one value, undefined. In previous versions of ActionScript, undefined was the default value for instances of the Object class. In ActionScript 3.0, the default value for Object instances is null. If you attempt to assign the value undefined to an instance of the Object class, Flash Player or Adobe AIR will convert the value to null. You can only assign a value of undefined to variables that are untyped. Untyped variables are variables that either lack any type annotation, or use the asterisk (*) symbol for type annotation. You can use void only as a return type annotation.

dirkgently
You're welcome :-)
Dan
+33  A: 

Isn't it bad enough that we have one null?

Avi
Why do you say that?
A J Lane
Because null isn't a proper value of a type. If I have a variable of type Car, I want to be able to call car.drive() without worrying that it might not actually be a car at all. Other state should be in other variables.
Avi
why keep 2 variables for the same thing? that would be very messy
tehvan
I like languages that guarantee that a variable will never be null. Unfortunately, I cannot think of one offhand :-( though they definitely exist.
Dan
Daniel Daranas
@Avi: The question is not about having two nulls, as I said, it is about having a null and a value that is not known. They are IMHO two completely different things.
dirkgently
-1: the OP doesn't want two NULLs. Check.
dirkgently
@Daniel: Found the language I was thinking of: Delight http://delight.sourceforge.net/ Objects are treated like in D or Java (ie you are passing around references) but they can never be null, unless you say so. I guess this is the same as using references in C++, like you said.
Dan
@AJ, dirkgently: Why would a variable ever be unknown? Uninitialized, sure. Empty, sure. Invalid, sure. But unknown?
Dan
@Dan: What is the value of an uninitialized variable?
dirkgently
@dirkgently: There is none. It's just a fact of programming that if you don't initialize a variable, then it's left uninitialized. Yes, they can be automatically initialized by the compiler, but that's beside the point.
Dan
@Dan: I see it like this: an uninited variable has a state that cannot be relied upon which **is** distinct from a known and reliable non-state (i.e. null). YMMV.
dirkgently
@dirkgently: I think the two states therefore are: reliable non-state (null) and unreliable non-state (unknown, though IMHO a better name would be required).
Dan
@Dan: Null is a non-state -- agreed. But the other one, since an object is in existence, I prefer to call it a state which is unreliable.
dirkgently
@dirkgently: an uninitialized variable doesn't really represent an object which is in existence though, but rather a variable which does not yet contain/point to/reference an object - which is your non-state - null.
Dan
I guess perhaps we are in disagreement over what it means for an object to be null.. Oh well :-)
Dan
-1 I agree NullPointerExceptions can be annoying but there are plenty of valid reasons why null exists.
Outlaw Programmer
There are many situations where "undef", "null", or "doesn't have a value" is important and different from an uninitialized value.
Robert P
-1: null is a good concept. there are plenty of good examples
Johannes Schaub - litb
+1: null is a bad concept. There are plenty of examples. Rather easy to invert ;) One language that doesn't have null is Haskell, where one can easily make a type that either holds a value or doesn't (it's built-in). It alleviates null errors without limiting expressiveness (in my experience).
Gracenotes
Is this relevant? http://www.qconlondon.com/london-2009/presentation/Null+References:+The+Billion+Dollar+Mistake "I call it my billion-dollar mistake. It was the invention of the null reference in 1965."
MichaelGG
Gracenotes, of course i was talking about imperative languages and not about functional languages. his example showed that he aims to talk about imperative languages. for haskell, there is no need for a null.
Johannes Schaub - litb
Gracenotes, however when i think about it, it makes sense to me that one has a "Optional String s" which can be null, and "String s" which can impossibly be null. maybe it's just terminology, but i call this not-a-member-of-all-types-null null too. probably i'm wrong on that use :)
Johannes Schaub - litb
+1 The number of bugs just because an empty string or a null pointer are confused is huge. Can't imagine the pain with a tri-state string.
Andomar
+29  A: 

In my programming, I recently adopted the practice of differentiating "language null" and "domain null".

The "language null" is the special value that is provided by the programming language to express that a variable has "no value". It is needed as dummy value in data structures, parameter lists, and return values.

The "domain null" is any number of objects that implement the NullObject design pattern. Actually, you have one distinct domain null for each domain context.

It is fairly common for programmers to use the language null as a catch-all domain null, but I have found that it tends to make code more procedural (less object oriented) and the intent harder to discern.

Every time you want a null, ask yourself: is that a language null, or a domain null?

ddaa
Accepted for making the domain distinction. I'd rather never have to think about "language nulls" at all, but it seems they have become a necessary evil.
A J Lane
+10  A: 

It would be easy enough to create a static constant indicating unknown, for the rare cases when you'd need such a thing.

var apple = Apple.Unknown;
while (apple == Apple.Unknown) {} // etc
marijne
or `Apple.None` and use the language defined null for unknown...
Rowland Shaw
+1  A: 

Some people will argue that we should be rid of null altogether, which seems fairly valid. After all, why stop at two nulls? Why not three or four and so on, each representing a "no value" state?

Imagine this, with refused, null, invalid:

var apple;

while (apple is refused)
{
    askForApple();
}

if (apple is null)
{
    sulk();
}
else if(apple is invalid)
{
    discard();
}
else
{
    eatApple(apple);
}
A J Lane
That would be quite nice. Hmm... Subclasses of null?
TraumaPony
One of those "Some People" is the inventor of NULL who just a couple of weeks ago called it "The Billion Dollar Mistake" in his talk in the "Historically Bad Ideas" track of QCon London 2009: <http://QConLondon.Com/london-2009/presentation/Null+References:+The+Billion+Dollar+Mistake>
Jörg W Mittag
//missing semicolon :oP => discard();
demoncodemonkey
the > is being encoded as &gt; before the hyperlink detection even gets to it. Think it through. Is that a SOLID principle, thinking? :)
Jeff Atwood
+6  A: 

Existence of value:

  • Python: 'variableName' in vars()
  • PHP: isset($variable)
  • JavaScript: typeof variable !== 'undefined'
  • Perl: defined $variable

Of course, when variable is undefined, it's not NULL

vartec
or in a perl hash, if (exists $table{key} ) to see if there's even a key, let alone to see if it has a value.
Robert P
+2  A: 

Within PHP Strict you need to do an isset() check for set variables (or else it throws a warning)

if(!isset($apple))
{
    askForApple();
}

if(isset($apple) && empty($apple))
{
    sulk();
}
else
{
    eatApple($apple);
}
Ólafur Waage
+3  A: 

The Null type is a subtype of all reference types - you can use null instead of a reference to any type of object - which severely weakens the type system. Its creator considers it a historically bad idea, and it exists only because checking whether an address is zero is easy to implement.

Pete Kirkham
+1  A: 

As to why we don't have two nulls, is it down to the fact that, historically in C, NULL was a simple #define and not a distinct part of the language at all?

saw-lau
A: 

I think having one NULL is a lowest common denominator for dealing with the basic pattern

if thing is not NULL
  work with it
else
  do something else

In the "do something else" part, there is a wide range of possibilities, from "okay, forget it" to trying to get "thing" somewhere else. If you don't simply ignore something that's NULL, you probably need to know why "thing" was NULL. Having multiple types of NULL would help you answer that question, but the possible answers are numerous, as hinted in the other answers here. The missing thing could be simply a bug, it could be an error when trying to get it, it may not be available right now, and so on. Deciding which cases apply to your code (which means you have to handle them) is domain specific. So it's better to use an application-defined mechanism to encode these reasons instead of finding a language feature that tries to deal with all of them.
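
One way such an application-defined mechanism might look is sketched below in Python. The names `Missing`, `Lookup` and `handle`, and the three reasons, are invented for this example. Each application would pick its own set of reasons.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Missing(Enum):
    """Hypothetical, application-specific reasons for having no value."""
    NOT_ASKED_YET = auto()
    USER_REFUSED = auto()
    LOOKUP_FAILED = auto()


@dataclass(frozen=True)
class Lookup:
    value: Optional[str] = None
    reason: Optional[Missing] = None  # set only when the value is absent


def handle(result: Lookup) -> str:
    if result.reason is None:
        return f"work with {result.value}"
    if result.reason is Missing.NOT_ASKED_YET:
        return "ask the user"
    if result.reason is Missing.LOOKUP_FAILED:
        return "retry later"
    return "okay, forget it"
```

The language still has a single None. The "why" travels next to it as ordinary data that the domain code defines.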

jgre
+2  A: 

The problem is that in a strongly typed language these extra nulls are expected to hold specific information about the type.

Basically your extra null is meta-information of a sort, meta-information that can depend on type.

Some value types have this extra information, for instance many numeric types have a NaN constant.

In a dynamically typed language you have to account for the difference between a reference without a value (null) and a variable where the type could be anything (unknown or undefined)

So, for instance, in statically typed C# a variable of type String can be null, because it is a reference type. A variable of type Int32 cannot be null, because it is a value type. We always know the type.

In dynamically typed Javascript a variable's type can be left undefined, in which case the distinction between a null reference and an undefined value is needed.

Keith
A: 

Some people are one step ahead of you already. ;)

Quibblesome
Take a look at nullable types in .Net for this - it's a far better way than assigning an arbitrary value to null.
Keith
You realise i'm joking.. right? It's a daily WTF link. I'd never recommend this or what the OP is asking for.
Quibblesome
I didn't, sorry. Sometimes I think us programmers are just too sarcastic for our own good :-/
Keith
A: 

It's because Null is an artifact of the language you're using, not a programmer convenience. It describes a naturally occurring state of the object in the context in which it is being used.

Mark Ransom
+3  A: 

Why stop at two?

When I took databases in college, we were told that somebody (sorry, don't remember the name of the researcher or paper) had looked at a bunch of db schemas and found that null had something like 17 different meanings: "don't know yet", "can't be known", "doesn't apply", "none", "empty", "action not taken", "field not used", and so on.

Ken
CJ Date for one: http://www.firstsql.com/inulls5.htm
MarkJ
+2  A: 

In Haskell you can define something like this:

data MaybeEither a b = Object a
                     | Unknown b
                     | Null
                       deriving Eq
main = let x = Object 5 in
       if x == (Unknown [2]) then putStrLn ":-("
       else putStrLn ":-)"

The idea being that Unknown values hold some data of type b that can transform them into known values (how you'd do that depends on the concrete types a and b).

The observant reader will note that I'm just combining Maybe and Either into one data type :)
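
For readers who don't use Haskell, roughly the same three-way type can be approximated in Python with small dataclasses. This is only a sketch: the names mirror the constructors above, the `resolve` function is made up for the example, and Python will not check exhaustiveness the way Haskell's type checker does.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass(frozen=True)
class Object:
    value: Any   # a known value


@dataclass(frozen=True)
class Unknown:
    hint: Any    # data that might later be turned into a known value


@dataclass(frozen=True)
class Null:
    pass         # definitely no value


MaybeEither = Union[Object, Unknown, Null]


def resolve(x: MaybeEither) -> str:
    if isinstance(x, Object):
        return f"known: {x.value}"
    if isinstance(x, Unknown):
        return f"unknown, hint: {x.hint}"
    return "no value"
```

As in the Haskell version, `Object(5) == Unknown([2])` is false, because dataclass equality only holds between instances of the same class.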

Jonas Kölker
Ha! I posted something similar just as you did. I'm not that familiar with Haskell yet, but thanks for posting this.
Ryan Riley
A: 

If you are using .NET 3.0+ and need something else, you might try the Maybe Monad. You could create whatever "Maybe" types you need and, using LINQ syntax, process accordingly.

Ryan Riley
A: 
AppleInformation appleInfo;    
while (appleInfo is null)
{
    askForAppleInfo();
}

Apple apple = appleInfo.apple;
if (apple is null)
{
    sulk();
}
else
{
    eatApple(apple);
}

First you check if you have the apple info, later you check if there is an apple or not. You don't need different native language support for that, just use the right classes.
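
The same two-level check, as a runnable Python sketch. The class and function names are adapted from the pseudocode above and are otherwise hypothetical. A missing `AppleInformation` means "we haven't found out yet", while a present one whose `apple` is None means "there is no apple".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Apple:
    variety: str


@dataclass
class AppleInformation:
    apple: Optional[Apple]  # None here means "there is no apple"


def react(info: Optional[AppleInformation]) -> str:
    if info is None:
        return "ask"   # nothing is known yet
    if info.apple is None:
        return "sulk"  # we asked, and the answer was "no apple"
    return f"eat {info.apple.variety}"
```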

Daniel Daranas
+1  A: 
boolean getAnswer() throws Mu
13ren
A: 

For me null represents lack of value, and I try to use it only to represent that. Of course you can give null as many meanings as you like, just like you can use 0 or -1 to represent errors instead of their numerical values. Giving various meanings to one representation could be ambiguous though, so I wouldn't recommend it.

Your examples can be coded like apple.isRefused() or !apple.isValid() with little work; you should define beforehand what is an invalid apple anyway, so I don't see the gain of having more keywords.

XenF
+1  A: 

It's been tried: Visual Basic 6 had Nothing, Null, and Empty. And it led to such poor code, it featured at #12 in the legendary Thirteen Ways to Loathe VB article in Dr Dobbs.

Use the Null Object pattern instead, as others have suggested.

MarkJ
A: 

You can always create an object and assign it to some static field to get a 2nd null.

For example, this is used in collections that allow elements to be null. Internally they use a private static final Object UNSET = new Object() which is used as the unset value and thus allows you to store nulls in the collection. (As I recall, Java's collection framework calls this object TOMBSTONE instead of UNSET. Or was this Smalltalk's collection framework?)
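
The same trick is common in Python, where a module-level `object()` plays the role of the private static field. A minimal sketch, in which `Slot` and `_UNSET` are names chosen for this example:

```python
# Private sentinel: a unique object that no caller can supply by accident,
# so it stays distinct from every stored value, including None.
_UNSET = object()


class Slot:
    """Holds at most one value and can tell 'never set' apart from 'set to None'."""

    def __init__(self):
        self._value = _UNSET

    def set(self, value):
        self._value = value

    def is_set(self):
        return self._value is not _UNSET

    def get(self, default=None):
        return default if self._value is _UNSET else self._value
```

Comparison uses `is`, not `==`, so a stored value can never be mistaken for the sentinel.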

Adrian
A: 

VB6

  • Nothing => "There is no value."
  • Null => "I don't know what the value is" - Same as DBNull.Value in .NET
Sung Meister
Haha, nobody will read me!~
Sung Meister
A: 

Two nulls would be the wrongest answer around. If one null is not enough, you need infinity nulls.

Null could mean:

  • 'Uninitialized'
  • 'User didn't specify'
  • 'Not Applicable here, The color of a car before it has been painted'
  • 'Unity: This domain has zero bits of information.'
  • 'Empty: this correctly holds no data in this case, for example the last time the tires were rotated on a new car'
  • 'Multiple, Cascading nulls: for instance, the extension of a quantity price when no quantity may be specified times a quantity which wasn't specified by the user anyway'

And your particular domain may need many other kinds of 'out of band' values. Really, these values are in the domain, and need to have a well defined meaning in each case. (ergo, infinity really is zero)

TokenMacGuy
A: 

Given how long it took Western philosophy to figure out how it was possible to talk about the concept of "nothing"... Yeah, I'm not too surprised the distinction got overlooked for a while.

Kim Reece