tags:

views:

981

answers:

10

[ This is a result of Best Practice: Should functions return null or an empty object? but I'm trying to be very general. ]

In a lot of legacy (um...production) C++ code that I've seen, there is a tendency to write a lot of NULL (or similar) checks to test pointers. Many of these get added near the end of a release cycle when adding a NULL-check provides a quick fix to a crash caused by the pointer dereference--and there isn't a lot of time to investigate.

To combat this, I started to write code that took a (const) reference parameter instead of the (much) more common technique of passing a pointer. No pointer, no desire to check for NULL (ignoring the corner case of actually having a null reference).

In C#, the same C++ "problem" is present: the desire to check every unknown reference against null (ArgumentNullException) and to quickly fix NullReferenceExceptions by adding a null check.

It seems to me, one way to prevent this is to avoid null objects in the first place by using empty objects (String.Empty, EventArgs.Empty) instead. Another would be to throw an exception rather than return null.

I'm just starting to learn F#, but it appears there are far fewer null objects in that enviroment. So maybe you don't really have to have a lot of null references floating around?

Am I barking up the wrong tree here?

+3  A: 

Null gets my vote. Then again, i'm of the 'fail-fast' mindset.

String.IsNullOrEmpty(...) is very helpful too, i guess it catches either situation: null or empty strings. You could write a similar function for all your classes you're passing around.

Chris
You have to program in C++ to enjoy references, I think you would change your mind for sure.
AraK
I do have a bit of c++ experience, and never really had any issues with this kindof problem. I seem to remember wrapping my use of MyRef with an 'if (MyRef)' whenever i wanted to simply skip some code if the reference was null, it wasn't much of an issue for me. Or am i missing something here?
Chris
Just to clarify, we're talking about c++ pointers here, when we're talking about 'references' ?
Chris
Yes, why do you have to write "if (pMyObj)" in the first place? That is, why is "pMyObj" NULL in this code? Yes, there can be times where such a thing is useful, but not at nearly every turn.
Dan
We're probably talking about C++ references, not pointers. And C++ references are never null.
sth
+1  A: 

I'd say it depends. For a method returning a single object, I'd generally return null. For a method returning a collection, I'd generally return an empty collection (non-null). These are more along the lines of guidelines than rules, though.

tvanfosson
I'd agree, but have returned null when creating/populating the collection **failed** and I didn't want to throw exceptions.
NVRAM
+1  A: 

The thing about null is that it doesn't come with meaning. It is merely the absence of an object.

So, if you really mean an empty string/collection/whatever, always return the relevant object and never null. If the language in question allows you to specify that, do so.

In the case where you want to return something that means not a value specifiable with the static type, then you have a number of options. Returning null is one answer, but without a meaning is a little dangerous. Throwing an exception may actually be what you mean. You might want to extend the type with special cases (probably with polymorphism, that is to say the Special Case Pattern (a special case of which is the Null Object Pattern)). You might want to wrap the return value in an type with more meaning. Or you might want to pass in a callback object. There usually are many choices.

Tom Hawtin - tackline
+1 for "might" and "may". But null can have meaning *contextually*.
NVRAM
Well, you can ascribe `null` meaning. You might even document it in comments to some degree. But there is not inherent meaning there. Indeed the implementation code might take it to mean the exact opposite of what the programmer of client code thinks it means.
Tom Hawtin - tackline
A: 

If you are serious about wanting to program in a "nullless" environment, consider using extension methods more often, they are immune to NullReferenceExceptions and at least "pretend" that null isn't there anymore:

public static GetExtension(this string s)
{
    return (new FileInfo(s ?? "")).Extension;
}

which can be called as:

// this code will never throw, not even when somePath becomes null
string somePath = GetDataFromElseWhereCanBeNull();
textBoxExtension.Text = somePath.GetExtension();

I know, this is only convenience and many people correctly consider it violation of OO principles (though the "founder" of OO, Bertrand Meyer, considers null evil and completely banished it from his OO design, which is applied to the Eiffel language, but that's another story). EDIT: Dan mentions that Bill Wagner (More Effective C#) considers it bad practice and he's right. Ever considered the IsNull extension method ;-) ?

To make your code more readable, another hint may be in place: use the null-coalescing operator more often to designate a default when an object is null:

// load settings
WriteSettings(currentUser.Settings ?? new Settings());

// example of some readonly property
public string DisplayName
{
    get 
    {
         return (currentUser ?? User.Guest).DisplayName
    }
}

None of these take the occasional check for null away (and ?? is nothing more then a hidden if-branch). I prefer as little null in my code as possible, simply because I believe it makes the code more readable. When my code gets cluttered with if-statements for null, I know there's something wrong in the design and I refactor. I suggest anybody to do the same, but I know that opinions vary wildly on the matter.

(Update) Comparison with exceptions

Not mentioned in the discussion so far is the similarity with exception handling. When you find yourself ubiquitously ignoring null whenever you consider it's in your way, it is basically the same as writing:

try 
{
    //...code here...
}
catch (Exception) {}

which has the effect of removing any trace of the exceptions only to find it raises unrelated exceptions much later in the code. Though I consider it good to avoid using null, as mentioned before in this thread, having null for exceptional cases is good. Just don't hide them in null-ignore-blocks, it will end up having the same effect as the catch-all-exceptions blocks.

Abel
The null-coalescing operator (in your example) hides the problem: why is Session["user"] `null` in the first place? Should it *really* be `null`? If so, then a `null` check is fine.In *More Effective C#*, Bill Wagner says not to use make extension methods work with `null` references as they're supposed to mimic actual methods on the class.
Dan
I have the same book and I know his opinion. And in general, I agree with that point of view. The "if you are serious" opening was meant partially ironically and using this approach, apart from rare exceptions, should not become a common practice. The example about the user was obviously simplified, perhaps it was ill-chosen but I hoped to illustrate how to use `??` to make code clearer when you must switch to a default object if null.
Abel
updated the examples, perhaps these are more illustrative for the use of the `??` operator.
Abel
+15  A: 

Passing non-null just to avoid a NullReferenceException is trading a straightforward, easy-to-solve problem ("it blows up because it's null") for a much more subtle, hard-to-debug problem ("something several calls down the stack is not behaving as expected because much earlier it got some object which has no meaningful information but isn't null").

NullReferenceException is a wonderful thing! It fails hard, loud, fast, and it's almost always quick and easy to identify and fix. It's my favorite exception, because I know when I see it, my task is only going to take about 2 minutes. Contrast this with a confusing QA or customer report trying to describe strange behavior that has to be reproduced and traced back to the origin. Yuck.

It all comes down to what you, as a method or piece of code, can reasonably infer about the code which called you. If you are handed a null reference, and you can reasonably infer what the caller might have meant by null (maybe an empty collection, for example?) then you should definitely just deal with the nulls. However, if you can't reasonably infer what to do with a null, or what the caller means by null (for example, the calling code is telling you to open a file and gives the location as null), you should throw an ArgumentNullException.

Maintaining proper coding practices like this at every "gateway" point - logical bounds of functionality in your code - NullReferenceExceptions should be much more rare.

Rex M
Changing a `null` for a default object just for fixing a possible error is never good. However, more often then not, returning `null` has a meaning, but `null` itself cannot express that meaning. Using empty or default objects are a fine solution when you need a meaning. Hard errors are nice, but are an error and should be considered as such. Coding around `null` is not an error, it means your design is wrong (or at least: can be improved), because `null` apparently just became meaningful.
Abel
"If you are handed a null object" -- suggest rephrasing this to keep clear the distinction between a null reference and a null object.
itowlson
@Abel I disagree that null simply means "without meaning". If we follow the principle of "being liberal in what we accept", then null can be an acceptable substitute for an empty sequence, for example. Not always, but sometimes.
Rex M
@Abel essentially I'm having a hard time with "no value" == "no meaning". Is it a design philosophy documented anywhere?
Rex M
I'll look up the reference for you, but with 600 chars I had to be brief. Imo, the meaning of `null` is the absence of data, which is a meaning on itself. However, in code, esp. with juniors, you find that `null` means "Guest user" or `null` means "localhost", which is tested and if'ed everywhere. That is what I mean with "a meaning", and that is something `null` shouldn't represent. `null` is "no data", "error", "no instance". That's a meaning, but a different one. Note that this is different then NULL in Win32 API or C++, where it often has an explicit meaning, much to the annoyance of many.
Abel
@Abel I definitely agree with that - null should not be a placeholder for some real value which is merely default.
Rex M
This isn't about swapping `null` for (something).Empty, it's about 1) getting rid of the `null` in the first place, and 2) "preventing" everybody from checking every stinkin' reference for `null`.
Dan
@Dan getting rid of null for the sake of getting rid of null, to prevent people from having to do as many null checks, is not a good goal to have.
Rex M
@Rex: OK, maybe I didn't say that quite right. The problem as I see it is that a crash (C++) or `NullReferenceException` (C#) occurs; the expedient thing I see done FAR TOO OFTEN is inserting "if (foo != null)" at the source of the crash to "fix" the problem.
Dan
@Dan: see my little comment on comparison with exception handling. You just answered your question: an `NRE` is an exception, and should be treated like that: an exceptional case. Catching exceptions should be done when you can and should recover (`FileNotFound`), otherwise, *graceful degradation* should be in place. You are absolutely right that `foo != null` is a terrible solution. Either `null` has a meaning (expected failure, behavior) or null has value (default, empty) and then you can deal with it. Otherwise, it is an exception and your app should treat it as such, instead of "fix" it.
Abel
+2  A: 

I tend to be dubious of code with lots of NULLs, and try to refactor them away where possible with exceptions, empty collections, and so on.

The "Introduce Null Object" pattern in Martin Fowler's Refactoring (page 260) may also be helpful. A Null Object responds to all the methods a real object would, but in a way that "does the right thing". So rather than always check an Order to see if order.getDiscountPolicy() is NULL, make sure the Order has a NullDiscountPolicy in these cases. This streamlines the control logic.

Jim Ferrans
Yes! Lots of `NULL` s (or `null` s) in code smells. The problem is, how to get rid of them--especially in C#.
Dan
+2  A: 

If you are writing code that returns null as an error condition, then don't: generally, you should throw an exception instead - far harder to miss.

If you are consuming code that you fear may return null, then mostly these are boneheaded exceptions: perhaps do some Debug.Assert checks at the caller to sense-check the output during development. You shouldn't really need vast numbers of null checks in you production, but if some 3rd party library returns lots of nulls unpredictably, then sure: do the checks.

In 4.0, you might want to look at code-contracts; this gives you much better control to say "this argument should never be passed in as null", "this function never returns null", etc - and have the system validate those claims during static analysis (i.e. when you build).

Marc Gravell
A: 

You can't always return an empty object, because 'empty' is not always defined. For example what does it mean for an int, float or bool to be empty?

Returning a NULL pointer is not necessarily a bad practice, but I think it's a better practice to return a (const) reference (where it makes sense to do so of course).

And recently I've often used a Fallible class:

Fallible<std::string> theName = obj.getName();
if (theName)
{
    // ...
}

There are various implementations available for such a class (check Google Code Search), I also created my own.

StackedCrooked
A: 

For the exception protagonists they usually stem from transactional programming and strong exception safety guarantees or blind guidelines. In any decent complexity, ie. async workflow, I/O and especially networking code they are simply inappropriate. The reason why you see Google style docs on the matter in C++, as well as all good async code 'not enforcing it' (think your favourite managed pools as well).

There is more to it and while it might look like a simplification, it really is that simple. For one you will get a lot of exceptions in something that wasn't designed for heavy exception use.. anyway I digress, read upon on this from the world's top library designers, the usual place is boost (just don't mix it up with the other camp in boost that loves exceptions, because they had to write music software :-).

In your instance, and this is not Fowler's expertise, an efficient 'empty object' idiom is only possible in C++ due to available casting mechanism (perhaps but certainly not always by means of dominance ).. On ther other hand, in your null type you are capable throwing exceptions and doing whatever you want while preserving the clean call site and structure of code.

In C# your choice can be a single instance of a type that is either good or malformed; as such it is capable of throwing acceptions or simply running as is. So it might or might not violate other contracts ( up to you what you think is better depending on the quality of code you're facing ).

In the end, it does clean up call sites, but don't forget you will face a clash with many libraries (and especially returns from containers/Dictionaries, end iterators spring to mind, and any other 'interfacing' code to the outside world ). Plus null-as-value checks are extremely optimised pieces of machine code, something to keep in mind but I will agree any day wild pointer usage without understanding constness, references and more is going to lead to different kind of mutability, aliasing and perf problems.

To add, there is no silver bullet, and crashing on null reference or using a null reference in managed space, or throwing and not handling an exception is an identical problem, despite what managed and exception world will try to sell you. Any decent environment offers a protection from those (heck you can install any filter on any OS you want, what else do you think VMs do), and there are so many other attack vectors that this one has been overhammered to pieces. Enter x86 verification from Google yet again, their own way of doing much faster and better 'IL', 'dynamic' friendly code etc..

Go with your instinct on this, weight the pros and cons and localise the effects.. in the future your compiler will optimise all that checking anyway, and far more efficiently than any runtime or compile-time human method (but not as easily for cross-module interaction).

rama-jka toti
A: 

How are empty objects better than null objects? You're just renaming the symptom. The problem is that the contracts for your functions are too loosely defined "this function might return something useful, or it might return a dummy value" (where the dummy value might be null, an "empty object", or a magic constant like -1. But no matter how you express this dummy value, callers still have to check for it before they use the return value.

If you want to clean up your code, the solution should be to narrow down the function so that it doesn't return a dummy value in the first place.

If you have a function which might return a value, or might return nothing, then pointers are a common (and valid) way to express this. But often, your code can be refactored so that this uncertainty is removed. If you can guarantee that a function returns something meaningful, then callers can rely on it returning something meaningful, and then they don't have to check the return value.

jalf