views:

226

answers:

11

In previous large-scale applications requiring high robustness and long up-times, I've always been in favor of validating a pointer function argument when it was documented as "must never be NULL". If the argument actually was NULL, I'd then throw a std::invalid_argument exception (or similar) in C++, or return an error code in C.
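As a minimal sketch of that check-and-throw approach (the function name and behaviour here are illustrative, not from any real codebase):

```cpp
#include <stdexcept>
#include <string>

// Hypothetical library function whose documentation says
// "pValue must never be NULL".
std::string Describe(const int* pValue)
{
    if (pValue == nullptr) {  // validate the documented precondition up front
        throw std::invalid_argument("Describe(): pValue must not be NULL");
    }
    return std::to_string(*pValue);
}
```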

However, I'm starting to think that maybe it's better to just let the application blow up immediately at the first NULL pointer dereference in that same function - then the crash dump file would reveal what happened - and let a thorough testing process find the bad function calls.

One problem with not checking for NULL and letting the application blow up is that if the pointer isn't actually dereferenced in that function, but rather stored for later use, then the dereferencing blow-up will be out of context and much harder to diagnose.

Any thoughts or best practices on this out there?

Edit 1: I forgot to mention that much of our code consists of libraries for 3rd party developers that may or may not know about our internal error handling policies. But the functions are still documented properly!

+1  A: 

If your application does logging, you could simply write something like functionX(): got NULL pointer to the log file, with a timestamp or other relevant context. As for the decision whether to let it blow up or to check and throw an error... Well, that really depends on what your application does, what your priorities are and so on. I myself would recommend a combination of checks/exception throwing and logs. Analyzing crash dumps seems a waste of time here.

PeterK
+10  A: 

My personal preference is to document the fact that the function cannot take a NULL pointer, and leave it at that. Note though that there is no guarantee that dereferencing a NULL pointer will cause a crash - if you insist on a diagnostic, throwing an exception is the only portable solution.

Having said that, I find that unwanted NULL pointers are very, very rare in my own code. If you have a lot of them, it suggests problems elsewhere, probably in the code's basic design.

anon
@Neil: I agree, our own code usually doesn't fail in our test processes, but much of our code consists of libraries for 3rd party developers, with everything that might come with that fact... (added an **Edit** with this addition to the question)
Johann Gerell
+1. Unwanted null pointers are usually the least frequent problems.
Alexandre C.
+1: for NULL Pointer creep == Design Error.
Luther Blissett
On `gcc` you can add an `__attribute__((nonnull(1, 3)))` to declare that parameters 1 and 3 can't be `NULL`. It allows static checking. In practice it's more of a documentation feature.
tristopia
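A small sketch of that attribute (a gcc/clang extension; the function itself is made up for illustration). With -Wnonnull, passing a literal NULL to a marked parameter is diagnosed at compile time; no runtime check is added:

```cpp
#include <cstring>
#include <cstddef>

// Arguments 1 and 2 are declared non-null; the compiler may warn on
// CopyName(buf, NULL) and may assume both pointers are valid in the body.
__attribute__((nonnull(1, 2)))
std::size_t CopyName(char* dst, const char* src)
{
    std::strcpy(dst, src);
    return std::strlen(dst);
}
```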
+5  A: 

Definitely throw a C++ exception - it can be logged and diagnosed much easier than your program blowing up later.

Consider the following situation: your program works for ten days and then hits that null pointer. If it blows up on the eleventh day, you will have to figure out that the problem relates to the null pointer from the day before. But if you throw an exception and its text is logged, you can just look into the log and start working from there. In other words, you will know for sure that the problem is that null pointer. Just think how different it is when the program is at the customer's site, not in your comfortable debug environment.

sharptooth
The 11 days is a red herring. If it blows up because of a NULL pointer dereference, you have to look at how that variable was filled; if that's not obvious, then something else is wrong in your program. That said, if the pointer is so important, then it is good practice to check it properly (with or without an exception) when it is filled.
tristopia
+2  A: 

If the function parameter is a pointer that should not be NULL, then shouldn't you change the parameter to a reference? This probably does not change the reporting when a NULL pointer is dereferenced to form the reference, but it would give static type checkers a better chance of spotting the error at compile time.

Mark
@Mark: Excellent point, in the C++ case.
Johann Gerell
References are just syntactic sugar; they don't protect against passing invalid pointers (such as the NULL pointer) and they make it more awkward to check against NULL...
R..
For certain problems you can design without pointers and thus never have NULLs, so there is no need for checks. Also, dereferencing a NULL to form the reference blows up automatically, so no extra work is needed. Agreed, it's not the solution in all cases, but it is still useful in many.
Mark
If you accept a reference, you technically can't check against NULL, since your program is already exhibiting undefined behavior, so the test might still pass. More importantly, it began doing so *before* entering your code. At some point, the user of a library has to take responsibility for their own testing.
Dennis Zickefoose
@R..: But accepting a reference instead of a pointer makes the burden of possible validations **inside** the function smaller. If a bad reference is passed, then there's nothing we **can** do about it.
Johann Gerell
R..
@R: won't the program already have blown up as you pass the pointer to the function?
Mark
@Mark: maybe, maybe not. That's the problem with undefined behavior. Compilers might not insert code to actually touch the memory address when creating the reference, and accessing a null pointer might not result in a crash even if they do. And in the extreme "UB can trigger the apocalypse" case, even if you successfully make it into the function, `if(!0)` could very well evaluate as `false`.
Dennis Zickefoose
Uhg. I meant to say "syntactic lemon juice"...
R..
+2  A: 

If the first thing you do is check that the parameter isn't NULL, then you should be using a reference.

void Foo(Bar *const pBar)
{
    if (pBar == NULL) { throw error(); }

    // Now do something with pBar
}

Change that to:

void Foo(Bar &bar)
{
    // Now use bar, safe in the knowledge it's not NULL.
}

Be aware though that you can still get a NULL reference:

Bar *const pBar = NULL;
Foo(*pBar); // Will crash *inside* of Foo, rather than at the point of dereference.

Which means you need to put your parameter checking further up.

Mark Ingram
@Mark: Excellent point, in the C++ case.
Johann Gerell
+1  A: 

I have also come to the conclusion that it is better to let the application crash when the rotten pointer is accessed than to check parameters in every function. At the interface level in public land, the API level where applications call libraries, there have to be checks and proper error handling (with diagnostics, eventually), but from there down no redundant checks are necessary. Look at the standard library: it uses exactly that philosophy. If you call strlen or memset or qsort with a NULL, you get a well-deserved crash. It makes sense that way, as in 99.9% of cases you will provide good pointers, and the check in that function would be redundant.

I even tend not to check allocations anymore; in most cases, if the allocation fails it's already too late to do anything about it, and a crash of the application is better than silently (or even noisily) generating crap results. The people who monitor production detect a crashed execution immediately and can diagnose or rerun the job without hassle, whereas another error message in the error logs is easily overlooked and can corrupt results for days.

On the project I share with a colleague who favors the opposite approach, there are fewer bugs in my code, as the functional code is not diluted in a sea of unnecessary checks. Of the last 10 bugs we had in production, 9 were in his part of the code; the one crash in my code was fixed in 10 minutes, because the stack trace showed directly in which function I had made a false assumption about the parameters.

tristopia
+2  A: 

However, I'm starting to think that maybe it's better to just let the application blow up immediately at the first NULL pointer dereference in that same function - then the crash dump file would reveal what happened - and let a thorough testing process find the bad function calls.

If you're ever considering this, then you should consider using assert() instead. That way the app is guaranteed to crash properly and generate the core dump.

Throwing an exception that isn't going to be caught would unwind the stack, and diagnosing the (internal) problem later might be more complicated - unless, of course, you were catching the exception inside every function, enriching the error message with the information needed to localize the problem later. But if you already handle exceptions to such an extent, then it makes no sense to treat the error as fatal.

For cases like this I personally prefer to use assert() accompanied by exhaustive unit tests. Normal high-level tests can rarely cover the code sufficiently. For release builds I disable assert()s (using -DNDEBUG), as in production, on such rare occasions, customers generally prefer a module which might be working with a flaw yet is working somehow, over one that crashes. (The sad reality of commercial software.)

N.B. If the code is performance sensitive, then assert() might be the better choice: you can disable asserts in release builds, but you can't remove exception handling.

To summarize: if NULL really, really can't happen, use assert(). If you already have exception handling in place, then treat the NULL as a plain internal error.

P.S. Another concept regarding invalid argument handling (rarely implementable, but I have seen some real examples in the past) is to refactor the code in a manner that excludes the possibility of an invalid argument ever occurring. Life-support / mission-critical / carrier-grade software employs such design tricks to decrease testing overhead.
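A minimal sketch of the assert() approach described above (the function name is illustrative). In a debug build a NULL argument aborts immediately and leaves a core dump to analyze; compiling with -DNDEBUG removes the check entirely:

```cpp
#include <cassert>

// The string in the condition shows up in the assertion failure message,
// documenting the precondition for anyone reading the core dump.
int SumFirstTwo(const int* values)
{
    assert(values != nullptr && "SumFirstTwo: values must not be NULL");
    return values[0] + values[1];
}
```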

Dummy00001
... working with a flaw will generate rubbish output. I have a feeling that the customer will prefer an obvious error over a hidden one, but that a developer too shy to admit his fault might prefer the latter.
xtofl
I disagree with the whole 'work with a flaw' part, but I agree with the `assert` approach: fail fast! Asserts are cheap in both developer time and runtime and help catch errors. +1
David Rodríguez - dribeas
Adding the assert also documents that you do not expect the pointer to be null.
doron
@xtofl: I wouldn't have believed that myself, yet it is sadly true for some markets. Just a week ago, a customer refused a fix for a memory leak (leading to minor data corruption) and requested a full explanation of side effects and a workaround instead. At the moment they do not have anybody on-site who can take responsibility for deploying the patch into a business-critical system - but they have admins who can restart the programs and clean up the rubbish from the DB. And that is not a unique occurrence.
Dummy00001
@Dummy00001: I often encounter 'fallback' code that returns e.g. 0.0 for a calculation if some argument is not usable. I agree, if you're able to recognize the rubbish. But that means you have to generate recognizable rubbish in the fail scenario, which boils down, in this case, to another null check and decent error handling code. The `assert` is certainly a great help in debug builds, but it doesn't make the code robust.
xtofl
@xtofl: "assert [...] doesn't make the code robust" - and I haven't claimed that. Purpose of asserts (and other similar techniques) is to allow one to diagnose the problems occuring at customer site faster. One might have a function which looks perfectly fine but once a month it throws an error. Adding an error message (now when you see that there might be a problem) is too late: wait another month for the problem to reoccur. A core dump from assert() is something one can start analyzing immediately.
Dummy00001
@deus-ex-machina399: "Adding the assert also documents that you do not expect the pointer to be null" - yes, but only for those with source code access, not at the API level.
Johann Gerell
+2  A: 

So the real question is: when writing a C++/C library, what issues should be considered when specifying an exception / error handling strategy - specifically, when choosing between throwing an exception (C++ library) or returning an error code (C library) versus letting it all blow up when a function's arguments do not follow the documented constraints?

Design by contract, anyone? I think this is part of what Neil Butterworth is describing. The specification for a function says that if the argument values satisfy specified constraints (preconditions), then the output will satisfy specified constraints (postconditions) and no specified invariants will be violated. If the user of your library function violates the documented preconditions, then it's open season for the 'undefined behaviour' so beloved of C++ people. This too must be documented. You may specify that the function will catch and handle 'invalid' argument values (C++ exceptions / error codes), but that raises the question of whether they are still invalid arguments: the function's response to this specified precondition then becomes a specified postcondition.
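A design-by-contract sketch of what this might look like (names and the contract itself are illustrative): the contract lives in the documentation comment, and asserts verify the precondition in debug builds only, so conforming callers pay nothing in release builds.

```cpp
#include <cassert>

// Precondition:  data != NULL and 0 <= index < size
// Postcondition: returns data[index]; no state is modified
// Violating the precondition is undefined behaviour per the contract.
int At(const int* data, int size, int index)
{
    assert(data != nullptr);
    assert(0 <= index && index < size);
    return data[index];
}
```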

This is all related to the publicly released code / binaries.

For debugging purposes you may wish to use asserts etc. to catch invalid arguments, but this is really a test that the client code conforms to the preconditions. The client code could be the library user's code, or it may be library code, or both. As the library writer, you will have to reason through the possible issues for the various scenarios / use cases with respect to specification, design and implementation.

Sam
+3  A: 

My preference is not to check for NULL pointers. Checking against NULL is merely checking against one of the billions of possible invalid argument values; the rest are undetectable and potentially much more dangerous. Unless your function's documentation states that NULL is a valid argument with special meaning, anyone calling it should assume all hell will break loose if they pass NULL.

If your function is very short and used in tight loops, checks against NULL can be a significant performance loss. One very real example is the standard C mbrtowc function used for decoding multibyte characters. If your application needs to do byte-by-byte or character-by-character decoding, a dedicated UTF-8 decoder can be 20% faster than the optimal UTF-8 mbrtowc implementation, simply because the latter is required to do a bunch of useless checks at the beginning, mainly NULL pointer checks, and it's called many, many times on very small data.

Even if most of your functions work on large data, it sets a bad precedent to tell the caller it's okay to pass NULL pointers. What if you later need functions that work on small fragments?

Unless you're coding for embedded systems, NULL is the least-dangerous invalid pointer to dereference, since it will immediately result in an operating-system-level exception (signal/etc.). If you don't want to rely on that, why not just use assert(ptr); so you can enable/disable the checks easily for debugging?

R..
Please note in the mbrtowc example I'm not talking about other inefficiencies that exist in real-world implementations. glibc's mbrtowc is about 20 times slower than an optimal dedicated decoder. I'm purely talking about the useless special cases required by the standard.
R..
+1 - It's always harder to debug the "valid looking" but nonetheless still invalid pointer...
bstpierre
+1  A: 

It's an easy call. Getting a NULL pointer when none is expected is a clear sign of a bug in the program or a severely compromised program state. Either way, the programmer is going to have to do something about it. Throwing an exception will only ever work out well if it isn't caught. If it is caught, you lose very important diagnostic information: it throws away the all-important call stack that helps you find out how the program got into that state. If it is not caught, then there's no appreciable difference between the C++ exception and the hardware exception.

The risk that the exception might be caught somewhere demands that you don't throw an exception and just let it die.

This consideration is very different for a runtime environment that can produce the call stack after the exception is caught. Common in managed environments, not in native C/C++.

Hans Passant
@Hans: Superb. So far, this answer summarizes my unstructured thoughts on this matter perfectly. This is the very essence of why I'm starting to think it's better to not throw and instead just die. The call stack mostly shows a good enough trace of where and why things became as bad as they did.
Johann Gerell
+1  A: 

Your edit makes it clear that you're describing a published API, so you have to think of your users.

Which is better when they are debugging their code:

  • A null pointer dereference with a stack trace 3 or more levels deep in your library code.
  • An exception from only one level into the library that says "you did this wrong".

I think the answer is pretty plain, especially when you consider that the bug reports from the first case will be a waste of your time.

Darron