views:

642

answers:

14

Possible Duplicate:
Defensive programming

We had a great discussion this morning about the subject of defensive programming. We had a code review where a pointer was passed in without being checked for validity.

Some people felt that only a check for a null pointer was needed. I questioned whether it could be checked at a higher level rather than in every method it is passed through, and noted that checking for null is a very limited check if the object at the other end of the pointer does not meet certain requirements.

I understand and agree that a check for null is better than nothing, but it feels to me that checking only for null provides a false sense of security, since it is so limited in scope. If you want to ensure that the pointer is usable, check for more than null.

What are your experiences on the subject? How do you write defenses into your code for parameters that are passed to subordinate methods?

+3  A: 

In all seriousness, it depends on how many bugs you'd like to have inflicted upon you.

Checking for a null pointer is definitely something that I would consider necessary but not sufficient. There are plenty of other solid principles you can use starting with entry points of your code (e.g., input validation = does that pointer point to something useful) and exit points (e.g., you thought the pointer pointed to something useful but it happened to cause your code to throw an exception).

In short, if you assume that everyone calling your code is going to do their best to ruin your life, you'll probably find a lot of the worst culprits.

EDIT for clarity: some other answers are talking about unit tests. I firmly believe that test code is sometimes more valuable than the code that it's testing (depending on who's measuring the value). That said, I also think that unit tests are necessary but not sufficient for defensive coding.

Concrete example: consider a 3rd party search method that is documented to return a collection of values that match your request. Unfortunately, what wasn't clear in the documentation for that method is that the original developer decided that it would be better to return a null rather than an empty collection if nothing matched your request.

So now, you call your defensive and well unit-tested method (which is sadly lacking an internal null pointer check) and boom! A NullPointerException that, without an internal check, you have no way of dealing with:

defensiveMethod(thirdPartySearch("Nothing matches me")); 
// You just passed a null to your own code.
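A minimal sketch of the internal check described above, in C++ terms (the stub for the third-party search is hypothetical, invented only to reproduce the null-on-no-match behavior):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the third-party search: it returns null rather
// than an empty collection when nothing matches, as in the example above.
const std::vector<std::string>* thirdPartySearch(const std::string& query) {
    static const std::vector<std::string> hits{"match"};
    return (query == "match") ? &hits : nullptr;
}

// The defensive version: check the pointer at the entry point, so the failure
// is a diagnosable error instead of a crash deep inside the method.
std::size_t defensiveMethod(const std::vector<std::string>* results) {
    if (results == nullptr) {
        throw std::invalid_argument("defensiveMethod: results must not be null");
    }
    return results->size();
}
```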
Bob Cross
+2  A: 

"Unit tests verifying the code does what it should do" > "production code trying to verify it's not doing what it's not supposed to do".

I wouldn't even check for null myself, unless it's part of a published API.

Frank Schwieterman
+1  A: 

It very much depends; is the method in question ever called by code external to your group, or is it an internal method?

For internal methods, you can test enough to make this a moot point, and if you're building code where the goal is highest possible performance, you might not want to spend the time on checking inputs you're pretty darn sure are right.

For externally visible methods - if you have any - you should always double check your inputs. Always.

Dean J
A: 

For internal methods, we usually stick to asserts for these kinds of checks. That does get errors picked up in unit tests (you have good test coverage, right?) or at least in integration tests that are running with assertions on.
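A sketch of that convention with a hypothetical internal helper — the asserts fire in unit and integration tests run with assertions on, and compile away under NDEBUG:

```cpp
#include <cassert>
#include <cstddef>

// Internal method: callers within the module are trusted, so the contract is
// documented with asserts rather than production error handling.
int sumFirstN(const int* values, std::size_t count, std::size_t n) {
    assert(values != nullptr && "values must not be null");
    assert(n <= count && "n must not exceed count");
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```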

xcut
+14  A: 

In Code Complete 2, in the chapter on error handling, I was introduced to the idea of barricades. In essence, a barricade is code which rigorously validates all input coming into it. Code inside the barricade can assume that any invalid input has already been dealt with, and that the inputs that are received are good. Inside the barricade, code only needs to worry about invalid data passed to it by other code within the barricade. Asserting conditions and judicious unit testing can increase your confidence in the barricaded code. In this way, you program very defensively at the barricade, but less so inside the barricade. Another way to think about it is that at the barricade, you always handle errors correctly, and inside the barricade you merely assert conditions in your debug build.
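A minimal sketch of the barricade idea (the function names are illustrative, not from the book): the entry point rigorously validates and handles errors; the code inside merely asserts.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Inside the barricade: input is assumed already validated, so a debug-build
// assert is enough.
double toFraction(int percent) {
    assert(percent >= 0 && percent <= 100);
    return percent / 100.0;
}

// At the barricade: untrusted input is validated and errors handled for real.
double parsePercentage(const std::string& text) {
    if (text.empty()) throw std::invalid_argument("empty input");
    int value = std::stoi(text);  // throws std::invalid_argument on garbage
    if (value < 0 || value > 100) throw std::out_of_range("not a percentage");
    return toFraction(value);
}
```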

As far as using raw pointers goes, usually the best you can do is assert that the pointer is not null. If you know what is supposed to be in that memory, you could check that the contents are consistent in some way. This raises the question of why that memory is not wrapped up in an object which can verify its own consistency.

So, why are you using a raw pointer in this case? Would it be better to use a reference or a smart pointer? Does the pointer contain numeric data, and if so, would it be better to wrap it up in an object which managed the lifecycle of that pointer?

Answering these questions can help you find a way to be more defensive, in that you'll end up with a design that is easier to defend.

mch
Don't forget that you should be extremely defensive around things outside of your control, i.e. "Do I expect I will run out of memory?" or "Will I run out of threads?" Other than that, +1.
wheaties
I agree with @wheaties, but I'd like to add that "outside your control" can also mean the client code if you're developing an API. Anything exposed through the API should have tons of defensive code around it (perhaps only in debug builds though, for performance).
rmeador
Smart pointers are great and all... but they do nothing to prevent null pointer errors. You can have a null-valued smart pointer just as easily as a null-valued raw pointer. Easier, in fact, since the raw pointer might be uninitialized instead of null.
Alan
A: 

Checking for a null pointer is only half of the story; you should also assign a null value to every unassigned pointer.
Most responsible APIs will do the same.
Checking for a null pointer costs very little in CPU cycles; an application crashing once it's delivered can cost you and your company money and reputation.

You can skip null pointer checks if the code is in a private interface you have complete control of, and/or if you check for null in a unit test or some debug-build check (e.g. an assert).

Alon
If the null pointer input is due to a programmer error elsewhere, then finding it does not prevent the application from crashing and costing you money and reputation. If the caller can pass you an input which your API does not permit, then they can ignore the error return value and end up crashing anyway. All it does is make it quicker and easier to diagnose errors when they do occur (mitigation).
Steve Jessop
It goes without saying that finding the problem must be followed by some corrective action. I was pointing out that finding (and correcting) an error becomes more expensive as you approach the release date, and can sky-rocket once you go past it.
Alon
+4  A: 

There is no way to check if a pointer is valid.

Nemanja Trifunovic
Depending on your scenario, you may very well be able to "check if a pointer is valid". If you created the object, you can register it in a table of "known good pointers". You can then check if the pointer is in the table before using it. If you didn't create it, then your answer is correct.
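A sketch of such a registry, under the stated caveat that it only covers pointers you handed out yourself (the class name is made up):

```cpp
#include <set>

// Table of "known good" pointers: objects you create get registered, and any
// pointer can be checked for membership before use. A dangling pointer to a
// destroyed object is caught only if it was removed on destruction.
class PointerRegistry {
public:
    void add(const void* p) { known_.insert(p); }
    void remove(const void* p) { known_.erase(p); }
    bool isKnown(const void* p) const { return known_.count(p) != 0; }
private:
    std::set<const void*> known_;
};
```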
Merlyn Morgan-Graham
+13  A: 

The best way to be defensive is not to check pointers for null at runtime, but to avoid using pointers that may be null to begin with.

If the object being passed in must not be null, use a reference! Or pass it by value! Or use a smart pointer of some sort.

The best way to do defensive programming is to catch your errors at compile-time. If it is considered an error for an object to be null or point to garbage, then you should make those things compile errors.

Ultimately, you have no way of knowing if a pointer points to a valid object. So rather than checking for one specific corner case (which is far less common than the really dangerous ones, pointers pointing to invalid objects), make the error impossible by using a data type that guarantees validity.

I can't think of another mainstream language that allows you to catch as many errors at compile time as C++ does. Use that capability.
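A sketch of pushing that check to compile time (illustrative names): the reference-taking version cannot even be called with a null pointer.

```cpp
#include <cstddef>
#include <string>

// Pointer version: "may be null" is part of the contract, so the null case
// has to be handled at runtime, in every caller or in the callee.
std::size_t lengthOrZero(const std::string* s) {
    return s ? s->size() : 0;
}

// Reference version: "must refer to an object" is enforced by the compiler.
// lengthOf(nullptr) simply does not compile.
std::size_t lengthOf(const std::string& s) {
    return s.size();
}
```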

jalf
Haskell/Eiffel are quite good in this regard, don't know if they can be considered mainstream though :x
Matthieu M.
Yeah, I thought of Haskell too when I wrote it (I don't know Eiffel though), but I didn't really consider it mainstream, in the sense that it's pretty common to come across an application written in C++, but I'd be surprised to see a program used "in the wild" that was written in Haskell. They exist, no doubt about that; they're just not very common.
jalf
A: 

There are a few things at work here in this question which I would like to address:

  1. Coding guidelines should specify that you either deal with a reference or a value directly instead of using pointers. By definition, pointers are value types that just hold an address in memory; the validity of a pointer is platform-specific and can mean many things (range of addressable memory, platform conventions, etc.)
  2. If you find yourself ever needing a pointer for any reason (like for dynamically generated and polymorphic objects) consider using smart pointers. Smart pointers give you many advantages with the semantics of "normal" pointers.
  3. If a type for instance has an "invalid" state then the type itself should provide for this. More specifically, you can implement the NullObject pattern that specifies how an "ill-defined" or "un-initialized" object behaves (maybe by throwing exceptions or by providing no-op member functions).

You can create a smart pointer that does the NullObject default that looks like this:

template <class Type, class NullTypeDefault>
struct possibly_null_ptr {
  possibly_null_ptr() : p(new NullTypeDefault) {}
  possibly_null_ptr(Type* p_) : p(p_) {}
  Type * operator->() { return p.get(); }
  ~possibly_null_ptr() {}
  private:
    std::shared_ptr<Type> p;  // requires #include <memory>
    template <class T, class N>
    friend T & operator*(possibly_null_ptr<T,N>&);
};

template <class Type, class NullTypeDefault>
Type & operator*(possibly_null_ptr<Type,NullTypeDefault> & p) {
  return *p.p;
}

Then use the possibly_null_ptr<> template in cases where you support possibly-null pointers to types that have a default derived "null behavior". This makes it explicit in the design that there is an acceptable behavior for "null objects", and it makes your defensive practice documented in the code -- and more concrete than a general guideline or practice.

Dean Michael
That is in essence the `boost::optional<T>` that Pavel mentioned in the comments of the question.
Matthieu M.
But `optional<T>` doesn't work "better" than a pointer because you still need to check whether it is set (just like a normal pointer). What I proposed was making the "Null Object" behavior part of the design and using the `possibly_null_ptr<>` just like you would a normal pointer, only with default "null" characteristics.
Dean Michael
+1  A: 

From a debugging point of view, it is most important that your code fails fast. The earlier the code fails, the easier it is to find the point of failure.

Martin
+2  A: 

I'm a big fan of the "let it crash" school of design. (Disclaimer: I don't work on medical equipment, avionics, or nuclear power-related software.) If your program blows up, you fire up the debugger and figure out why. In contrast, if your program keeps running after illegal parameters have been detected, by the time it crashes you'll probably have no idea what went wrong.

Good code consists of many small functions/methods, and adding a dozen lines of parameter-checking to every one of those snippets of code makes it harder to read and harder to maintain. Keep it simple.

Kristopher Johnson
I actually disagree; by the time your code crashes, it may take a while to diagnose which method actually introduced the buggy state, especially when it's passed around a lot before someone actually tries to use it... I prefer the fail-fast principle for debug builds, which requires testing... however, I tend to test the minimum needed for what I intend to do with something.
Matthieu M.
+2  A: 

I may be a bit extreme, but I don't like Defensive Programming, I think it's laziness that has introduced the principle.

For this particular example, there is no sense in asserting that the pointer is not null. If you want a non-null pointer, there is no better way to actually enforce it (and document it clearly at the same time) than to use a reference instead. And it's documentation that will actually be enforced by the compiler and costs zilch at runtime!!

In general, I tend not to use 'raw' types directly. Let's illustrate:

void myFunction(std::string const& foo, std::string const& bar);

What are the possible values of foo and bar? Well, that's pretty much limited only by what a std::string may contain... which is pretty vague.

On the other hand:

void myFunction(Foo const& foo, Bar const& bar);

is much better!

  • if people mistakenly reverse the order of the arguments, it's detected by the compiler
  • each class is solely responsible for checking that its value is right; the users are not burdened.

I have a tendency to favor Strong Typing. If I have an entry that should be composed only of alphabetical characters and be up to 12 characters, I'd rather create a small class wrapping a std::string, with a simple validate method used internally to check assignments, and pass that class around instead. This way I know that if I test the validation routine ONCE, I don't have to worry about all the paths through which that value can reach me: it will have been validated by the time it reaches me.
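A sketch of such a wrapper, assuming the hypothetical name `AlphaCode` and the stated rule (1-12 alphabetic characters, validated once at construction):

```cpp
#include <cctype>
#include <stdexcept>
#include <string>

// Strongly typed wrapper: the constructor validates once, so every function
// that receives an AlphaCode can trust its contents without re-checking.
class AlphaCode {
public:
    explicit AlphaCode(const std::string& value) : value_(value) {
        if (!isValid(value_)) {
            throw std::invalid_argument("AlphaCode: 1-12 alphabetic chars required");
        }
    }
    const std::string& str() const { return value_; }
private:
    static bool isValid(const std::string& v) {
        if (v.empty() || v.size() > 12) return false;
        for (unsigned char c : v) {
            if (!std::isalpha(c)) return false;
        }
        return true;
    }
    std::string value_;
};
```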

Of course, that doesn't mean that the code should not be tested. It's just that I favor strong encapsulation, and validation of an input is part of knowledge encapsulation, in my opinion.

And as no rule comes without an exception... an exposed interface is necessarily bloated with validation code, because you never know what might come at you. However, with self-validating objects in your BOM it's quite transparent in general.

Matthieu M.
A: 

Pointers should only be used if you need to do something with the pointer itself, such as pointer arithmetic to traverse some data structure. Even then, if possible, that should be encapsulated in a class.

If the pointer is passed into the function to do something with the object to which it points, then pass in a reference instead.

One method for defensive programming is to assert almost everything that you can. At the beginning of the project it is annoying but later it is a good adjunct to unit testing.

dlb
A: 

A number of answers address the question of how to write defenses in your code, but not much was said about how defensive you should be. That's something you have to evaluate based on the criticality of your software components.

We're doing flight software and the impacts of a software error range from a minor annoyance to loss of aircraft/crew. We categorize different pieces of software based on their potential adverse impacts which affects coding standards, testing, etc. You need to evaluate how your software will be used and the impacts of errors and set what level of defensiveness you want (and can afford). The DO-178B standard calls this "Design Assurance Level".

progrmr