Disclaimer: I am a layperson currently learning to program. I've never been part of a project, nor written anything longer than ~500 lines.

My question is: does defensive programming violate the Don't Repeat Yourself (DRY) principle? Assuming my definition of defensive programming is correct (having the called function validate its input, rather than relying on the caller to do it), wouldn't that be detrimental to your code?

For instance, is this bad:

void foo(int bar)
{
    if (bar != /*condition*/)
    {
        //code, assert, return, etc.
    }
}

int main()
{
    int input = 10;
    foo(input); //doesn't the extra logic
    foo(input); //and potentially extra calls
    foo(input); //work against you?
}

compared to this:

int main()
{
    int input = 10;
    if (input == /*condition*/)
    {
        foo(input);
        foo(input);
        foo(input);
    }
}

Again, as a layperson, I don't know how much simple logic checks like these count against you as far as performance goes, but surely defensive programming is not good for the program or the soul.

+3  A: 

Let me state first that blindly following a principle is idealistic and WRONG. You need to achieve what you want to achieve (say, the safety of your application), which is usually far more important than avoiding a DRY violation. Willful violations of principles are most often necessary in GOOD programming.

An example: I do double checks at important stages (e.g. a LoginService - first validate the input once before calling LoginService.Login, and then again inside), but sometimes I tend to remove the outer one again later, after I've made sure that everything works 100%, usually using unit tests. It depends.
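
A minimal C++ sketch of that double check (LoginService, Login, and TryLogin are hypothetical names here, just to show the outer/inner pair):

#include <stdexcept>
#include <string>

class LoginService {
public:
    bool Login(const std::string &user, const std::string &password)
    {
        // Inner (callee-side) check: the service defends itself
        // no matter who calls it.
        if (user.empty() || password.empty())
            throw std::invalid_argument("user and password must be non-empty");
        /* ... authenticate against the backing store ... */
        return true;
    }
};

bool TryLogin(LoginService &service,
              const std::string &user, const std::string &password)
{
    // Outer (caller-side) check: deliberately duplicated; it can be
    // removed later once unit tests prove this call site never passes
    // bad input.
    if (user.empty() || password.empty())
        return false;
    return service.Login(user, password);
}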

I'd never get worked up over double condition checks though. FORGETTING them entirely, on the other hand, is usually orders of magnitude worse :)

Alex
+6  A: 

It all comes down to the contract the interface provides. There are two different scenarios for this: inputs and outputs.

Inputs--and by that I basically mean parameters to functions--should be checked by the implementation as a general rule.

Outputs--being return results--should be basically trusted by the caller, at least in my opinion.

All of this is tempered by this question: what happens if one party breaks the contract? For example, let's say you had an interface:

class A {
  public:
    const char *get_stuff();
};

If that contract specifies that a null pointer will never be returned (it'll be an empty string at worst), then it's safe to do this:

A a = ...
char buf[1000];
strcpy(buf, a.get_stuff());

Why? Well, if you're wrong and the callee returns a null then the program will crash. That's actually OK. If some object violates its contract then generally speaking the result should be catastrophic.
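
If you'd rather have that failure be loud and located at the boundary, instead of wherever the null happens to get dereferenced, one option (the wrapper below is my illustrative sketch, building on the A class above) is to assert the documented guarantee:

#include <cassert>
#include <cstring>

// Illustrative wrapper: fail loudly at the point where the contract is
// supposed to hold, rather than deep inside strcpy.
void copy_stuff(char *buf, size_t buf_size, A &a)
{
    const char *s = a.get_stuff();
    assert(s != NULL && "A::get_stuff() broke its non-null contract");
    assert(strlen(s) < buf_size && "result too long for the caller's buffer");
    strcpy(buf, s);
}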

The risk you face in being overly defensive is that you write lots of unnecessary code (which can introduce more bugs) or that you might actually mask a serious problem by swallowing an exception that you really shouldn't.

Of course circumstances can change this.

cletus
+1 It all comes down to the contract the interface provides. Defensive programming != masking problems; defensive programming == exposing problems by throwing exceptions when contracts are violated.
TimW
+1  A: 

In your simplified example, yes, the second format is probably preferable.

However, that doesn't really apply to larger, more complex, and more realistic programs.

Because you never know in advance where or how "foo" will be used, you need to protect foo by validating the input. If the input is validated by the caller (e.g., "main" in your example), then "main" needs to know the validation rules and apply them.

In real-world programming, the input validation rules can be fairly complex. It is not appropriate to make every caller know all the validation rules and apply them properly. Some caller, somewhere, is going to forget the validation rules, or apply the wrong ones. So it's better to put the validation inside "foo", even if it will get called repeatedly. This shifts the burden from the caller to the callee, which frees the caller to think less about the details of "foo" and use it more as an abstract, reliable interface.

If you truly have a pattern where "foo" will get called multiple times with the same input, I suggest a wrapper function that does the validation once, and an unprotected version that side-steps the validation:

void UnprotectedFoo(int bar);  // forward declaration so RepeatFoo compiles

void RepeatFoo(int bar, int repeatCount)
{
   /* Validate bar once, up front */
   if (bar != /*condition*/)
   {
       //code, assert, return, etc.
       return;
   }

   for (int i = 0; i < repeatCount; ++i)
   {
       UnprotectedFoo(bar);
   }
}

void UnprotectedFoo(int bar)
{
    /* Note: no validation */

    /* do something with bar */
}

void Foo(int bar)
{
   /* Validate bar */
   /* either do the work, or call UnprotectedFoo */
}
abelenky
Better to expose BarIsValid() and let the caller write his own loop. Then you have a minimal API: BarIsValid() and Foo(), and won't need to extend the API when a user wants a more complicated loop. PS: I think RepeatFoo shouldn't validate bar when repeatCount is 0.
Paul Hankin
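
A minimal sketch of that suggestion (the concrete condition bar > 0 is a stand-in, not from the thread):

#include <cassert>

bool BarIsValid(int bar)
{
    return bar > 0;  // stand-in for the real precondition
}

void Foo(int bar)
{
    assert(BarIsValid(bar));  // the condition itself is still written only once
    /* do something with bar */
}

int main()
{
    int input = 10;
    if (BarIsValid(input))  // the caller composes whatever loop it wants
    {
        for (int i = 0; i < 3; ++i)
            Foo(input);
    }
}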
+9  A: 

Violating the DRY principle looks like this:

void foo(int bar)
{
    if (bar != /*condition*/)
    {
        //code, assert, return, etc.
    }
}

int main()
{
    int input = 10;
    if (input == /*condition*/)
    {
       foo(input);
       foo(input);
       foo(input);
    }
}

As you can see, the problem is that we have the same check twice in the program, so if the condition changes, we must modify it in two places, and chances are we'll forget one of them, causing strange behaviour. DRY doesn't mean "don't execute the same code twice", but "don't write the same code twice".
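
For contrast, a sketch of the DRY-compliant version (the concrete condition 42 is illustrative): the check is written exactly once, inside foo, and every caller inherits it:

void foo(int bar)
{
    if (bar != 42)  // illustrative condition, written in exactly one place
    {
        return;     // code, assert, return, etc.
    }
    /* do something with bar */
}

int main()
{
    int input = 10;
    foo(input);
    foo(input);
    foo(input);
}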

ammoQ
I think what Hooked wanted to ask about was not checking it twice, but checking it only in the main() function.
Artur Soler
A: 

Like Alex said, it depends on the situation. For example, I almost always validate input at every stage of the login process.

In other places, you don't need all that.

However, in your second example I'm assuming you have more than one input, because otherwise calling the same function three times with the same input is itself redundant. And with more than one input, caller-side checking means you'll have to write the condition three times. Now THAT is redundant.

If the input ALWAYS has to be checked, just include the check in the function.

Leo Jweda
+1  A: 

I think defensive programming gets kind of a bad rap, since it's associated with some genuinely undesirable things: wordy code and, more significantly, papering over errors.

Most folks seem to agree that a program should fail fast when it encounters an error, but that mission-critical systems should preferably never fail, and instead go to great lengths to keep going in the face of error states.

There's a problem with that statement, of course: how can a program, even a mission-critical one, continue when it's in an inconsistent state? Of course it can't, really.

What you want is for the program to take every reasonable step to do the right thing, even if there's something odd going on. At the same time, the program should complain, loudly, every time it encounters such an odd state. And in the event it encounters an unrecoverable error, it should usually avoid issuing an HLT instruction; rather, it should fail gracefully, shutting its systems down safely or activating a backup system if one is available.
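
A minimal sketch of that "complain loudly, keep going if you reasonably can" behaviour (the sensor, thresholds, and fallback value are hypothetical, not from the answer):

#include <cstdio>

static double last_good = 20.0;  // last reading that passed the sanity check

double ReadTemperatureSensor()
{
    return -999.0;  // stand-in for real hardware access; simulates a bad reading
}

double GetTemperature()
{
    double value = ReadTemperatureSensor();
    if (value < -50.0 || value > 150.0)
    {
        // Recoverable odd state: complain loudly, then fall back to the
        // last known-good value instead of halting.
        std::fprintf(stderr, "WARNING: implausible reading %.1f, using %.1f\n",
                     value, last_good);
        return last_good;
    }
    last_good = value;
    return value;
}

int main()
{
    std::printf("temperature: %.1f\n", GetTemperature());
}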

TokenMacGuy
I completely agree. I think defensive logic has to be put in its proper place to be useful, just like anything else. Using it everywhere (say, on encapsulated private members) seems like overkill to me and can really hamper the ability to write succinct and elegant code.
Chris Nicola