I just got my copy of Code Complete by Steve McConnell, and there's one area I'm a bit confused about. On page 51, he says:

Robustness is the ability of a system to continue to run after it detects an error. Often an architecture specifies a more robust system than that specified by the requirements. One reason is that a system composed of many parts that are minimally robust might be less robust than is required overall. In software, the chain isn't as strong as its weakest link; it's as weak as all the weak links multiplied together. The architecture should clearly indicate whether programmers should err on the side of overengineering or on the side of doing the simplest thing that works.

(note that the text above should be covered under fair use and thus not break any copyrights)

I'm a bit confused as to what McConnell means here as he doesn't ever elaborate on the subject (as far as I can tell). Is he trying to say that overengineering is good in the context of handling errors or is he saying something else?

Steve McConnell (2004). Code Complete. Redmond: Microsoft Press.

+4  A: 

If you're writing a batch file to clean up temp files after a build, you can probably skimp on error handling.

If you're writing software to be burned into ROM on a satellite, you should go nuts with the error-handling.

If you're writing a library that will be used for years to come by programmers who won't be willing or able to contact you when something blows up, then you should probably aim for a happy medium - don't try to work around every possible way the client software could break you, but at least work to provide a level of safety in invalid but plausible usage scenarios.
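That "happy medium" for a library might look like validating plausible misuse at the public boundary with clear errors, without trying to survive every conceivable abuse. A minimal sketch (the function name and checks are hypothetical, not from the answer):

```python
def parse_port(value):
    """Parse a TCP port from caller-supplied input.

    Guards against invalid-but-plausible usage (numeric strings,
    out-of-range values) with descriptive errors, rather than either
    crashing obscurely or silently accepting garbage.
    """
    try:
        port = int(value)  # accept "8080" as well as 8080
    except (TypeError, ValueError):
        raise ValueError(f"port must be an integer, got {value!r}")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range 1-65535: {port}")
    return port
```

The point is that the library fails loudly and early for the mistakes a reasonable client might actually make, instead of propagating a confusing failure years later to a programmer who can't contact the author.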

Shog9
+4  A: 

What he's saying is very simple: one part that fails 1 time in a million, when put together with another part that fails 1 time in a million, will not fail 1 time in a million. They will fail more frequently (how much more frequently is dependent on their codependency, but at the least, it's going to be 2 times in a million). So the overall reliability of the code depends on the underlying code being MORE dependable than the overall failure rate.
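The arithmetic behind this is easy to check: if each part fails independently with probability p, the probability that at least one of them fails is 1 - (1 - p)^n, which for two one-in-a-million parts is just under two in a million, and it keeps compounding as parts are added. A quick sketch:

```python
p = 1e-6  # each component's independent failure probability

# P(at least one of two independent components fails)
combined = 1 - (1 - p) ** 2
print(combined)  # just under "2 times in a million"

# With many minimally robust parts, the weak links compound:
n = 100
print(1 - (1 - p) ** n)  # roughly 100x worse than a single part
```

This assumes the failures are independent; correlated failures change the number, which is what the comments below debate.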

"For want of a nail the shoe was lost. For want of a shoe the horse was lost. For want of a horse the rider was lost. For want of a rider the battle was lost. For want of a battle the kingdom was lost. And all for the want of a horseshoe nail."

McWafflestix
Actually it should be at least 1.999999 in a million if we assume the failures of the two components are independent.
blizpasta
And if we assume they may be codependent, perhaps in combination the failure rate is 0, because one fixes the other's errors. But this has NEVER HAPPENED ;-)
Steve Jessop
+1  A: 

I think the issue is whether a component should be pessimistic or optimistic about handling errors from the components it relies on. Over-engineering in this context would be to have both the calling and the called components handle errors. In the extreme case, each method could wrap its code in a try/catch block -- just in case a component it calls happens to throw an exception -- rather than letting exceptions percolate up to the appropriate layer. That would be pessimistic usage. Optimistic usage would be to expect that methods succeed and not bother handling potential exceptions, which essentially defers all exception handling up a level in the application. A middle ground would be to handle, at each level, those errors that can reasonably be handled there, and punt the errors that can't up to a higher level.
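The contrast can be sketched as follows (the function names and file-loading scenario are hypothetical illustrations, not from the answer). The pessimistic version defensively wraps everything even when it can't recover; the middle-ground version handles the one failure it has a sensible answer for and lets the rest percolate up:

```python
# Pessimistic: wrap every call in a blanket try/catch, even though this
# layer has no meaningful way to recover.
def load_config_pessimistic(path):
    try:
        with open(path) as f:
            return f.read()
    except Exception:
        return None  # swallows the real error; the caller can't tell why

# Middle ground: handle the one failure this layer can reasonably deal
# with (a missing file has a sensible default) and let unexpected errors
# (permissions, encoding, ...) percolate up to a layer that can act.
def load_config(path, default=""):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        return default
```

The pessimistic version hides information; the middle-ground version makes an explicit decision about which errors belong to this layer.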

tvanfosson
+2  A: 

It is the software architect's challenge to figure out the ideal level of engineering for each component of a system.

As I read it, McConnell is saying that the authors of the architecture need to be sure to clearly specify the areas, if any, of the specific system that need to be "industrial strength" and which do not.

Guy Starbuck
-1: In McConnell's context, `overengineering` may not only be desirable, but *required*. Never speak in absolutes. ;-)
Ken Gentle
Good point, although I think the term "overengineering" implies that something is overly architected, you are right about McConnell's context in the passage. I have edited my response to remove the "absolute" statement.
Guy Starbuck
+3  A: 

I think part of the confusion is the use of the term "overengineering", which implies that more was done than was necessary. In the context of the quote, the term denotes an upper limit of over-cautiousness; in effect, it says the specification should tell the programmers how paranoid/cautious/defensive they should be.

Steven A. Lowe
+3  A: 

He's saying sometimes you should "err on the side of over-engineering".

This doesn't mean over-engineering itself is good. By definition, it isn't. He means that sometimes it's worth risking over-engineering in order to avoid erring in the opposite direction ("not engineered enough", I suppose). Other times, it's better to err in the direction of too simple, and then correct it when your tests fail.

At least, that's what I understand by "err on the side of X".

You'd hope that he'd then go on to explain in more detail how to tell which one is the case, so as to provide the documentation he advocates. But it's a start.

If you had your way, you'd always do the exact right amount of engineering. But it's not possible to determine how much that is, so you take a chance. It's still somewhat bad if you miss; it's just less bad than missing by more. Or hitting yourself in the foot.

Steve Jessop