views:

270

answers:

7

This question is inspired by this question, which features the following code snippet.

int s;
if((s = foo()) == ERROR)
    print_error();

I find this style hard to read and prone to error (as the original question demonstrates -- it was prompted by missing parentheses around the assignment). I would instead write the following, which is actually shorter in terms of characters.

int s = foo();
if(s == ERROR)
    print_error();

This is not the first time I've seen this idiom though, and I'm guessing there are reasons (perhaps historical) for it being so often used. What are those reasons?

+14  A: 

I think it's for hysterical reasons, that early compilers were not so smart at optimizing. By putting it on one line as a single expression, it gives the compiler a hint that the same value fetched from foo() can be tested rather than specifically loading the value from s.

I prefer the clarity of your second example, with the assignment and test done later. A modern compiler will have no trouble optimizing this into registers, avoiding unnecessary loads from memory store.

mdma
+1 for "hysterical reasons" ;-)
Péter Török
+1 I'll me too that.
JeremyP
A lot of people thing that "the shorter my code is, the less code will be generated, and the faster it'll execute". Often, a better rule of thumb is that "the more readable my code is, the better the compiler will be able to understand it, and the faster it'll execute". But try convincing people of that one. ;)
jalf
"hysterical reasons" Either a typo, or something that is brilliantly witty.
msemack
+3  A: 

I make a conscious attempt at combining the two whenever possible. The "penalty" in size isn't enough to overcome the advantage in clarity, IMO.

The advantage in clarity comes from one fact: for a function like this, you should always think of calling the function and testing the return value as a single action that cannot be broken into two parts ("atomic", if you will). You should never call such a function without immediately testing its return value.

Separating the two (at all) leads to a much greater likelihood that you'll sometimes skip checking the return value completely. Other times, you'll accidentally insert some code between the call and the test of the return value that actually depends on that function having succeeded. If you always combine it all into a single statement, it (nearly) eliminates any possibility of falling into these traps.

Jerry Coffin
Interesting point of view. I agree that it is advantageous to think about the assignment/error check as an atomic unit. (Personally, I make the three lines a separate paragraph.)
avakar
Better yet, use exceptions.
John Dibling
@John: Exceptions aren't always suitable. For one obvious example, `while ((ch=getc())!=EOF)`.
Jerry Coffin
@Jerry: You could also say that return codes aren't always suitable. Nothing is *always* suitable.
John Dibling
@John: while it's true that nothing is always suitable, there are a *lot* of times that you need something like this, and an exception isn't suitable.
Jerry Coffin
@Jerry: When calling external code like your obvious example does, sure. The other side of that coin is there are a lot fewer times when this is needed if your own code is designed to consistently use exceptions instead of return codes, *and* this gives the added benefit of ensuring that your own error conditions are not so easily ignored by accident. But we don't need to get in to a debate about when/how to use exceptions here. And if you're arguing with me because you think I d/v'ed you, know that I didn't.
John Dibling
@John: I'm not really trying to argue at all, only to point out that while you're right that exceptions *can* help clean up code like this, they're not always the right answer. Pointing out that they can be helpful is fine, but the original comment of "better yet, use exceptions" needs qualification.
Jerry Coffin
A: 

I often prefer the first form. I couldn't say exactly why, but it has something to do with the semantic involved.

The second style feels to me more like 2 separate operations. Call the function and then do something with the result, 2 different things. In the first style it's one logical unit. Call the function, save the temprary result and eventually handle the error case.

I know it's pretty vague and far from being completely rational, so I will use one or the other depending on the importance of the saved variable or the test case.

tristopia
It is two separate operations. An assignment and a comparison. Shouldn't it feel that way?
John Dibling
Yes, but the importance of the operation is depending.If the assignment is only there, to hold a temporary (for cleaning up for example), it's not an essential part of the algorithm. If the purpose of the operation is this assignment, then I will write it in separate statements. It's the relative importance that makes the difference.
tristopia
+4  A: 

I would always go for the second. It's easier to read, there's no danger of omitting the parentheses around the assignment and it is easier to step through with a debugger.

JeremyP
+1 for ease of debugging in the second example.
Paul Stephenson
+10  A: 
Alok
Thank you for answering. I personally would write the loop as `for(;;){int c = getchar(); if (c == EOF) break; /*stuff*/}`, which has no redundancy and still maintains one operation per line.
avakar
@avakar: Good point. But, I tend to avoid "infinite" loops with `break` in them unless necessary. I am not sure of this about the `for` loop you presented. Of course, you think it is clearer, or you wouldn't write your loop that way :-). So, `while ((c = getchar) != EOF)` is still the best form for me.
Alok
@Alok, basically what it comes down to is that we need to do something (pre-statements), check the loop condition, then do something more (post-statements) -- and we have no corresponding loop construct for that. Thanks to the nature of C and C++ grammar, you can put the pre-statements into the condition itself, but you'll find it less readable as you increase their complexity. The `break` approach avoids this, but of course there are opponents of breaks (and other goto-like constructs). Indeed it is a very subjective matter :)
avakar
+1 - these loops tend to occur less often than if statements. so, even though I prefer the 2nd form for `if`, I've often used this idiom for `while` to avoid duplicate expressions.
mdma
@avakar: I agree - style is a personal preference, and there are good arguments for both the styles in this case. As I said, I *tend* to use the first form, because I am more comfortable with it, and because in loops it seems to be the most natural construct to me. But if the first form results in something that's "too complicated" (for my tastes obviously), then I use the second form. So, my rule is to make sure I like it and it is easy for me to understand, preferring the first form in case of a tie! :-)
Alok
+1  A: 

I often find the separation of the assignment out into a different line makes debugger watch or "locals" windows behave better vis-a-vis the presence and correct value of "s", at least in non-optimized builds.

It also allows the use of step-over separately on the assignment and test lines (again, in non-optimized builds), which can be helpful if you don't want to go mucking around in disassembly or mixed view.

YMMV per compiler and debugger and for optimized builds, of course.

leander
+1  A: 

I personally prefer for assignments and tests to be on different lines. It is less syntactically complicated, less error prone, and more easily understood. It also allows the compiler to give you more precise error/warning locations and makes often makes debugging easier.

It also allows me to more easily do things like:

int rc = function();

DEBUG_PRINT(rc);

if (rc == ERROR) {
    recover_from_error();
} else {
    keep_on_going(rc);
}

I prefer this style so much that in the case of loops I would rather:

while (1) {
    int rc = function();
    if (rc == ERROR) {
        break;
    }
    keep_on_going(rc);
}

than do the assignment in the while conditional. I really don't like for my tests to have side-effects.

nategoose
I'd rather have my teeth extracted with a chisel than have a loop exit condition in the loop block. But that leaves me with a dilemma in that I either have to run the bit of code that generates the loop exit twice (once before the loop and once at the end) or put an assignment in the loop condition. I guess your example is OK because the exit condition is at least directly below the while.
JeremyP
`int rc;``goto in_the_loop;``do {`` keep_on_going(rc);`` rc = function();``} while (rc);`
nategoose