The correct answer is that, from a formal point of view, the code is labelled "undefined behaviour". This has nothing to do with logic or Superior Laws. The standard simply tries to define what can be done and what can't, what produces "good" code and what produces "bad" code, what is guaranteed to work and what is not; and between the standard and what actually happens there are people who have to interpret it, in order to write a compiler, to write reliable code, or to judge someone else's code.
So, from this point of view, there's nothing to be understood: the standard defines the code as "undefined behaviour" (indirectly, of course: your code does not appear as an example in the standard's papers!). You learn the standard, apply it blindly to analyse your code, and say whether it is "good" or not. "Formally" and "blindly" are often two dangerous words, so let's take a look at your code on our own. Since, after all, for languages that have one, it is the standard that defines the language and not vice versa (even though common practice or an implementation can trigger a modification of the standard), let's start from code that is considered good.
int x = 1;
printf("%d %d", x+1, x+1);
The compiler does not need to know anything about printf, even though extensions allow one to mark a function as "printf-like" in order to add extra checking... but these are compiler tricks to help programmers, not mandatory features. Moreover, printf is described in the standard too (in the library part, not the language part), and this is why I added the %d: the code of printf, using the stdarg variable-argument feature (another thing described by the standard), scans the first argument in order to use the others properly.
So, if the descriptor (%d) and the argument (x+1) do not match, we are doing something wrong; the standard's description of fprintf does in fact label such a mismatch "undefined behaviour". Either way, we can be sure strange things may happen. (A compiler may check for it, but the feature is not mandatory; more likely it is an "extension" provided by many if not all compilers.)
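For instance (a made-up illustration, not code from the question), a mismatch could look like the line below; GCC and Clang can warn about it with -Wformat, but such a check is not required by the language:

int x = 1;
printf("%s", x+1);   /* "%s" expects a char*, but an int is passed: mismatch */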
Now let's take a look at printf. In order to pick its arguments, it uses va_start, va_arg and va_end. Let's take into account only va_arg (which is a macro, by the way): it is the way you pick the next argument and cast it to the datum you expect. The cast is driven by the "%d", "%s" and so on, and the cast also determines the size of the argument, so that the macro can access the next one.
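To make this concrete, here is a minimal sketch of a printf-like function walking its arguments with the stdarg macros; print_ints is a hypothetical name, and it handles only "%d", just to show the mechanism:

#include <stdarg.h>
#include <stdio.h>

void print_ints(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);               /* start right after the last named argument */
    for (const char *p = fmt; *p; p++) {
        if (p[0] == '%' && p[1] == 'd') {
            int v = va_arg(ap, int); /* the named type drives both the cast and the step */
            printf("%d", v);
            p++;                     /* skip the 'd' */
        } else {
            putchar(*p);
        }
    }
    va_end(ap);
}

Called as print_ints("%d %d", x+1, x+1), it behaves like the good code above.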
Knowing this, you can easily imagine that if things do not match, odd things may happen.
Now, what if we call va_arg once more than the actual number of arguments passed? Of course, we go and pick a datum from the stack (or wherever) that does not "match" anything real or usable.
What happens if we pick one fewer? Nothing special! It is perfectly doable and "legal". Of course, it could be a clue that you're doing something wrong, but it could also be intentional (in general, that is; in the particular case of printf it would merely be odd)... Nonetheless, you can do it; again, compilers may treat printf and similar functions in a special way, checking whether each "%" has its matching argument(s) and warning if more (a dangerous case) or fewer (the harmless case) arguments are passed.
So
int x = 1;
printf("%d", x+1, x+1);
is still good.
Now let's talk briefly about side effects. Considered as "atomic" expressions, both occurrences of x+1 have no side effects: x is unchanged. Someone in the comments said that an implementation could indeed modify x in memory. This is ok if and only if the modification is reverted, all "atomically", i.e. the modification can't be seen outside the expression. If it could be, even the perfectly legal code I've written so far could fail.
Now, let's modify the code to turn it into the bad "undefined behaviour" case.
int x = 1;
printf("%d", ++x, x+1);
Why is this code "undefined behaviour"? It is so because the standard says so. There is no special reason given, but the fact must be a consequence of something, and currently the only reason I can see is the order of evaluation. No matter what other people say here, the explanations so far cite the standard's "rule" but not the reason behind the rule; the order-of-evaluation argument fits "unspecified behaviour" just as well, so the existence of this particular undefined behaviour still looks unjustified to me (this is also why, in this answer, it will seem that I am confusing "undefined" with "unspecified").
Moreover, in a lazy attempt to find a quick definition of "sequence point", I landed on Wikipedia, where note 3 cites the same piece of the standard quoted in another answer here, but with the observation that "Accessing the value of j inside f therefore invokes undefined behavior". I interpret this to mean that printf("%d", inc_x(&x), x_plus_1(x)) is undefined behaviour too, and not only unspecified behaviour, as stated in that answer.
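For clarity, the helpers in that expression could be defined like this (hypothetical definitions matching only the names used above; the point is that the side effect on x happens inside a function call):

int inc_x(int *px)  { return ++*px; }   /* increments the caller's x and returns it */
int x_plus_1(int x) { return x + 1; }   /* no side effect at all */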
If the standard dictated that expressions must be evaluated in order, "from left to right" in our example, we would be able to predict the passed arguments. First, ++x would be evaluated, so the second argument pushed would be 2 and x would now be 2; then x+1 would be evaluated as 2+1, i.e. 3, and the third argument would be 3.
Given an order of evaluation, we can predict the real values of the arguments.
It is worth noting that many languages do not allow assignments inside expressions (++x can be read as x = x + 1). If we were forced to write x++; printf("%d %d", x, x+1) instead, as sketched below, there would be no problem...
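Here is that rewrite spelled out (same variables as above); the statement boundary is a sequence point, so the increment is complete before the arguments of printf are evaluated:

int x = 1;
x++;                      /* x is 2 before the call is even considered */
printf("%d %d", x, x+1);  /* always prints "2 3" */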
Anyway... if the order of evaluation is not forced upon the implementation, each implementation can pick one; even the same implementation, depending on optimizations or for whatever other reason, could use different orders of evaluation in the same code.
So we have two possibilities, as another user already wrote: (2, 2) (first x+1, then ++x) or (2, 3) (first ++x, then x+1). Other options are not possible, since they would break even "legal" code.
But your printf ignores the third value (and vararg functions can do that legally), so it happens that the output is predictable, and it is always 2.
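Put together as a complete program (still undefined behaviour according to the standard, so what follows is an observation about typical implementations, not a guarantee):

#include <stdio.h>

int main(void)
{
    int x = 1;
    printf("%d\n", ++x, x+1);  /* the third argument is simply never fetched */
    return 0;
}

On every implementation I would call non-pathological, this prints 2.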
Is there any reason why one could say that some "special" implementation could output a value different from 2? Theoretically, could an evil programmer write a compiler that purposely gives "random outputs" whenever UB, as defined by the standard, is met?
The short answer is no, since by doing so the compiler could break its ability to compile some "legal" code correctly (the existence of one such piece of code suffices); demonstrating this formally would be long, and I am not sure I am able to. But it is not important.
What I think shouldn't be forgotten is the fact that behind UBs, or whatever, there are reasons that should sound "logical" within the "system". Saying a priori that a complex expression like printf("%d", ++x, x+1) must be UB is illogical; it must instead be the consequence of a "relaxed condition", or of a basic "axiom" we can "backtrack" to.
The complex description provided in the standard, quoted by someone else here,
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored
does not explain much about those reasons, perhaps not even if, to be pedantic, we also add the standard's definitions of "sequence point", "stored value", "evaluation" and "expression".
If I consider printf("%d", ++x, x+1) as an expression, with ++x and x+1 delimited by sequence points among others, then the stored value of the object x is modified only once: the whole "expression" modifies the stored value of x only once. So, in this "interpretation" of the ungiven definitions, what makes the code bad should be the second part, from "furthermore" onwards...
To teachers, students could answer simply: "according to the current standard it is formally undefined behaviour; but implementations not purposely built to exploit this particular UB should output 2, because..." (and here you can add the explanation I've given before).