views: 647
answers: 8
Possible Duplicates:
What’s the worst example of undefined behaviour actually possible?
What are all the common undefined behaviour that c++ programmer should know about?
What are the common undefined behaviours that Java Programmers should know about

There are lots of parts of the C and C++ standards that are not completely specified, because to do so might exclude some oddball architecture. Relying on the behavior of your particular compiler and CPU is generally frowned upon, for good reason. On the other hand some behaviors are so universal that you could easily get away with relying on them, even if it is technically undefined.

For example, initializing a pointer to 0xDEADBEEF could lead to bad results on a processor that traps on invalid pointer addresses. But I've never used such a processor, and I'll bet that you haven't either. I'm sure plenty of people have caught a bug or two with this technique, without any winged monkeys taking flight from their nostrils.
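
A minimal sketch of that debugging idiom, with illustrative names; forming such a pointer value is implementation-defined and dereferencing it is undefined, but on common flat-address-space platforms it faults reliably:

#include <cstdint>

// kPoisonPtr is an illustrative name, not a standard facility.
static void *const kPoisonPtr =
    reinterpret_cast<void *>(static_cast<std::uintptr_t>(0xDEADBEEF));

struct Node {
    Node *next;
};

void detach(Node *n)
{
    // A later dereference of n->next should crash loudly rather than
    // silently reuse stale data.
    n->next = static_cast<Node *>(kPoisonPtr);
}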

On the other hand, if you had old Mac code that depended on the endianness of the processor, you probably lived to regret that decision.

I'd love to see some real world examples of both good and bad reliance on unspecified behavior. It will be interesting to see which type gets the most votes.

There was a similar question with an entirely different emphasis: What’s the worst example of undefined behaviour actually possible? I'm asking for examples where you are relying on the undefined behavior.

+1  A: 

Identical string literals may share the same memory, and may or may not be writable; whether an edit changes one of them, both of them, or simply isn't allowed is undefined.
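
A minimal sketch of the hazard, assuming nothing about the implementation's literal pooling:

#include <cstring>

int main()
{
    const char *a = "abcde";
    const char *b = "abcde";    // the compiler may or may not pool these
    char *s = const_cast<char *>(std::strchr(a, 'c'));
    (void)s;
    // *s = '1';                // undefined: the literal may live in read-only memory
    return (a == b) ? 0 : 1;    // unspecified which result you get
}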

Martin Beckett
You have to go to some trouble to cast those to a non-const array, don't you? Has that ever happened to you?
Mark Ransom
They're not const-qualified in C (for compatibility with C programs written before `const` came along).
caf
In C++ string literals are arrays of `const char` but a string literal can be implicitly converted to a `char*` for compatibility with C.
Charles Bailey
@Mark, no trouble: `s = strchr("abcde", 'c'); *s = '1';`
Secure
+2  A: 

This type of thing is reasonably common:

#include <stdlib.h>

void free_and_null(void **p)
{
    free(*p);
    *p = NULL;
}

int *x = malloc(100);
free_and_null((void **)&x);  /* accesses an int * object through a
                                void * lvalue: a strict-aliasing violation */

...and it can, at least in theory, break these days on compilers that optimise assuming strict interpretation of aliasing rules.

The absolute most common one is probably the old trick of retrieving the value of a union member that wasn't the one most recently stored to.
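
A hedged sketch of that union trick; C++ calls reading the inactive member undefined, C99 permits the bytes to be reinterpreted, and most compilers in practice support it, which is why it is so common. The IEEE 754 layout noted in the comment is itself an assumption:

#include <cinttypes>
#include <cstdint>
#include <cstdio>

union FloatBits {
    float         f;
    std::uint32_t u;   // assumes sizeof(float) == sizeof(uint32_t)
};

int main()
{
    FloatBits fb;
    fb.f = 1.0f;                           // store through one member...
    std::printf("%08" PRIx32 "\n", fb.u);  // ...read through the other;
                                           // typically prints 3f800000
    return 0;
}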

caf
Case one for the macro solution!
Chris Lutz
I do not see the problem here. Could you elaborate?
Thomas
Assumes that sizeof(int*)==sizeof(void*). If sizeof(void*)>sizeof(int*) you have bigger issues.
MSalters
It also accesses an object of type `int *` through an lvalue of type `void *`. That's simply not guaranteed to work.
caf
A: 
Alok
The problem isn't EOF causing an overrun - when `fgetc()` returns EOF it doesn't go through the intermediate `unsigned char` conversion. The problem is if a char read from the stream, when converted to an `unsigned char`, has a value that can't be represented as an `int` - that unsigned char will be converted to an `int` when returned, resulting in an implementation-defined (not undefined) value, which I suppose could happen to match the EOF value.
Michael Burr
Hmm, makes much more sense than my understanding. Thanks!
Alok
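
For reference, a minimal sketch of the `fgetc()` idiom the comments above are about; the file name is illustrative:

#include <cstdio>

int main()
{
    std::FILE *f = std::fopen("data.bin", "rb");
    if (!f) return 1;
    int c;                              // int, not char: EOF must stay
                                        // distinguishable from data
    while ((c = std::fgetc(f)) != EOF) {
        std::putchar(c);
    }
    std::fclose(f);
    return 0;
}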
A: 

How many times have I seen code like this:

void write(int fileID, int data)
{
    write(fileID, &data, 4); // If they are on the ball, 4 is sizeof(int)
}

int read(int fileID)
{
    int result;
    read(fileID, &result, 4);
    return result;
}

But with no consideration for endianness.
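
A minimal sketch of a byte-order-safe alternative, assuming a POSIX descriptor API; `htonl()`/`ntohl()` pin down a big-endian wire format and `uint32_t` pins down the width (the function names are illustrative):

#include <arpa/inet.h>   // htonl, ntohl
#include <cstdint>
#include <unistd.h>      // ::write, ::read

void write_u32(int fd, std::uint32_t value)
{
    std::uint32_t wire = htonl(value);   // host order -> network order
    ::write(fd, &wire, sizeof wire);
}

std::uint32_t read_u32(int fd)
{
    std::uint32_t wire = 0;
    ::read(fd, &wire, sizeof wire);
    return ntohl(wire);                  // network order -> host order
}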

Martin York
Would writing `sizeof(int)` instead of `4` remedy that?
knight666
No. Using `sizeof(int)` solves the problem that `int` is not always 4 bytes. You need to use `htonl()` and family to get a specific endianness.
Martin York
@Martin: the interface you provided doesn't imply such a conversion would occur, so in general for such an interface I would expect the user to have already converted the value to network byte order, or else they don't care. In short all that statement says is write/read the amount of an int from/into this variable to/from this fd, which could be a socket, a pipe etc... a solution could be to just give it a better name.
Beh Tou Cheh
@Beh: That's exactly the problem: they don't care, because they don't realize it matters. Assuming a file means requiring compatible hardware across upgrades; assuming a stream requires a homogeneous set of computers. You are correct that the interface I provided does not imply this conversion, but I was trying to make a small example; you can assume the context is a transport layer. The transport layer should deal with all the conversions required to get the data to the destination. Otherwise the coupling of business code to the transport layer is high.
Martin York
@Beh: My main point being the incompatibility of hardware data representations. Code that depends on the underlying hardware representation of the data will work at small scale (testing/dev), but will fail when applied to real-world situations.
Martin York
@Martin York: However, `sizeof(int)` introduces another possibility: that `sizeof(int)` will differ between the two platforms. (If the endianness can change, so can other facets of data representation.) Much better to specify a definite length and account for endianness, and handle the case where `sizeof(int) != 4` in the appropriate place.
David Thornley
A: 

I think that as a paradigm, thinking too much in terms of undefined and defined is a bad idea.

I see a lot of braindead C++ lawyering on here, and I think it causes a lot of the really bad thinking which causes a lot of the really bad programming that seems to be the norm.

Of course I'm not suggesting to do stuff that has no guaranteed result, but usually the train to undefined land has a first stop when you try to use all of the crazy features that rules lawyering will lead you to in the first place. Like using const improperly (almost all uses of it), complicated initializers in the constructor, exceptions (due to the implementation), etc.

You need to understand more the why than the what of everything or you are just doomed. Especially when the rules of C++ change all the time, generally making them much more complicated and buggy. I see all the time someone posts some mindbogglingly stupid answer because all they seem to know is the C++ rules, not how compilers have to work or how languages generally handle things or why things are the way they are.

Charles Eli Cheese
+2  A: 

There is the static initialization order problem. When you define two static objects in different .cpp files and one is initialized from the other, the initialization order across translation units is unspecified. In my opinion, this renders such static data almost useless.

The workaround is to wrap the static value in a function, as follows:

Value &value()
{
    static Value ret(otherStaticInstance); // constructed on first call, so
                                           // it is ready whenever it's used
    return ret;
}
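
For contrast, a hedged sketch of the pattern that breaks; file and type names are illustrative:

// value.h
struct Value {
    int n;
    explicit Value(int v) : n(v) {}
};

// a.cpp
#include "value.h"
Value otherStaticInstance(7);    // constructed at some point before main()

// b.cpp
#include "value.h"
extern Value otherStaticInstance;
Value v(otherStaticInstance.n);  // if b.cpp happens to initialize first, this
                                 // reads the zero-initialized 0, not 7:
                                 // cross-.cpp order is unspecified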
Dimitri C.
+1  A: 

When you define two functions with the same name in the .cpp files only (one in each of two separate .cpp's), the compiler won't complain, but one of the two implementations gets picked arbitrarily: an ODR violation. The workaround is to wrap each function in an unnamed namespace.
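
A minimal sketch of that workaround; the unnamed namespace gives each definition internal linkage, so the two files no longer clash (file names are illustrative):

// a.cpp
namespace {
    int helper() { return 1; }  // internal linkage: visible only in a.cpp
}
int callerA() { return helper(); }

// b.cpp
namespace {
    int helper() { return 2; }  // a distinct function; no ODR violation
}
int callerB() { return helper(); }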

Dimitri C.
Do you mean the same .cpp file? Why would you do that?
configurator
@configurator: I mean in different .cpp's.
Dimitri C.
Wow, I had no idea that could happen. Sounds dangerous!
configurator
A: 

Overflow on signed integers is undefined.

But the following code, assuming both a and b are positive, just works everywhere :)

int a, b, maxval;
/* maximize some value */
if (a + b < a) { /* oops: relies on signed wraparound */ }
else { maxval = a + b; }

The proper way is to test for overflow before causing the undefined behaviour:

#include <limits.h> /* for INT_MAX */

int a, b, maxval;
/* maximize some value */
if (a > INT_MAX - b) { /* oops */ }
else { maxval = a + b; }
pmg
*Everywhere*? It won't even always work with gcc! Check out this documentation on `-fstrict-overflow`: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#index-fstrict_002doverflow-789 - according to that, gcc can assume that the test in your `if` statement will never be true, and entirely optimise out that branch.
caf
There have been platforms that would throw some sort of exception for arithmetic overflow, although I'm not aware of any that are currently in widespread use. The reasoning (see the original C89 Rationale) was that mandating either flagging some sort of exception or wrapping around would force signed arithmetic to be inefficient on some platforms. The Committee was willing to require unsigned arithmetic to follow rules, as they thought maximum efficiency was less important there.
David Thornley