views:

393

answers:

8

In my code I have effectively the following:

wchar_t* buffer = new wchar_t[size];
// bonus irrelevant code here
delete[] reinterpret_cast<char*>( buffer );

Types in question are all built-in and so they have trivial destructors. In VC++ the code above works allright - new[] just allocates memory, then delete[] just frees it.

Is it acceptable in C++? Is it undefined behaviour?

A: 

Yes, but why do the reinterpret_cast?

Liz Albin
There're many layers of function calls that I concisely called "irrelevant code" that lead to the pointer being passed here, then there, casting many times along the way.
sharptooth
Ah, I had presumed that the new[] and delete[] were in the same scope,
Liz Albin
+5  A: 

It is undefined behaviour because delete[] invokes the wrong destructor. However, wchar_t and char are PODs, so they have no dedicated destructor and all delete[] does is calling the heap implementation to free up the pointer. Therefore, it is most likely to work, no byte is lost. But strictly speaking it is still undefined.

Alexander Gessler
A: 

The delete[] operator internally uses a loop of some form to destruct the elements of your array. If the elements are different objects a different destructor will be used -- which can potentially cause undefined behaviour. Since the is a wchar and a char -- primitive types -- it probably won't cause any undesireable behaviour.

WARNING: IF YOU CONTINUE READING DO SO AT YOUR OWN PERIL !! GROSS DESCRIPTIONS OF UNDEFINED BEHAVIOUR AHEAD. THIS IS FOR EDUCATIONAL PURPOSES ONLY.

Example 1:

If you had two objects which were the same size and all their destructor did was zero out the memory then again it would probably not cause undersireable behaviour.

Example 2:

However if you had two objects where one type encapsulated a single 4 byte handle to a resource and the other had two such elements and you casted an array of the later into the singular case -- well then you would leak half of the handles of your array. The situation would look as follows:

..2:[1|2][1|2]FREE..

where the '2:' represents the size of the array. After a downcast the compiler will generate a delete that perceived the data as so :

..2:[1][1]FREE...

therefore after the free things would look like so:

..FREE[1|2]FREE..

Hassan Syed
I had the impression that undefined behavior *is* undefined behavior. This may include doing the desired thing, but it's outside what the language guarantees regardless. (E.g `++i + i++` is undefined, regardless of whether it produces the result that you expect or not.)
UncleBens
Ofcourse, I just put it out there for discussion. The OP mentioned a corner case... I've just presented a discussion that extends the scope of this corner case. It might be useful to someone. I've changed 'undefined' to 'undesirable'
Hassan Syed
@Hassan: Yes, but "probably not cause undefined behavior" is meaningless. Something like "probably not do anything bad" is reasonable, but if there is a real possibility of undefined behavior than the behavior is simply undefined. Having seen a lot of misunderstanding of "undefined behavior", I tend to get real picky about it.
David Thornley
there you go :D now there is no room for confusion.
Hassan Syed
+13  A: 

My initial thought was that it is undefined behavior.

5.3.5/3: "In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73).

Footnote 73 reads, "This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void".

Arguably the object in your example doesn't have a dynamic type, since the definition of "dynamic type" at 1.3.3 mentions "most derived object", and the definition of "most derived object" at 1.8/4 is talking about objects of class type. So I kept looking:

5.2.10/3: "[reinterpret_cast] might, or might not, produce a representation different from the original value"

5.3.5/2: "The value of the operand of delete shall be the pointer value which resulted from a previous array new-expression".

I'm not sure whether a reinterpret_cast results in the same pointer value as was input, or not. Possibly it's cleared up by some other bit of the standard which I haven't found yet. I would not call this code "OK" without finding something to definitively state that if you reinterpret_cast a pointer, the result is the same "pointer value" as before, so that by passing it to delete[] you are passing "the pointer value" from new[].

5.2.10/7: "Except that casting [between certain pointer types] and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified".

This looks like bad news to me - it conspicuously doesn't say that the cast yields the same value, only that the pair of casts over and back, yields the same value. This suggests to me that the single cast is allowed to yield a different value, but it is only suggestive, not explicit. This is the usual problem with the rule that "if the standard doesn't state the behavior, then the behavior is undefined". Just because it doesn't state it in any of the paragraphs I can find using the index, doesn't mean it doesn't state it somewhere else...

We know that in practice we can cast things to unsigned char* in order to inspect their bytes, or void* to copy PODs using memcpy, so there must be some casts guaranteed to create aliases. You might think that if your implementation does create aliases with certain casts, then you're passing in the "same value" you got from new[]. But I'm still not sure that's good enough for delete[]. I think I'm missing something important.

Steve Jessop
You beat me to it by about 25 seconds! In my defense, however, I had you beat, until the network decided to have a hiccup and discard my first attempt at posting the answer. :-)
Jerry Coffin
Isn't that referring to derived types? A char * doesn't really have a "dynamic type".
anon
@Jerry: I only posted once the same hiccup ended for me, so pure fluke. @Neil: yes, I was just thinking the same and editing to that effect. I'm not yet sure whether this means the behavior is defined or not: if you reinterpret_cast a pointer, does the resulting pointer "have the same value", or not?
Steve Jessop
Now you can see why I said "really not sure" in my comment in the question :-)
anon
Yep, we need a bigger standards lawyer.
Steve Jessop
*I would not call this code "OK" without finding something to definitively state that if you reinterpret_cast a pointer, the result is the same "pointer value" as before* --- IIRC at least the C standard explicitely left room for pointers being different-sized depending on the pointed-to type (for platforms that had an int-sized address for `int *`, but introduced an additional offset field for `char *` or something like that) - no idea if this applies to C++ at all, though.
peterchen
@Steve Time to light the LITB-signal!
anon
Yep, him or Pavel.
Steve Jessop
I do appreciate the detail in this response, but I think the right answer is that if it takes a Standard lawyer to prove that some behavior is indeed defined I want to avoid it anyway.
David Thornley
I don't disagree. I think there are two issues here: (1) reinterpret_cast, which isn't really an issue anywhere we actually program for, and (2) how delete[] gets the size of the array, which actually could be a problem on "important" implementations if the standard doesn't define this to work. I would always avoid writing code where there's doubt if it's valid or not. If it turns out that a short quote from the standard proves it's valid, and I just failed to find the quote, then I think the code's OK with a comment, esp. if a rewrite would be painful. Still would advise to avoid it, though.
Steve Jessop
As I read 1.3.3 [defns.dynamic.type], the dynamic type is just the "original" type of the object. The "most derived" thing is only mentioned to avoid the ambiguity for objects that can be said to have multiple types due to derivation. But I'd say the dynamic type of something originally allocated as a wchar_t array is always wchar_t array. Which would make the static type different, and so it'd be UB. But it *is* pretty vague.
jalf
+2  A: 

At least as I'd read it, you have a static type (the type of the pointer) that differs from the dynamic type (the real type of the object it points at). That being the case, the second sentence of §5.3.5/3 applies:

In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.

Edit: Since what you apparently want is to allocate a buffer of "raw" memory instead of an array of objects, I'd advise using ::operator new instead of new[]. In this case, what you're doing is clearly defined, and also gives the reader a clear indication of intent.

Jerry Coffin
+1 for recommending `::operator new`.
MSN
+3  A: 

iso14882 section 5.2.10.3:

The mapping performed by reinterpret_cast is is implementation defined

iso14882 section 5.3.5.2:

The value of the operand of delete[] shall be the pointer value which resulted from a previous array new-expression

In other words, it's implementation defined whether or not the delete[] invokes undefined behaviour. Steer clear.

Joe Gauterin
A: 

Why are you using reinterpret_cast<..>? If you are writing something in pure C++, then you don't need reinterpret cast. In your case you are not allocating the memory for a object. You are allocating the memory for wchar_t. Why don't use to string instead of array of wchar_t?

Yogish Baliga
+1  A: 

Since wchar_t and char are both built-in types, the correct deallocation function (void operator delete(void* ptr)) would be called, and there is no destructor to call.

However the C++ 03 standard says the result of reinterpret_cast<T1*>(T2*) is undefined (section 5.2.10.7):

A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that converting an rvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type yields the original pointer value, the result of such a pointer conversion is unspecified.

From a practical POV I can't imagine an implementation where a wchar_t* value is not a valid char* value, so your code should be OK on all the platforms. Just not standard-compliant...

Dan Berindei