Just to put it all together so that we can compare the different ideas that arose in the different answers, I'll comment on what I think about each of them. Community wiki, because this is merely a collection of other people's thoughts :) All emphasis below is mine.
First, we have to consider whether the pointer to one past the last element refers to an object. An array of bound N has N sub-objects that are its elements, as explained in 8.3.4/1:
An object of array type contains a contiguously allocated non-empty set of N sub-objects of type T. - 8.3.4/1
To my knowledge, there is no mention in the Standard about an object located just after an array. If there is such an object, we are allowed to dereference the pointer that points one past the end, because of the following text and clarifying note
If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained. [Note: for instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address. ] - 3.9.2/3
This is meant to say that the following is well defined if the implementation lays out the objects such that the storage of b is allocated directly behind the array object. (You can arrange such a layout manually by overallocating a chunk of memory with malloc and assigning it to a pointer to an array of a smaller size; I will keep it simple and only illustrate using the following example.)
int a[3], b;
*(a + 3) = 0;
assert(b == 0 && (a + 3 == &b) && a[3] == 0);
The consensus among a few people is that the expression you showed, &array[5], is undefined behavior. This is based on the fact, which stands, that the Standard says the following at 3.10/2 and 5.3.1/1:
An lvalue refers to an object or function. - 3.10/2
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. - 5.3.1/1
Above, we've seen that we are not guaranteed that there is an object (of the same type) located after the last element of an array. This should be kept distinct from another case, in which storage for an object has been allocated (memory reserved) but the object's lifetime has not started yet, as happens if you allocate memory with malloc and are going to placement-new an object into that area. Then you are allowed to dereference a pointer to that area before you invoke the constructor, as long as you keep to some simple rules, like not trying to read a value out of the resulting lvalue (3.8/5 and 3.8/6).
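The placement-new case can be sketched as follows. This is only an illustration under assumptions: `Widget` and `construct_in_malloced_storage` are hypothetical names that do not appear in any of the answers, and the comments reflect my reading of 3.8/5 and 3.8/6, not an authoritative interpretation.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical type with a user-provided constructor and destructor,
// so its lifetime only starts once the constructor has run.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
    ~Widget() {}
};

// Returns the value stored in the constructed Widget, or -1 if malloc fails.
int construct_in_malloced_storage() {
    void* raw = std::malloc(sizeof(Widget));
    if (!raw) return -1;

    Widget* p = static_cast<Widget*>(raw);
    Widget& not_yet_alive = *p;  // forms an lvalue to the storage; no value is read
    (void)not_yet_alive;

    Widget* w = new (p) Widget(42);  // the object's lifetime starts here
    int result = w->value;           // reading is fine now

    w->~Widget();    // end the lifetime explicitly
    std::free(raw);  // release the storage obtained from malloc
    return result;
}
```

Calling `construct_in_malloced_storage()` yields 42 when the allocation succeeds. Reading `not_yet_alive.value` before the placement-new would be the kind of access those paragraphs rule out.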
The interesting thing is, what happens when the lvalue does not refer to an object? Remember that an lvalue has to refer to an object (or function).
The Standard treats this operation as well-defined at 5.2.8/2, when talking about the typeid operator, which evaluates lvalue expression operands of polymorphic class type:
If the lvalue expression is obtained by applying the unary * operator to a pointer and the pointer is a null pointer value (4.10), the typeid expression throws the bad_typeid exception. - 5.2.8/2
This is contrary to 3.10/2, which requires that an lvalue expression refers to an object or function, which a null pointer value does not. At this point, we have a defect in the Standard: one place allows dereferencing a null pointer in a way that contradicts another part of the Standard. This was observed long ago, and is being discussed in the linked issue report. As one commenter there notes, typeid just handles a dereferenced null pointer specially, to circumvent the lvalue-without-object problem. Since the wording starts out by talking about an lvalue, it is at least a problematic way of handling that currently.
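The typeid special case is easy to observe in practice. A minimal sketch, assuming a hypothetical polymorphic type `Base` and a helper `typeid_of_null_throws` that I made up for illustration:

```cpp
#include <typeinfo>

// Hypothetical polymorphic type; typeid only evaluates its operand
// when the operand is an lvalue of polymorphic class type.
struct Base {
    virtual ~Base() {}
};

// Returns true if typeid(*p) throws std::bad_typeid for a null p.
bool typeid_of_null_throws() {
    Base* p = nullptr;
    try {
        // The operand is a dereferenced null pointer, the exact case
        // that 5.2.8/2 singles out.
        const std::type_info& info = typeid(*p);
        (void)info;
    } catch (const std::bad_typeid&) {
        return true;
    }
    return false;
}
```

So `*p` appears as an lvalue expression here even though `p` points to no object, which is exactly the tension with 3.10/2.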
The idea for handling this in general is to introduce an empty lvalue that purposely refers to no object or function. If we try to read a value out of it, we get undefined behavior. As long as we don't, we do not. Dereferencing a past-the-end address could yield such an empty lvalue, as we usually can't be sure whether there is an object located there or not.
However, as the discussions on that report indicate, there are still outstanding issues (like, what happens with our overallocating case?) before it can be incorporated into the Standard.
Conclusion
I believe there is neither a right nor a wrong answer here. While I have a slight tendency to view this as generally undefined behavior, because there is no lvalue that doesn't refer to an object, I also see the quite problematic way typeid currently handles this problem. Since this concerns an active issue in the Standard, the best you can do is use pointer addition (array + 5) to get the pointer value instead of dereferencing past-the-end, thus avoiding the problem altogether.
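As a small sketch of that recommendation (the function name `sum_with_end_pointer` and the element values are mine, chosen only for illustration; the array size matches the &array[5] from the question):

```cpp
// Sums a five-element array, using a past-the-end pointer as the loop bound.
int sum_with_end_pointer() {
    int array[5] = {1, 2, 3, 4, 5};

    // Pointer arithmetic only: no unary * is applied to the past-the-end
    // address, so the lvalue question never comes up.
    int* end = array + 5;

    int sum = 0;
    for (int* p = array; p != end; ++p) {
        sum += *p;  // only pointers to real elements are dereferenced
    }
    return sum;
}
```

Here `end` is only compared against, never dereferenced, which is the use the Standard clearly permits for a one-past-the-end pointer.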
Note that none of the above is a problem in C. C makes it all well-defined by saying that &* is next to a no-op, except that it turns the pointer into an rvalue, so you can't do
(&*a) = NULL;
The same simple thing, sadly, isn't true about C++, though.