I was just reading some code and found that the author was using arr[-2] to access the second element before arr[0], like so:

|a|b|c|d|e|f|g|
       ^------------ arr[0]
         ^---------- arr[1]
   ^---------------- arr[-2]

Is that allowed?

I know that arr[x] is the same as *(arr + x). So arr[-2] is *(arr - 2), which seems ok. What do you think?

Thanks, Boad Cydo.

+1  A: 

Sounds fine to me. You would rarely have a legitimate need for it, however.

Matt Joiner
Lol, rampaging haters
Matt Joiner
+1 to counter stupid downvoting
slebetman
It's not *that* rare - it's very useful in e.g. image processing with neighbourhood operators.
Paul R
+13  A: 

That is correct. From C99 §6.5.2.1/2:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).

There's no magic. It's a 1-1 equivalence. As always when dereferencing a pointer (*), you need to be sure it's pointing to a valid address.

Matthew Flaschen
Whoever down-voted this, please explain why.
Matthew Flaschen
+1 to counter stupid downvoting.
slebetman
Note also that you don't have to dereference the pointer to get UB. Merely computing `somearray-2` is undefined unless the result is in the range from the start of `somearray` to 1 past its end.
RBerteig
In older books, `[]` was described as *syntactic sugar* for pointer arithmetic. A *favorite* way to confuse beginners is to write `1[arr]` instead of `arr[1]` and watch them guess what that is supposed to mean.
Dummy00001
What happens on 64-bit systems (LP64) when you have a 32-bit int index which is negative? Should the index get promoted to a 64-bit signed int prior to the address calculation?
Paul R
@Paul, from §6.5.6/8 (Additive operators), "When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression." So I think it will be promoted, and `((E1)+(E2))` will be a (64-bit) pointer with the expected value.
Matthew Flaschen
@Matthew: thanks for that - it sounds like it *should* work as one might reasonably expect.
Paul R
+9  A: 

This is only valid if arr is a pointer that points to the second element in an array or a later element. Otherwise, it is not valid, because you would be accessing memory outside the bounds of the array. So, for example, this would be wrong:

int arr[10];

int x = arr[-2]; // invalid; out of range

But this would be okay:

int arr[10];
int* p = &arr[2];

int x = p[-2]; // valid:  accesses arr[0]

It is, however, unusual to use a negative subscript.

James McNellis
I wouldn't go so far as to say it's invalid, just potentially messy
Matt Joiner
@Matt: The code in the first example yields undefined behavior.
James McNellis
BSTR is a good example in windows. Any debug allocator. Nothing wrong with it.
Hans Passant
It is invalid. By the C standard, it explicitly has undefined behavior. On the other hand, if `int arr[10];` were part of a structure with other elements before it, `arr[-2]` could potentially be well-defined, and you could determine if it is based on `offsetof`, etc.
R..
BSTR and debug allocators both allocate space and return a pointer somewhere inside that space. This property makes a negative offset "safe" for some values of "safe". The example James gives here of `arr[-2]` is explicitly undefined behavior because it attempts to access a location before the beginning of `arr` itself. Note that even computing `arr-2` without accessing the location is undefined behavior.
RBerteig
+1 for pointing out undefined behaviour
sleske
+2  A: 

What probably happened was that arr was pointing to the middle of the array, so that arr[-2] refers to something in the original array without going out of bounds.

Igor Zevaka
+2  A: 

I'm not sure how reliable this is, but I just read the following caveat about negative array indices on 64-bit systems (LP64 presumably): http://www.devx.com/tips/Tip/41349

The author seems to be saying that 32-bit int array indices with 64-bit addressing can result in bad address calculations unless the array index is explicitly promoted to 64 bits (e.g. via a ptrdiff_t cast). I have actually seen a bug of this nature with the PowerPC version of gcc 4.1.0, but I don't know whether it's a compiler bug (i.e. the code should work according to the C99 standard) or correct behaviour (i.e. the index needs a cast to 64 bits to work correctly).

Paul R