Which type should be used for an array index in C99? It has to work on LP32, ILP32, ILP64, LP64, LLP64 and more. It doesn't have to be a C89 type.

There are 5 candidates:

  • size_t
  • ptrdiff_t
  • intptr_t / uintptr_t
  • int_fast*_t / uint_fast*_t
  • int_least*_t / uint_least*_t

Here is some simple code to better illustrate the problem. What is the best type for i and j in these two particular loops? If there is a good reason, two different types are fine too.

for (i=0;i<imax;i++) {
        do_something(a[i]);
}
/* jmin can be less than 0 */
for (j=jmin;j<jmax;j++) {
        do_something(a[j]);
}

P.S. In the first version of the question I had forgotten about negative indexes.

P.P.S. I am not going to write a C99 compiler. However any answer from a compiler programmer would be very valuable for me.

+7  A: 

Since the type of sizeof(array) (and malloc's argument) is size_t, and the array can't hold more elements than its size, it follows that size_t can be used for the array's index.

EDIT: This analysis is for 0-based arrays, which is the common case. ptrdiff_t will work in any case, but it's a little strange for an index variable to have a pointer-difference type.
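
A minimal sketch of the 0-based case, assuming a hypothetical helper `sum_ints` and a hypothetical macro `ARRAY_LEN` (neither is from the question). The element count is derived from sizeof, so it already has type size_t and the index can share that type:

```c
#include <stddef.h>

/* Hypothetical macro: element count of a true array (not a pointer).
   The result has type size_t because sizeof yields size_t. */
#define ARRAY_LEN(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical helper: sums n ints using a size_t index. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Called as `sum_ints(a, ARRAY_LEN(a))`, the count and the index never need a conversion between signed and unsigned types.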

Amnon
+4  A: 

I think you should use ptrdiff_t, for the following reasons:

  • Indices can be negative (thus all unsigned types, including size_t, are out of the question)
  • The type of p2 - p1 is ptrdiff_t. The type of i in the reverse thing, *(p1 + i), should be that type too (notice that *(p + i) is equivalent to p[i])
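
A short sketch of both points, assuming a hypothetical helper `sum_range` (not from the question). The bounds are ptrdiff_t, so the lower bound may be negative, provided `p + lo` still points into the same array:

```c
#include <stddef.h>

/* Hypothetical helper: sums p[lo] .. p[hi - 1].
   lo may be negative as long as p + lo stays inside the array p points into. */
long sum_range(const int *p, ptrdiff_t lo, ptrdiff_t hi)
{
    long total = 0;
    for (ptrdiff_t j = lo; j < hi; j++) {
        total += p[j];   /* same as *(p + j) */
    }
    return total;
}
```

For example, with `int a[5]` and `const int *mid = a + 2;`, the call `sum_range(mid, -2, 3)` visits all five elements, and `mid - a` has type ptrdiff_t.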
Johannes Schaub - litb
What do you mean by "indices can be negative"? Not when actually indexing, surely?
unwind
@unwind, sure why not? `int a[10]; int *pa = a+1; pa[-1] = 0;`. Array indexing is nothing but pointer arithmetic, and C doesn't care about the value you give. Using an unsigned index type will fail for many completely legal index operations.
Johannes Schaub - litb
@unwind - Depends on whether you're doing something weird. For instance, I have some arrays in my project that are indexed relative to some zero point - zero represents "now", positive indices represent future times, and negative indices are for past times.
kwatford
It's also useful for having a sentinel value below zero. But really, the use case is irrelevant if the questioner aims for a type that will work for any and all scenarios. What's important is really that unsigned types are the wrong choice.
Johannes Schaub - litb
@Johannes: Yeah, should have been clearer about that, I guess. I think I consider it a "special case", where you've done some work to make sure that negative indices are okay, by moving the 0-point of the array. Fine.
unwind
@unwind since array indexing is in fact just pointer indexing in the end, if you choose an unsigned index type for "array" indexing, you also choose it for pointers (it's always applied to pointers anyway, because of the implicit conversion). Having a negative index for pointers is quite common. It depends on what exactly the questioner is looking for. His question first sounds like he wants to be really generic about it (like writing a C99 compiler), but then it heads in a direction where he is only interested in making a single loop work. I don't know.
Johannes Schaub - litb
A: 

If you know the maximum length of your array in advance you can use

  • int_fast*_t / uint_fast*_t
  • int_least*_t / uint_least*_t

In all other cases I would recommend using

  • size_t

or

  • ptrdiff_t

depending on whether you want to allow negative indexes.

Using

  • intptr_t / uintptr_t

would also be safe, but has slightly different semantics.

codymanix
@codymanix Could you write something more about these slightly different semantics?
Michas
intptr_t is an integer which has at least the size of a pointer so you can safely cast a pointer into intptr_t. Think of it as a numerical representation of a pointer.
codymanix
`int_least*_t` should never be used for a single variable. It may be a slow-to-access type, and is intended only to be used in arrays where you need to save space but guarantee a certain minimum number of bits. On any sane platform, you could just request the exact size you need (8, 16, 32, or 64) but C99 allows implementations that have no type of a certain size, and thus `int_least*_t` exists to request the "next largest type".
R..
+1  A: 

I almost always use size_t for array indices/loop counters. Sure there are some special instances where you may want signed offsets, but in general using a signed type has a lot of problems:

The biggest risk is that if you're passed a huge size/offset by a caller treating things as unsigned (or if you read it from a wrongly-trusted file), you may interpret it as a negative number and fail to catch that it's out of bounds. For instance, `if (offset<size) array[offset]=foo; else error();` will write somewhere it shouldn't.
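
A sketch of that failure mode, with two hypothetical checker functions (the names are illustrative only). The signed version accepts a negative offset because only the upper bound is tested; the unsigned version needs just one comparison, since a "negative" input arrives as a huge value:

```c
#include <stddef.h>

/* Flawed check: a negative offset is less than size, so it passes. */
int in_bounds_signed(long offset, long size)
{
    return offset < size;
}

/* With size_t there is no value below zero, so one comparison is enough. */
int in_bounds_unsigned(size_t offset, size_t size)
{
    return offset < size;
}
```

Here `in_bounds_signed(-1, 10)` returns 1, while `in_bounds_unsigned((size_t)-1, 10)` returns 0, because `(size_t)-1` is the largest size_t value.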

Another problem is the possibility of undefined behavior with signed integer overflow. Whether you use unsigned or signed arithmetic, there are overflow issues to be aware of and check for, but personally I find the unsigned behavior a lot easier to deal with.
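
One unsigned wraparound issue to be aware of is counting down: `i >= 0` is always true for size_t. A common idiom, sketched here with a hypothetical `reverse_copy` helper, tests before decrementing, so the index never has to represent -1 inside the loop body:

```c
#include <stddef.h>

/* Hypothetical helper: copies src[0..n-1] into dst in reverse order. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t out = 0;
    /* i-- > 0 compares the old value, then decrements;
       the body sees i running from n - 1 down to 0. */
    for (size_t i = n; i-- > 0; ) {
        dst[out++] = src[i];
    }
}
```

Unsigned wraparound is well defined, so the final decrement past zero is harmless; the loop has already exited by then.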

Yet another reason to use unsigned arithmetic (in general) - sometimes I'm using indices as offsets into a bit array and I want to use %8 and /8 or %32 and /32. With signed types, these will be actual division operations. With unsigned, the expected bitwise-and/bitshift operations can be generated.
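
A small sketch of the bit-array case; all function names are hypothetical. C99 requires signed division to truncate toward zero, so `-9 % 8` is -1 and `-9 / 8` is -1, which a plain mask or shift would not produce. With an unsigned index, `/ 8` and `% 8` are exactly equivalent to `>> 3` and `& 7`:

```c
#include <stddef.h>

/* Signed versions: the compiler must honour truncation toward zero. */
int signed_div8(int x) { return x / 8; }
int signed_rem8(int x) { return x % 8; }

/* Bit-array helpers with an unsigned index: i / 8 selects the byte,
   i % 8 selects the bit within it. */
void set_bit(unsigned char *bits, size_t i)
{
    bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

int test_bit(const unsigned char *bits, size_t i)
{
    return (bits[i / 8] >> (i % 8)) & 1;
}
```

Whether a given compiler actually emits a shift and mask for the unsigned helpers depends on the compiler and optimization level; the point is only that it is allowed to without extra fix-up code.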

R..