views:

470

answers:

3

I've been reading K&R's book on C, and found that pointer arithmetic in C allows access to one element beyond the end of an array. I know C lets you do almost anything with memory, but I just don't understand: what is the purpose of this peculiarity?

+10  A: 

Often, it is useful to denote the "end" position, which is one past the actual allocation, so you can write code like:

 char *end = begin + size;
 for (char *curr = begin; curr < /* or != */ end; ++curr) {
     /* do something in the loop */
 }

The C standard explicitly says that a pointer one past the last element is a valid pointer value: you can compute it and compare it against pointers into the array. Dereferencing it, however, is undefined behaviour.

Why does it have this guarantee? Let's say you had a machine with 2^16 bytes of memory, addresses 0000-FFFF, 16-bit pointers. Say you created a 16 byte array. Could the memory be allocated at FFF0?

There are 16 bytes free contiguously, but:

begin + size == FFF0 + 10 (hex; 16 in decimal) == 10000

which wraps to 0000 because of the pointer size. Now the loop condition:

curr < end == FFF0 < 0000 == false

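Instead of iterating over the array, the loop would do nothing. This would break a lot of code, so the C standard effectively rules out that allocation: the address one past the end of an array must be a valid pointer value that compares greater than pointers to the array's elements.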

Todd Gardner
A: 

You can go well beyond one past the end of the array. For example:

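#include <stdio.h>

int main(void)
{
        const char *string = "string";   /* 6 characters plus the terminating '\0' */
        int i = 0;
        /* Indices 7..9 lie outside the string literal: reading them is undefined behaviour. */
        for (i = 0; i < 10; i++)
        {
                printf("%c\n", string[i]);
        }
        return 0;
}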

This will print garbage after the end of the word "string": whatever was sitting in memory beforehand.

mog
It may print garbage, format your hard drive, or cause demons to fly out of your nose; such is the nature of Undefined Behaviour.
aib
Well, just reading from the memory location is unlikely to format your hard drive or cause demons to fly out of your nose. Writing to it, however...
Andrei Krotkov
Even reading a bad pointer may cause your program to crash in the future. See http://blogs.msdn.com/oldnewthing/archive/2006/09/27/773741.aspx
Eclipse
If it is a memory mapped device I/O port, you can certainly cause bad things to happen just by reading. But that shouldn't be possible outside the kernel, of course.
Lars Wirzenius
Or in an embedded system. My hardware design partner loves to build write-only and read-only ports, and indiscriminate accesses can cause the hardware to do unexpected things that typically only show a symptom at a demo.
RBerteig
+12  A: 

C doesn't allow access to memory beyond the end of the array. It does, however, allow a pointer to point at one element beyond the end of the array. The distinction is important.

Thus, this is OK:

char array[N];
char *p;
char *end;

for (p = array, end = array + N; p < end; ++p)
    do_something(p);

(Doing *end would be an error.)

And that shows the reason why this feature is useful: a pointer pointing at the (non-existent) element after the end of the array is useful for comparisons, such as in loops.

Technically speaking, that is everything the C standard allows. However, in practice, the C implementation (compiler and runtime) does not check whether you access memory beyond the end of the array, whether it is one element or more. There would have to be bounds checking and that would slow down program execution. The kinds of programs C is best suited for (systems programming, general purpose libraries) tend to benefit more from the speed than the security and safety bounds checking would give.

That means C is perhaps not a good tool for general purpose application programming.

Lars Wirzenius