I've been reading K & R's book on C, and found that pointer arithmetic in C allows access to one element beyond the end of an array. I know C lets you do almost anything with memory, but I just don't understand: what is the purpose of this peculiarity?
Often, it is useful to denote the "end" position, which is one past the actual allocation, so you can write code like:
char *end = begin + size;
for (char *curr = begin; curr < /* or != */ end; ++curr) {
    /* do something in the loop */
}
The C standard explicitly guarantees that a pointer to this one-past-the-end position is valid to compute and to compare against, but dereferencing it is undefined behavior.
Why does it have this guarantee? Let's say you had a machine with 2^16 bytes of memory, addresses 0000-FFFF, 16-bit pointers. Say you created a 16 byte array. Could the memory be allocated at FFF0?
There are 16 bytes free contiguously, but:
begin + size == FFF0 + 10 (hex for 16) == 10000
which wraps to 0000 because of the pointer size. Now the loop condition:
curr < end == FFF0 < 0000 == false
Instead of iterating over the array, the loop would do nothing. This would break a lot of code, so the C standard effectively forbids that allocation: the one-past-the-end pointer must compare greater than every pointer into the array.
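The wrap-around can be reproduced on any machine by modelling the 16-bit addresses as plain integers. This is only a sketch: `count_iterations` is a hypothetical helper, and `uint16_t` values stand in for the pointers of the imaginary machine described above.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many times a `curr < end` loop runs
   when addresses are 16 bits wide and address arithmetic wraps around. */
unsigned count_iterations(uint16_t begin, uint16_t size)
{
    /* The sum is reduced modulo 2^16, just like a 16-bit pointer would be. */
    uint16_t end = (uint16_t)(begin + size);
    unsigned n = 0;

    for (uint16_t curr = begin; curr < end; ++curr)
        ++n;
    return n;
}
```

With `begin` at 0xFFF0 and a size of 16, `end` wraps to 0 and the loop body never runs. Moving the array down to 0xFFE0 leaves room for the one-past-the-end address, and the loop runs 16 times as expected.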
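The language will not stop you from going well beyond one past the end of the array, although doing so is undefined behavior. For example:
#include <stdio.h>

int main(void)
{
    const char *string = "string";
    int i;
    /* The literal holds 7 chars including '\0'; indexes 7..9 are out of bounds. */
    for (i = 0; i < 10; i++)
    {
        printf("%c\n", string[i]);
    }
    return 0;
}
will typically print garbage after the end of the word "string": whatever happened to be sitting in memory beforehand. Because the behavior is undefined, a crash is just as possible.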
C doesn't allow access to memory beyond the end of the array. It does, however, allow a pointer to point at one element beyond the end of the array. The distinction is important.
Thus, this is OK:
char array[N];
char *p;
char *end;
for (p = array, end = array + N; p < end; ++p)
do_something(p);
(Doing *end would be an error.)
And that shows the reason why this feature is useful: a pointer pointing at the (non-existent) element after the end of the array is useful for comparisons, such as in loops.
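As a minimal sketch of the same idea packaged as functions, a begin/end pair describes a half-open range. The names `sum_range` and `range_length` are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: sums the ints in the half-open range [begin, end). */
long sum_range(const int *begin, const int *end)
{
    long total = 0;

    /* The loop stops when p reaches end, so end itself is never dereferenced. */
    for (const int *p = begin; p != end; ++p)
        total += *p;
    return total;
}

/* Hypothetical helper: the element count is the distance from begin
   to the one-past-the-end pointer. */
ptrdiff_t range_length(const int *begin, const int *end)
{
    return end - begin;
}
```

For `int a[4] = {1, 2, 3, 4};`, calling `sum_range(a, a + 4)` yields 10 and `range_length(a, a + 4)` yields 4. An empty range is simply `begin == end`, which needs no special case.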
Technically speaking, that is everything the C standard allows. However, in practice, a C implementation (compiler and runtime) typically does not check whether you access memory beyond the end of the array, whether by one element or more. That would require bounds checking, which would slow down program execution. The kinds of programs C is best suited for (systems programming, general purpose libraries) tend to benefit more from the speed than from the security and safety that bounds checking would give.
That means C is perhaps not a good tool for general purpose application programming.
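To give a rough idea of what the bounds checking discussed above would involve, here is a sketch of a checked accessor. `checked_get` is a hypothetical function, not something the C library provides, and the caller has to pass the array length because C does not track it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: stores array[index] in *out and returns true,
   or returns false without reading memory if index is out of range. */
bool checked_get(const char *array, size_t length, size_t index, char *out)
{
    if (index >= length)   /* the extra comparison paid on every access */
        return false;
    *out = array[index];
    return true;
}
```

Every element access now costs an extra comparison and branch, plus the bookkeeping of carrying the length around, which is the overhead C leaves out by default.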