views:

497

answers:

6

If I got the C99 restrict keyword right, qualifying a pointer with it is a promise made that the data it references won't be modified behind the compiler's back through aliasing.

By contrast, the way I understand the const qualifier is as compiler-enforced documentation that a given object won't be modified behind the back of a human being writing code. The compiler might get a hint as a side effect, but as a programmer I don't really care.

In a similar way, would it be appropriate to consider a restrict qualifier in a function prototype as a requirement that the user ensures exclusive access ("avoid aliasing", or maybe something stronger) for the duration of the call? Should it be used as "documentation"?

Also, is there something to understand in the fact that restrict qualifies a pointer rather than the data it points to (as const does) ?

EDIT: I originally believed that restrict could have implications with threaded code, but this seems wrong so I remove references to threads from the question to avoid confusing readers.

+1  A: 

Someone more familiar with the standard could probably give a better answer, but I'll give it a shot.

"Data won't be modified behind the compiler's back" sounds more like the opposite of "volatile" to me.

"const" means the data won't be modified in front of the programmer; that is, she can't modify the data through the signifier marked as "const" (I write "signifier" because in int const *pi, the name pi isn't const, but *pi is). The data might be modifiable via another signifier (non-const data can be passed to a function as const data, after all).

That "restrict" qualifies pointers is the key. Pointers are the only way to alias data in C, so they're the only way that you can access some piece of data via two different names. "restrict" is all about limiting data access to one access path.

outis
+1  A: 

This might be an example from an extremely narrow domain, but Altera's Nios II platform is a soft-core microcontroller that you can customize within an FPGA. Then, within the C source code for that micro, you can use a C-to-hardware tool to speed up inner loops using custom hardware, rather than in software.

In there, use of the __restrict__ keyword (which is the same as C99's restrict) allows the C2H tool to properly optimize the hardware acceleration of pointer operation in parallel instead of sequentially. At least in this case, the restrict is simply not meant for human consumption. See also Sun's page on restrict, where the first line says

Using the restrict qualifier appropriately in C programs may allow the compiler to produce significantly faster executables.

If anyone's interested in reading more on C2H, this PDF discusses optimizing C2H results. The section on __restrict__ is on page 20.

Mark Rushakoff
+8  A: 

The best 'intuition' to have about the restrict keyword is that its a guarantee (by the programmer to the compiler) that, for the lifetime of the pointer, memory accessed via that pointer will ONLY be accessed via that pointer and not via another other pointer or reference or global address. So its important that its on a pointer as its a property of both the pointer and the memory, tying the two together until the pointer goes out of scope.

Chris Dodd
+2  A: 

Your understanding is largely correct. The restrict qualifier simply states that the data accessed by a so-qualified pointer is only accessed by that exact pointer. It applies to reads as wells as writes.

The compiler doesn't care about concurrent threads, it wasn't going to generate code any differently, and you may clobber your own data as you like. But it does need to know what pointer operations may change what global memory.

Restrict also carries with it an API warning to humans that a given function is implemented with the assumption of unaliased parameters.

No locking by the user is necessary as far as the compiler is concerned. It only wants to make sure it correctly reads data that was supposed to be clobbered, by code the compiler was supposed to generate, in case there is no restrict qualifier. Adding restrict frees it from that concern.

Finally, note that the compiler is likely to already be analyzing possible aliasing based on data types, at the higher optimization levels, so restrict is important mostly for functions with multiple pointers to the same type of data. You can take a lesson from this subject and make sure that any deliberate aliasing you do is done via a union.

We can see restrict in action:

void move(int *a, int *b) {     void move(int *__restrict a, int *__restrict b) {
    a[0] = b[0];                    a[0] = b[0];
    a[1] = b[0];                    a[1] = b[0];
}                               }
    movl    (%edx), %eax            movl    (%edx), %edx
    movl    %eax, (%ecx)            movl    %edx, (%eax)
    movl    (%edx), %eax            movl    %edx, 4(%eax)
    movl    %eax, 4(%ecx)

In the right column, with restrict, the compiler did not need to reread b[0] from memory . It was able to read b[0] and keep it in register %edx, and then just store the register twice to memory. In the left column, it didn't know if the store to a may have changed b.

DigitalRoss
const does not imply restrict.
Crashworks
Of course not, but if it's known to the compiler to be `const` then presumably the compiler doesn't have to worry about scheduling stores to it.
DigitalRoss
It needn't worry about scheduling stores, but the pointer may still be aliased by something else, and so the compiler still has to reread through the const pointer any time any other pointer is stored through.
Crashworks
+9  A: 

Chris Dodd has the correct description of the keyword. In certain platforms it can be very important for performance reasons, because it lets the compiler know that once it has loaded data through that pointer onto a register, it need not do so again. Without this guarantee, the compiler must reload the data through a pointer every time any other possibly-aliasing pointer is written through, which can cause a serious pipeline stall called a load-hit-store.

const and restrict are different concepts, and it is not the case that const implies restrict. All const says is that you will not write through that pointer within the scope of that function. A const pointer may still be aliased. For example consider:

int foo( const int *a, int * b )
{
   *b *= 2;
   return *a + *b; // induces LHS: *a must be read back immediately
                   // after write has cleared the store queue
}

While you cannot directly write to a in this function, it would perfectly legal for you to call foo like:

int x = 3;
foo( &x, &x );  // returns 12

restrict is a different guarantee: a promise that a != b in all calls to foo().

I've written about the restrict keyword and its performance implications at length, and so has Mike Acton. Although we talk about a specific in-order PowerPC, the load-hit-store problem exists on the x86 as well, but the x86's out-of-order execution makes that stall harder to isolate in a profile.

And just to emphasize: this is not an arcane or premature optimization, if you care about performance at all. restrict can lead to really significant speedups if used correctly.

Crashworks
+1 for code exemple :-)
yves Baumes
I don't think the example necessarily results in `*b` being read back immediately (gcc with -O3 does what you'd want it to: reads a and b off the stack into edx and ecx respectively, then does `mov (%ecx),%eax`, `add %eax,%eax`, `mov %eax,(%exc)`, `mov (%edx),%ecx`, `pop %ebp`, `add %ecx,%eax`, `ret`). If `*a` were also modified in the function of course that would be a different story.
Steve Jessop
mov %eax,(%exc), mov (%edx),%ecx is the load-hit-store right there. I did mean *a instead of *b though.
Crashworks
So that stalls even if ecx and edx have different values? I didn't know that. Adding restrict doesn't change the output from gcc 3 at all, it doesn't even load from edx/a any earlier. So it seems that with that compiler, the LHS is inevitable no matter what you do.
Steve Jessop
It stalls (inside the CPU pipeline) if edx and ecx happen to be the same value. It doesn't if the one load doesn't hit the other store. That's why LHS are insidious: you cannot see them from looking at the code inside a function; the problem comes in the aliasing of the parameters you were passed. What restrict *should* do in this case is mov (%ecx),%eax ; mov (%edx),%ebx ; add %ebx, %ebx ; add %eax, %ebx ; ret . And yeah, GCC does emit more than its share of WTFs.
Crashworks
The more serious example of the problem would be if I wrote eg `*a += 3` before the `*b *= 2`; in that case a load would inevitably have to hit a store *even if the pointers did not alias* because the compiler would have to load back *a after touching *b; and *a had just been written.
Crashworks
Oh, that makes more sense. Sure, if they're the same value there's a stall, but then if they're the same value then you couldn't use restrict either. I confess I was a bit surprised that the load from edx wasn't started sooner with restrict, but I try to assume that x86 pipelining is Much Smarter Than Me nowadays. So when the values are different, I'd be surprised if there's a full load stall (waiting for eax) at `add eax,eax`, and then another one (for ecx) at `add ecx,eax` that doesn't even start until the previous one has finished. But maybe I'm too optimistic.
Steve Jessop
Compilers and pipelines are really much less clever than people seem to assume. In particular, if you load from an address that has a pending store, the CPU will usually have to throw out the whole pipeline and start over when the store has finished. Some x86s are better at this because of store forwarding, but an in-order PowerPC is definitely not; it just stops dead until the write bounces off the cache. And there's not many instructions that the x86 could reorder to happen while it's waiting for the read: everything afterwards is dependent on it!
Crashworks
+2  A: 

Most of what you know is wrong!

const does not guarantee that something won't change behind the compiler's back. All it does is stop you from writing to that spot. Something else might still be able to write to that location though, so the compiler canNOT assume it's constant.

As others have said, the restrict qualifier is about aliasing. In fact, during the first round of C standardization, there was a proposal for a "noalias" keyword. Unfortunately, the proposal was fairly poorly written -- it prompted the one and only time the Dennis Ritchie got involved during that process, when he wrote a letter that said something to the effect that "noalias must go. This is not open to negotiation."

Needless to say, 'noalias' didn't become part of C. When it came time to try again, the proposal was written enough better that restrict was included in the standard -- and even though noalias would probably have been a more meaningful name for it, that name was so tainted that I doubt anybody even considered trying to use it.

In any case, the primary intent of restrict is to tell the compiler that there will not be an alias of this item. One reason for this is to allow things to be stored in registers temporarily. For example, consider something like:

void f(int *a, int *b, int *c) { 
    for (int i=0; i<*a; i++)
        *b += c[i];
}

The compiler really wants to put i in a register, and load *a into a register, so when it comes time to decide whether to execute another iteration of the loop, it just compares the values in those to registers to each other. Unfortunately, it can't do that -- if somebody who used this function was completely insane, and called it with a==b, every time it writes to *b inside the loop, that new value is also the value of *a -- so it has to read *a from memory at every iteration of the loop, just in case whoever called it was completely insane. Using restrict tells the compiler it can generate code assuming that a and b will always be distinct, so writing to *a will never change *b (or vice versa).

Jerry Coffin
actually, values are not allowed to change behind the compilers back at all, whether `const` is present or not; if values do change, you'll have to add the `volatile` qualifier
Christoph
Yes and no -- volatile means you want the compiler to generate code that takes account of the possibility that it'll change behind your back.
Jerry Coffin