ansaurus

Question

Rules for using the restrict keyword in C?

Answer 1

+1 A:

Maybe the optimisations done here don't rely on the pointers not being aliased? Unless you preload multiple mul2 elements before writing the result to res2, I don't see any aliasing problem.

In the first piece of code you show, it is quite clear what kind of aliasing problem can occur. Here it is not so clear.

Rereading Drepper's article, he does not specifically say that restrict would solve anything here. There is even this passage:

{In theory the restrict keyword introduced into the C language in the 1999 revision should solve the problem. Compilers have not caught up yet, though. The reason is mainly that too much incorrect code exists which would mislead the compiler and cause it to generate incorrect object code.}

In this code, the memory-access optimisations have already been done within the algorithm. The remaining optimisation seems to be done in the vectorized code presented in the appendix. So for the code presented here, I guess there is no difference, because no optimisation relying on restrict is done. Every pointer access is a potential source of aliasing, but not every optimisation relies on the absence of aliasing.

Premature optimization being the root of all evil, the use of the restrict keyword should be limited to the cases you are actively studying and optimizing, not applied wherever it could be used.
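To make the discussion concrete, here is a minimal sketch of the inner-loop statement from the question, written once without and once with restrict. Only the statement `rres[j2] += rmul1[k2] * rmul2[j2];` comes from the question; the function names `row_update` and `row_update_restrict` and the parameters `k2` and `n` are assumptions made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not Drepper's code. Without restrict, a store to
 * rres[j2] could in principle modify rmul1[k2], so the compiler has to
 * assume that value may change between iterations. */
void row_update(double *rres, const double *rmul1, const double *rmul2,
                size_t k2, size_t n)
{
    for (size_t j2 = 0; j2 < n; ++j2)
        rres[j2] += rmul1[k2] * rmul2[j2];
}

/* Same loop with restrict: the caller promises the three arrays do not
 * overlap, so rmul1[k2] is allowed to stay in a register for the whole loop. */
void row_update_restrict(double *restrict rres, const double *restrict rmul1,
                         const double *restrict rmul2, size_t k2, size_t n)
{
    for (size_t j2 = 0; j2 < n; ++j2)
        rres[j2] += rmul1[k2] * rmul2[j2];
}
```

Whether a particular compiler actually emits different code for the two versions is something to check in the generated assembly; for non-overlapping arrays both must compute the same result.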

shodanex 2010-01-05 11:46:52

@shodanex: This is exactly what I'm wondering. Drepper seems to indicate in his paper that this code specifically benefits from `restrict`. You can see a vectorized version of the same code using SSE2 instructions here: http://lwn.net/Articles/258188/ . Is it possible he just always uses `restrict` as his personal best practice and didn't bother to check whether it makes any difference with this code?

Robert S. Barnes 2010-01-05 11:59:11

In the code you mention, there is still the loop `for (j2 = 0; j2 < SM; j2 += 2)`, which is where I could see aliasing occur, but this is not easy to see.

shodanex 2010-01-05 13:03:51

@shodanex: So why doesn't `rres[j2] += rmul1[k2] * rmul2[j2];` present an aliasing problem? Let's say I call `mm(_res, _res, _res)`. `rres[j2]` and `rmul2[j2]` have to be reloaded every time through the loop anyway, so potential aliasing doesn't matter.

Robert S. Barnes 2010-01-06 10:41:32

@shodanex: So basically, Drepper always sticks in `restrict` as a best practice (considering he put it in the vectorized version of the code), and just didn't bother to check whether or not it made any difference with this particular code. That's what it looks like to me anyway.

Robert S. Barnes 2010-01-06 10:43:35

Answer 2

+3 A:

It is a hint to the code optimizer. Using restrict assures it that it can keep the value a pointer refers to in a CPU register, without having to write each update back to memory (or reload it) in case an alias refers to the same location.

Whether or not it takes advantage of it depends heavily on implementation details of the optimizer and the CPU. Code optimizers already are heavily invested in detecting non-aliasing since it is such an important optimization. It should have no trouble detecting that in your code.
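A small sketch of the register-versus-reload point, assuming a toy function that is not from the question (the names `add_twice` and `add_twice_restrict` are hypothetical):

```c
/* Without restrict: k may point at the same int as a, so after the store
 * to *a the compiler must load *k again before the second addition. */
void add_twice(int *a, int *b, const int *k)
{
    *a += *k;
    *b += *k;
}

/* With restrict: the caller promises a, b and k refer to distinct objects,
 * so *k may be loaded once and kept in a register. Calling this with
 * overlapping pointers is undefined behaviour. */
void add_twice_restrict(int *restrict a, int *restrict b,
                        const int *restrict k)
{
    *a += *k;
    *b += *k;
}
```

With the unqualified version a call such as `add_twice(&z, &w, &z)` is legal and the second addition sees the updated value of `z`, which is exactly the case the optimizer has to allow for when it cannot prove the pointers are distinct.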

Hans Passant 2010-01-05 13:05:22

So basically you're saying that the alias analysis in gcc is so good that it's already able to detect that there are no aliasing problems on its own, even without the hint from the `restrict` keyword?

Robert S. Barnes 2010-01-05 13:13:36

Right. You'd need to pass double** as arguments and update them to introduce a blatant alias that the optimizer can't rule out.

Hans Passant 2010-01-05 13:25:49

But without any knowledge of the potential callers, the result is the same, so I doubt this is what happens here.

shodanex 2010-01-05 13:27:20

Answer 3

A:

Are you running on 32 or 64-bit Ubuntu? If 32-bit, then you need to add -march=core2 -mfpmath=sse (or whatever your processor architecture is), otherwise it doesn't use SSE. Secondly, in order to enable vectorization with GCC 4.2, you need to add the -ftree-vectorize option (as of 4.3 or 4.4 this is included as default in -O3). It might also be necessary to add -ffast-math (or another option providing relaxed floating point semantics) in order to allow the compiler to reorder floating point operations.

Also, add the -ftree-vectorizer-verbose=1 option to see whether it manages to vectorize the loop or not; that's an easy way to check the effect of adding the restrict keyword.
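As a rough sketch of how such a check could look, here is a small kernel that is a plausible vectorization candidate, with the options mentioned above collected in a comment. The function `scale_add`, the file name `scale_add.c` and the `-std=c99` flag (needed on older GCC so that `restrict` is accepted as a keyword) are assumptions, not part of the question.

```c
/* Assumed build line for GCC 4.x on 32-bit x86, using the options from
 * this answer (adjust -march to your processor):
 *
 *   gcc -std=c99 -O2 -march=core2 -mfpmath=sse -ftree-vectorize \
 *       -ftree-vectorizer-verbose=1 -c scale_add.c
 *
 * Build once as written and once with the restrict qualifiers removed,
 * then compare the vectorizer report and the object code.
 */
#include <stddef.h>

/* Hypothetical kernel: dst[i] += k * src[i] for i in [0, n). */
void scale_add(double *restrict dst, const double *restrict src,
               double k, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

Newer GCC releases report vectorizer decisions through -fopt-info-vec instead of -ftree-vectorizer-verbose.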

janneb 2010-01-05 13:55:56

I'm using 32 bit Ubuntu on a Pentium M with -march=native. I added fast-math. The code I posted doesn't use SSE. Using the suggested options doesn't seem to help. I still get identical binaries in both cases.

Robert S. Barnes 2010-01-06 09:57:54

Ah indeed, so it seems. I'm sure you also saw the reason for the failure to vectorize with the help of the -ftree-vectorizer-verbose= option. Which explains why Drepper had to resort to intrinsics for the vectorized version of his code. So you need to come up with another example.

janneb 2010-01-06 13:42:18

Answer 4

A:

Also, GCC 4.0.0 through 4.4 have a regression that causes the restrict keyword to be ignored. This bug was reported as fixed in 4.5 (I lost the bug number though).

Fredrik Berg Kjolstad 2010-08-24 02:23:33

Answer 5

A:

If there is a difference at all, moving mm to a separate DSO (so that gcc can no longer know everything about the calling code) will be the way to demonstrate it.

Logan Capaldo 2010-08-24 02:30:20
