views:

274

answers:

3

I am doing lots of matrix arithmetic and would like to take advantage of C99's restrict pointer qualifier.

I'd like to setup my matrices as pointers to pointers to allow for easy subscripting, like so:

int **A = malloc (ncols * sizeof(int *));
A[0] = malloc (nrows * ncols * sizof(int));
for (int i=1; i < ncols; i++) {
    A[i] = A[0] + i*nrows;
}

Now, for a matrix multiplication function

void mmultiply ( int nrows, int ncols, int **Out, int **A, int **B);

must I qualify both pointers of the arguments as restricted? It's valid syntax, but I'm having a hard time determining if int *restrict *restrict behaves any differently than int **restrict.

Then, with the pointers properly restricted, is accessing elements through A[0][col*nrows + row] undefined? (ie, will the compiler assume that I only access the matrix through A[col][row] for values of row such that row < nrow)? Or must I simply remain consistent?

+2  A: 

The outer (second) restrict tells the compiler that none of the arrays of pointers (A, B, and out) alias. The inner (first) restrict tells the compiler that none of the arrays of ints (pointed to by elements of the arrays of pointers) alias.

If you access both A[0][col*nrows + row] and A[col][row] then you're violating the inner restrict, so things might break.

Chris Dodd
+2  A: 

For the first question, "yes", it will mean something different if you use both restrict qualifiers, specifically, that the pointers also won't be aliased. As to whether it makes any difference: theoretically yes, in practice, it depends on the optimizer.

For the second question, "yes", it will assume that anything accessed through a row pointer is only accessed through the row pointer.

You could throw const in there too.

Finally, if this is gcc at -O2, -O3, or -Os, the compiler is already doing an alias analysis based on types. I'm sure other compilers do this also. This means that restricting the pointers vs the ints is already understood, leaving only the arrays that could possibly store to each other.

In sum, the optimizer will assume that the pointers aren't being stored into as ints, and it knows it isn't doing any pointer writes during the loop.

So you will probably get the same code with only the one restrict.

DigitalRoss
Something to be careful of is that most compilers are much more lax than GCCx86 about aliasing. MSVC05, for example, will even consider an `int *` and a `float *` to possibly alias, unless you use restrict to tell it otherwise. (yes, I know this isn't per the spec, but the compiler does it anyway, as I learned from looking at /FAcs.)
Crashworks
+2  A: 

int **restrict only asserts that the memory addressed by Out, A and B don't overlap (except that A and B can overlap, assuming your function doesn't modify either of them). This means the arrays of pointers. It doesn't assert anything about the contents of the memory pointed to by Out, A, and B. Footnote 117 in n1124 says:

if identifier p has type (int **restrict), then the pointer expressions p and p+1 are based on the restricted pointer object designated by p, but the pointer expressions *p and p[1] are not.

By analogy with const, I suspect that qualifying with restrict twice will assert what you want, which is that none of the values in the array points to overlapping memory. But reading the standard, I can't prove to myself that it actually does. I reckon that "Let D be a declaration of an ordinary identifier that provides a means of designating an object P as a restrict-qualified pointer to type T" does indeed mean that for int *restrict *restrict A, then A[0] and A[1] are objects designated as a restrict-qualified pointer to int. But it's pretty heavy legalese.

I have no idea whether your compiler will actually do anything with that knowledge, mind you. Clearly it could, it's a question of whether it's implemented.

So I don't really know what you've gained over a conventional C 2-D array, where you just allocate rows * cols * sizeof(int), and index with A[cols*row + col]. Then you clearly only need one use of restrict, and any compiler that does anything with restrict will be able to re-order reads from A and B across writes to Out. Without restrict, of course, it can't, so by doing what you're doing, you're throwing yourself on your compiler's mercy. If it can't cope with double-restrict, only the single restrict case, then your double-indirection has cost you the optimization.

At first guess, multiplication is likely to be faster than an additional pointer indirection anyway. You obviously care about performance or you wouldn't be using restrict at all, so I'd test performance fairly carefully (on all compilers you care about) before making this change for the sake of slightly nicer syntax and not having to remember how many columns there are in your array every time you access it.

is accessing elements through A[0][col*nrows + row] undefined?

Yes, if the element is modified by one of the accesses, because this makes A[0] an alias for memory also accessed via A[col]. That'd be fine if only A and B were restrict-qualified pointers, but not if A[0] and A[col] are.

I assume that you don't modify A in this function, so actually that alias is fine. If you did the same thing with Out, though, behavior would be undefined.

Steve Jessop
Thank you for the thoughtful response. You are right about the multiplication being faster than the pointer indirection; I had already tested that to be generally true. However, I'm planning on bundling the dimensions with the data into a matrix struct to pass around. Not surprisingly, dereferencing `nrows` from the struct pointer and multiplying costs the same as dereferencing the data pointers. In functions that warrant it, I can dereference whichever pointers in advance and use 1-D array indexing. And if I'm doing that anyways, I may as well provide the nicer [col][row] syntax elsewhere.
Matt B.