views:

93

answers:

2

Let's say I declare this:

int [a][b][c];

Does a stand for the level/page, b the row and c the column?

Or can it be anything I want it to be i.e. a stands for column, b stands for page, and c stands for row (since data is just data and a cube can be abstracted any way)?

+8  A: 

Anything you want in any order you want, the designations row and column are just convention. The layout in memory will be http://en.wikipedia.org/wiki/Row-major_order in C++, That part can't be changed.

How you iterate through the elements will have an impact on performance in many cases. You want to change the rightmost index in your innermost loop to move through memory sequentially.

John Knoeller
+1. nicely said.
Mitch Wheat
A code sample of do-s and don't-s would be nice.
Hamish Grubijan
I don't think the layout in memory looks like that for C/C++ multidimensional arrays. The first dimension is an array of pointers, each of which points to a one-dimensional array; there's no contiguous memory layout you can depend on.
Carl Norum
Carl, no pointers are involved, only arrays. Array elements are stored contiguously.
avakar
You still want to iterate like 000,001,002 .. and not like 000, 100, 200, 300 ... if that makes sense.
Hamish Grubijan
@Carl: that's not true. C and C++ both let you define multidimensional arrays as contiguous memory. The restriction is that when passing them as parameters, all but the last dimension (so 2 out of 3 in this example) are part of the type, and hence only the last one can be variable at runtime.
Steve Jessop
@Hamish: yes, that's what I meant by rightmost index in your innermost loop.
John Knoeller
@Steve, is that really guaranteed somewhere? I'm reading the spec, and section 6.5.2.1 paragraph 4 seems to be saying the opposite of what you're saying.
Carl Norum
No way is this stuff guaranteed to be laid out contiguously. Each foo[bar] is really going a *(foo+bar). So you are dereferencing a pointer at each dimension. So you can't be at all sure that the elements in [0][0][0] and [0][1][0] bear any specific relation to each other.
Southern Hospitality
C99:6.5.2.1/4 says: Consider [...] `int x[3][5];`. Here `x` is a 3×5 array of ints; more precisely, `x` is an array of three element objects, each of which is an array of five ints.
avakar
@Southern Hospitality: I never said it's guaranteed to be contiguous, only that the right most index will be most compact in memory. I've never actually seen a compiler where it _isn't_ perfectly contiguous, however, and I doubt that you have either.
John Knoeller
Southern, if elements of `foo` weren't laid out contiguously, what would `foo+bar` mean? For `int a[4][4][4]`, `a[0][0]` and `a[0][1]` are laid next to each other exactly *because* `a[0][0] == *(a[0] + 0)` and `a[0][1] == *(a[0] + 1)`. It doesn't matter that `a[0][0]` is of type `int[4]`.
avakar
@Southern: You shoudn't be downvoting my answer because you have an issue with avakar's comments.
John Knoeller
Southern Hospitality, row-major order *is guaranteed*. 6.5.2.1.3 says "It follows from this that arrays are stored in row-major order (last subscript varies fastest)." Arrays convert to pointers to their first element. They *are not* pointers.
Matthew Flaschen
Obviously it depends what you declare, but it's pretty easy to test. Define an array `int x[3][5]` as in the example in the C standard, and print out the addresses of `x`, `x[1]`, `x[2]`, `x[1][1]` etc. The standard perhaps isn't as clear as it might be - when it says "successive subscript operators..." in para. 3, it's defining an *alternative* to para. 2 for multi-dimensional arrays. If para. 3 just meant the same as para. 2, then it would be redundant. So if `x` is an `int***`, then `x[1][1][1]` means a completely different thing (para 2) than if `x` is a multi-dimensional array (para 3).
Steve Jessop
@John: arrays are contiguous (6.2.5/20), and a multi-dimensional array is an array of arrays ("These methods of constructing derived types can be applied recursively"). It is guaranteed, so although you didn't say it was, you could have.
Steve Jessop
@Carl: oh, and also checkout `sizeof(x)`. No space for any pointers. Possibly what you're missing in 6.5.2.1/4 is quite how many times conversion is happening between array and pointer. It says "indirection is applied to yield an array". This 'indirection' doesn't involve actually following a pointer, it's purely a type conversion, which is "undone" as soon as the next index is considered and the array decays back to a pointer. On a more careful reading, para 3 doesn't *contradict* para 2, it expands on it in the special case.
Steve Jessop
Knoeller:I'm not downvoting you because of avakar's comments. I'm downvoting you because you explicitly state the memory organization of the array and the standard doesn't support you, it is undefined.I don't buy Jessop's answer either, recursive or not, the standard doesn't specify the padding at the end of an array. So even if you think that [0][0] is adjacent to [0][1] there is no defined memory relationship between them because there might be padding between. Come on, if you index past the end of an array, you are ALWAYS taking your life in your own hands. Solid code doesn't.
Southern Hospitality
@Steve, I made some test programs; this is blowing my mind. I'm going to have to check out some non-GCC compilers and see if I get consistent results.
Carl Norum
Southern, no one is saying that there can't be padding at the end of an array (I do think it is guaranteed, but it is not the topic of this discussion), nor is anyone saying that you can index beyond the bounds of an array (which you certainly can't, whether there's padding or not). The point is that the ints are ordered in memory a certain way, the element `a[0][0][1]` is closer to `a[0][0][0]` than `a[0][1][0]`. This layout is guaranteed and has in fact a major impact on performance of algorithms that traverse arrays.
avakar
[0][0] *is* adjacent to [0][1], the relationship is defined in 6.2.5/20, where arrays are stated to be contiguous. With `int x[3][5]` there might be padding between `x[0][4]` and `x[1][0]`, if `sizeof(int[5]) > 5 * sizeof(int)`, but if so it's exactly the same padding every time. The point is that everything is laid end to end, as a 1-D array of 1-D arrays. There is no "dereferencing a pointer at each dimension", except as described in 6.5.2.1/4, where a pointer-to-array is dereferenced to give an array, which immediately decays to pointer-to-first-element, and the next index is added.
Steve Jessop
+5  A: 

If you define:

int my_array[10][10][10];

One thing that's required is the meaning of the indices with respect to storage. my_array[1][2][3] is adjacent in memory to my_array[1][2][4], but not to my_array[1][3][3]. my_array[2][2][3] is further away still. This can affect performance as you increment one index or another - incrementing the last index is generally faster since you get more cache hits.

Example:

const int K = 400;
int my_array[K][K][K];

int main() {
    for (int i = 0; i < K; ++i) {
        for (int j = 0; j < K; ++j) {
            for (int k = 0; k < K; ++k) {
                #ifdef FAST
                    my_array[i][j][k] = 12;
                #else
                    my_array[k][j][i] = 12;
                #endif
            }
        }
    }
}

Output:

$ g++ looporder.cpp -o looporder && time ./looporder

real    0m2.500s
user    0m2.249s
sys     0m0.155s

$ g++ looporder.cpp -o looporder -DFAST && time ./looporder

real    0m0.516s
user    0m0.327s
sys     0m0.124s

$ g++ looporder.cpp -o looporder -O3 && time ./looporder

real    0m2.234s
user    0m2.140s
sys     0m0.093s

$ g++ looporder.cpp -o looporder -DFAST -O3 && time ./looporder

real    0m0.250s
user    0m0.171s
sys     0m0.108s

What the indices mean in terms of what's actually stored there is up to you - as you say it's just a question of "which way up" your cube is. So generally you choose them so that whatever you're incrementing in your innermost loop, is the last dimension.

Steve Jessop
Very nice illustration!
Hamish Grubijan