tags:

views:

2191

answers:

6

In C, is there a difference in time and space between an m x n 2-dimensional array vs a 1-dimensional array of length mxn (for large values of m and n)? Will accessing elements be faster with a 1-dimensional array?

+2  A: 

I don't think there's any difference. Internally, c treats a two dimensional array like several one-dimensional arrays in sequence.

However, as with all things performance, your mileage may vary. There may be some kind of subtle pointer arithmetic difference. Run timed tests on both scenarios. Whichever one runs faster wins.

Robert Harvey
Can't you also implement arrays in C as e.g., `int **array` and then you `array = malloc(sizeof(int*)*rows); for (i = 0; i < rows; ++i) { array[i] = malloc(sizeof(int) * cols); }` with the advantage being that while access and creation is slower, adding rows is much quicker?
derobert
@derobert: That structure is often called a "ragged array". It has similar syntactic access, but really isn't the same as an ordinary 2d array.
dmckee
A: 

I'm only guessing, but I would say that a 1d array is faster than a 2d array. However, it wouldn't be measurably faster. Kind of like $1,000,000.01 is more than $1,000,000.

I'd use whatever is easier to code with.

dan gibson
+3  A: 

Actually, If you use the so called two-dimensional array in C the compiler will do the mapping into one-dimensional array for you. If you use one-dimensional array and you like to treat it as a two-dimensional one then you have to write the mapping yourself. The only thing that you have to take care of is you should access the array row-wise, because the C-compiler will store your two-dimensional array row after row. If you access a 'big' two-dimensional array column-wise then page-faults are likely to happen. Even if you are programming in language that supports only one-dimensional arrays, you could easily write the mapping into any number of dimensions. Take a look at this Wikipedia article if you want to do the mapping row-wise. Your mapping could be column-wise, like FORTRAN matrices for example.

AraK
+8  A: 

In C, 2-dimensional arrays are just a neat indexing scheme for 1-dimensional arrays. Just like with a 1D array, 2D arrays allocate a single block of contiguous memory, and the A[row][col] notation is just like saying A[row*NCOLS+col].

Usually if you were to implement your own multidimensional arrays using single dimensional arrays, you'd write an indexing function:

int getIndex(int row, int col) return row*NCOLS+col;

Assuming your compiler inlines this function, the performance here would be precisely the same as if you used the built in 'indexing function' of 2D arrays.

Edit: more example code for clarity...

#define NROWS 10
#define NCOLS 20

This:

int main(int argc, char *argv[]) {
    int myArr[NROWS*NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[getIndex(i,j)] = i+j;
       }
    }
    return 0;
}

Should perform the same as this:

int main(int argc, char *argv[]) {
    int myArr[NROWS][NCOLS];
    for (int i=0; i<NROWS; ++i) {
       for (int j=0; j<NCOLS; ++j) {
          myArr[i][j] = i+j;
       }
    }
    return 0;
}

Though as AraK pointed out, if you are jumping around rows alot, and the rows are very large, you could hit a lot of page faults... in that case the custom indexing function (with rows and cols switched around) could help, but so could simply changing which of the dimensions in a 2-dimensional array you treat as the rows and which you treat as the columns.

David Claridge
I think you mean row*NCOLS there!
caf
woops, thanks caf, fixed :)
David Claridge
Fix it in the first paragraph too.
Ree
+2  A: 

Robert is correct. Indexing expressions are compiled to pointer arithmetic expressions and so there's no difference.

What can have an impact is access order however, and so you may want to implement things yourself so you can control the access order. For example column first vs row first forms.

On modern processors, accessing large arrays at various strides can have unexpected performance differences. Sequential access is always fastest, and other strides can be up to 30 times slower due to cache interactions. Multi dimensional arrays where the inner dimensions are a power of two often have poor performance because of they way they interact with the cache associativity. To understand these issues there's no real substitute for doing measurement.

Jason Watkins
You don't need to write your own to decide the layout; the layout of the C built-in is well-defined. You can select which to use by writing `array[row][column]` vs. `array[column][row]`
derobert
Yes, that's true. I should have given a better example such as the various schemes for blocked matrixes.
Jason Watkins
A: 

As said by other, the difference really is how you access your items: what matters if how your items are layout in the memory, which is linear, at least on common architectures. So all you have really is 1d array, the 2d, etc... is "just" a convenience, and a reasonable compiler should optimize the indexing - but in practice, once you have more than a few variables, compilers often fail on arch like x86 because of the register starvation.

Now, it depends on your application, but I think you should think with a 1d layout by default, especially, if you need to handle multiple dimensions. The first problem with multi-dimensional arrays in C is that you cannot dynamically allocate them - if you allocate on a per-row basis, you will have awful performances because you don't have a contiguous piece of memory. See the FFTW doc for details about this.

Note that you can always describe your single piece of memory with convenient array indexing on top of it (you allocate one big nxm memory block, and then create an array of n pointer to each row).

David Cournapeau