tags:

views:

362

answers:

6

Edit: I've removed the faster/more efficient from the question title as it was misleading..my intention was not optimisation but understanding arrays. Sorry for the trouble!

int array[10][10], i, j;

for(i=0;i<10;i++)
{
    for(j=0;j<10;j++)
        std::cin>>array[i][j];
}

Versus

int array[10][10], i, j;

for(i=0;i<10;i++)
{
    for(j=0;j<10;j++)
        std::cin>>array[j][i];
}

I'm pretty sure the answer has to do with how arrays are implemented on a hardware level; that the [ ][ ] syntax is just a programmer's abstraction to aid visualisation/modelling. However, I forgot which of the above code accesses the memory block sequentially from start till end...

Thanks for all the answers...

Just to confirm my understanding, does this mean the first code is equivalent to

int array[10][10], k;

for(k=0;k<100;k++)
{
    std::cin>>*(array+k);
}
+4  A: 

The first one has better spatial locality. Because the first subscript is the index of a subarray, and the second subscript is the element within that subarray; so when you fix i and vary the j's, you are looking at elements that are all within a subarray and thus are close together; but when you fix j and vary the i's, you are looking at elements that are 10 (the length of a subarray) apart, so are very spread out.

newacct
+2  A: 

I agree that this is premature optimization, but...

C++ stores matrices in row major order. This will tend to cause the first syntax to be faster (on most hardware), since you're accessing the array in order in memory, and preserving locality in your data access.

For details on array storage, see this article.

Reed Copsey
Does this mean the first code is equivalent to cin>> *(array+k) where k loops from 0 to 100?
Yew Long
It's exactly: " cin>>*(array + 10*i + j)", which is effectively *(array+k). In the second case, it's not going in order, though.
Reed Copsey
Thank you! I think I understand now
Yew Long
+10  A: 

Aside from the fact that waiting on getting user input will be signifficantly slower than the array access, the first one is faster.

Check out this page on 2D array memory layout if you want more background on the subject.

With the second one you are checking A[0], A[10] ... A[1], A[11].

The first is going sequentially A[0], A[1], A[2] ..

JensenDied
This is what I wanted to confirm, thanks!
Yew Long
+1  A: 

The correct way to iterate an array in C++ is the first method. Iterating an array "outside-in" as you have done in the second example will tend to be slower, because on each iteration of the inner loop you are getting widely different memory locations. The array is laid out sequentially row-fashion, the first method iterates by the inner rows. This gives the code the best chance of effectively using the CPU cache, by loading a row at a time into the cache and not requiring to invalidate all the time.

1800 INFORMATION
+1  A: 

For small arrays its unlikly to make any difference!

However for larger arrays the first option will be faster as it will access the whole array in the sequence as it is physically stored in memory. Additionally this will allow optimising compilers access to a whole back of tricks to speed this up further.

The second option bounce all over memory for every single pass of the array, this will reduce your cache hit rate and in the worst case involve you in lots of I/O paging virtual memory on and out.

James Anderson
A: 

You always want to access memory locations in groups based on locality, so first option. Locality will come into play at three different layers:

  1. Processor cache. You want to access together memory location that fall into the same cache line. Reading the first element takes the cache miss read hit, but subsequent 2-3 reads fall into the already cached content.
  2. Translation Lookaside Buffers. You want to access locations in the same page (usually 4k) so that the virtual-to-physical translation is already in the processor tlb
  3. Virtual Pages. You want to access locations on the same page so that the page is kept in the process working set and not moved to the standby list or even swapped out

Things vary from processor to processor and from OS to OS, but only by small margins, unless you're talking some exotic platforms.

Remus Rusanu