Lets consider two almost identical codes:
First
for (int k=0;k<1000;k++)
{
for (int i=0;i<600;i++)
{
for (int j=0;j<600;j++)
{
tab[i][j] = i *j;
}
}
}
Second
for (int k=0;k<1000;k++)
{
for (int i=0;i<600;i++)
{
for (int j=0;j<600;j++)
{
tab[j][i] = i *j;
}
}
}
In the second one instead of tab[i][j] we have tab[j][i].
The first code is much faster.
Question
Why is the first code much faster?
My intuition
Is it because when a program tries to access a cell , firstly whole block that contains this cell is move to the cache, and then it is accessed via cache. Since array in memory is represented by consecutive cells, then in first case then in first case there is much less access to the memory than in the second case.