ansaurus

Question

Answer 1

+16 A:

Gavin Miller 2009-10-22 22:39:20

... it's like a matrix of chairs in a cinema... visiting each chair by traversing row by row is faster than column by column...

egon 2009-10-22 22:45:12

Without Cache however, the order of traversing through Random-Access Memory (RAM) doesn't matter (assuming all the array is on the RAM) - "The word random thus refers to the fact that any piece of data can be returned in a constant time, regardless of its physical location and whether or not it is related to the previous piece of data.[1]" http://en.wikipedia.org/wiki/Random-access_memory

Liran Orevi 2009-10-23 00:10:36

Answer 2

+1 A:

It is very likely related to the cache hits/misses. The difference lies in sequential vs scattered access that lies in size above the size of one cache line.

For plain c++ loops, it would also help to make the loops backwards to gain a bit of performance on the loop. Not sure how it fits for .NET.

jdehaan 2009-10-22 22:40:16

Why does it help to make the loops backwards?

Liran Orevi 2009-10-23 00:11:21

If you have a look at the assembly code the test is easier. When looping down to 0 the test is easy because you decrement and test the Z flag of the CPU. By comparing to another limit you have to add an extra CMP (for X86 CPUs as an example)

jdehaan 2009-10-23 05:18:32

Answer 3

+4 A:

Locality, locality, locality of data. From Wikipedia (which says it better than I would have):

Linear data structures: Locality often occurs because code contains loops that tend to reference arrays or other data structures by indices. Sequential locality, a special case of spatial locality, occurs when relevant data elements are arranged and accessed linearly. For example, the simple traversal of elements in a one-dimensional array, from the base address to the highest element would exploit the sequential locality of the array in memory.[2] The more general equidistant locality occurs when the linear traversal is over a longer area of adjacent data structures having identical structure and size, and in addition to this, not the whole structures are in access, but only the mutually corresponding same elements of the structures. This is the case when a matrix is represented as a sequential matrix of rows and the requirement is to access a single column of the matrix.

Ed Swangren 2009-10-22 22:40:33

Answer 4

A:

I remember reading about this in Code Complete. In most languages, arrays are set up with the last index set up sequentially, so you're accessing bytes directly in a row when iterating over last index, instead of skipping around when iterating over the first.

Kaleb Brasee 2009-10-22 22:41:25

The last index is the one where the data would be sequentially ordered, not the first.

Mike Daniels 2009-10-22 22:42:59

Ah yeah, you're right.

Kaleb Brasee 2009-10-22 22:51:43

Answer 5

+1 A:

Your intuition is right, it is a caching issue. @Mike Daniels post to question below is essentially describing the exact same issue. The second bit of code will get far more cache hits.

http://stackoverflow.com/questions/997212/fastest-way-to-loop-through-a-2d-array

But, shhhh we're not supposed to care about performance right? :)

BobbyShaftoe 2009-10-22 22:42:46

This code is being written for a performance competition in C#, so it's absolutely crucial. Can't believe I didn't think of memory storage.

Qua 2009-10-22 22:52:57

@Qua, yeah I was just being facetious. The current party line among many people seems to be that performance no longer matters. But that's just silly.

BobbyShaftoe 2009-10-23 00:08:32

Answer 6

A:

I would also think that the relative sizes of arrays a and b would make a difference.

If a.length is large and b.length is small, the second option should be faster. Conversely, if a.length is small and b.length is large, the first option would be faster. The issue is avoiding the setup/teardown cost of the inner loop.

BTW, why do you have

int aLen = a.Length;

But then also call a.Length directly? Seems like you should choose one or the other.

Slaggg 2009-10-22 22:44:35

While profiling the code trying to figure out what was happening I played around with caching the array lengths, what you're seeing are scattered pieces of that attempt. There was no optimization gain, so I eventually got rid of it.

Qua 2009-10-22 22:52:06

Why if a.length is large and b.length is small, the second option should be faster?

Liran Orevi 2009-10-23 00:14:04

ansaurus

tags:

views:

answers:

Why Does This Improve Performance?

related questions