ansaurus

Question

Answer 1

A:

Most of your data reads are reads of the loop variable i:

21 reads from the conditional i<20 (20 tests that pass, plus the final one that fails).
20 reads from i++.
20 reads from i in the lvalue arr[0][i].
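The tally above can be checked with a small sketch. The loop from the question is not shown here, so this is a hypothetical reconstruction: only `arr[0][i]`, `i<20` and `i++` come from the answers, while the array shape `arr[2][20]` and the names `read_counts` and `count_reads_of_i` are assumptions made for illustration. It counts reads at the source level, which is roughly what an unoptimised build does when i lives in memory.

```c
#include <assert.h>

/* Hypothetical helper type: one counter per place where i is read. */
struct read_counts {
    int cond;   /* reads of i in the test i < n   */
    int incr;   /* reads of i in i++              */
    int index;  /* reads of i in arr[0][i]        */
};

/* Runs the assumed loop `for (i = 0; i < n; i++) arr[0][i] = 0;`
   and tallies every read of i by hand. */
static struct read_counts count_reads_of_i(int n) {
    struct read_counts c = {0, 0, 0};
    int arr[2][20] = {{0}};   /* assumed shape; only row 0 is written */
    int i = 0;

    assert(n >= 0 && n <= 20);
    for (;;) {
        c.cond++;             /* the test reads i, even when it fails */
        if (!(i < n))
            break;
        c.index++;            /* indexing arr[0][i] reads i */
        arr[0][i] = 0;
        c.incr++;             /* i++ reads i before writing it back */
        i++;
    }
    (void)arr;                /* the array contents are not needed here */
    return c;
}
```

Calling `count_reads_of_i(20)` gives 21, 20 and 20, for 61 reads of i in total.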

I'm not up to date on how caches work, but assuming an array of 32-bit ints, your 20 writes cover 80 bytes, which with typical 64-byte cache lines is only two lines (three if the array straddles a line boundary). Wild guess: those lines are your two write misses.

If you unroll the loop by hand, i is no longer read at all, and you will see the read counts collapse to small numbers.

arr[0][0]=0; 
arr[0][1]=0;
..

Erik Olson 2010-10-26 16:25:51

Answer 2

A:

I think the counts quoted above may be misleading: the snippet was taken from inside a larger program, so other variables contributed to them as well.

anup 2010-10-26 19:05:28

I was able to reproduce your counts.

Erik Olson 2010-10-26 22:11:17

Answer 3

A:

As Erik Olson says, the 41 reads on the for line are all reads of i: 21 in the i < 20 test and 20 in the i++ (if you compile with optimisation, these should reduce).

There are two L2 write misses because your 20 integers cover 80 bytes (assuming 4-byte ints), which is at best two cache lines if the lines are 64 bytes. Depending on the alignment of the array, it might cover 3 cache lines, which would cause three write misses.
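The alignment argument can be worked through with a little arithmetic. This is a minimal sketch, assuming 64-byte cache lines and 4-byte ints (neither is stated in the question); `lines_spanned` is a hypothetical helper, not part of Valgrind.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines touched by `size` bytes that begin at byte
   address `start`, for lines of `line_size` bytes. */
static size_t lines_spanned(size_t start, size_t size, size_t line_size) {
    size_t first;
    size_t last;

    assert(size > 0 && line_size > 0);
    first = start / line_size;              /* line holding the first byte */
    last = (start + size - 1) / line_size;  /* line holding the last byte  */
    return last - first + 1;
}
```

Under those assumptions, an 80-byte array that starts at an offset of 0 to 48 bytes into a line touches two lines, while one that starts at 52, 56 or 60 touches three, for example `lines_spanned(0, 80, 64)` versus `lines_spanned(52, 80, 64)`.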

caf 2010-10-27 11:41:06


Valgrind output interpretation
