ansaurus

Question

Answer 1

+7 A:

Probably because 64 bytes is the cache line size on your machine, so each iteration basically runs entirely out of a single cache line.

Anteru 2009-01-23 09:54:27

Answer 2

+3 A:

I'm not sure what you're asking. Are you just checking what we know? Why bother? Or are you asking us because you can't explain the boost at 64 bytes yourself? Or is it to find out whether this is a good interview question, or...?

Anyway, I can't say I like your code. If the intention is simply to provide a background for your interview question, you should remove all unnecessary code. Is HeapAlloc important? Why couldn't the array be declared on the stack? Or with new/malloc?

Why do you need to do error handling in this little test program? Again, it just distracts and adds more noise. The same goes for the QPC calls. And I won't even ask why you need a precompiled header in this.

To answer a question about a 6-line loop, the interviewee has to sift through 16 lines of irrelevant noise. Why?

And as mentioned in a comment, the output table is basically unreadable.

I'm all for seeing if an interviewee can read code, but I don't see the point of making a question about performance and cache characteristics so hard to read.

In any case, at 64 bytes, one INNER array fits exactly into a cache line on most CPUs, which means that exactly one cache line has to be read for each iteration of OUTER.
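To make the "exactly one cache line" point concrete, here is a minimal sketch (C++14 or later) that counts how many cache lines a block of bytes covers. It assumes a 64-byte line size, which is common but not universal, and the names `kCacheLineSize` and `cache_lines_touched` are made up for illustration; they are not from the original code.

```cpp
#include <cstddef>

// Assumption: 64-byte cache lines. Check your own CPU; this is not universal.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical helper: number of distinct cache lines covered by `size` bytes
// starting at byte address `start`.
constexpr std::size_t cache_lines_touched(std::size_t start, std::size_t size) {
    if (size == 0) return 0;
    const std::size_t first = start / kCacheLineSize;              // line holding the first byte
    const std::size_t last = (start + size - 1) / kCacheLineSize;  // line holding the last byte
    return last - first + 1;
}
```

Under these assumptions, a 64-byte block starting on a line boundary covers one line, a 65-byte block covers two, and a 64-byte block that starts 32 bytes into a line also covers two. So the one-line-per-iteration argument only holds if the array is aligned to the cache line size.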

jalf 2009-01-23 10:41:50

I posted the whole code here so that you could just copy and compile it, in case you wanted to. In an actual interview, of course, I show only the loop. Sorry if I hurt your feelings :)

Quassnoi 2009-01-23 19:48:13

Answer 3

A:

Just to clarify, some of the spaces in his table should actually be commas. You can all figure it out.

PiNoYBoY82 2009-01-23 16:45:07

Cache, loops and performance