All Intel CPUs from the last decade (at least) include a set of performance monitors that count a variety of events. Do the latest Intel CPUs, Core i3, i5 and i7 (aka Nehalem), provide a mechanism to count Instructions Per Clock (IPC)? If so, how are they used?

If this is possible, I'll probably be writing the code for this in Assembly, but Windows or Linux system calls may also come in useful.

A: 

IPC is becoming less meaningful now that the current crop of CPUs can execute multiple instructions per clock.

From an i7 propaganda document:

The chip boasted a wider execution core, allowing the processor to complete up to four full instructions simultaneously, along with a more efficient 14-stage pipeline improving IPC (instructions per clock) in comparison to Pentium 4/D

Those IPC counts all depend on the type of code that is being executed.

Dekker500
That's why you would *MEASURE* it, instead of looking it up in the CPU spec sheet.
Ben Voigt
A: 

There are two interesting things confounding your calculations. They aren't fatal, but you should keep them in mind.

One is variable clocking. You may know how many instructions are being executed, but if the CPU speed changes under you (say, because it is waiting on I/O), those instructions won't run as fast as they would when the CPU is calculating full-out.

The other is related to the clocking. You may be able to count the number of cycles since boot with rdtsc or a similar instruction, but since the core frequency is changing, you can't easily turn that into a time unit.

I'd try a more typical benchmark to start, though; an algorithmic fix will almost always have a lot more gas in it than an assembly optimization.

Paul Rubel
Don't forget TurboBoost, too: http://en.wikipedia.org/wiki/Intel_Turbo_Boost
Gabe
I'm not interested in instructions per second, but specifically in instructions per cycle. Anyway, if this turns out to be a problem I can always control the frequency and disable turbo.
Nathan Fellman
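
For anyone who wants to measure instructions per cycle directly, as discussed above, here is a minimal sketch in C using the Linux perf_event_open system call. It assumes a Linux kernel with perf events available and sufficient permission to use them (see /proc/sys/kernel/perf_event_paranoid). The helper names compute_ipc, open_counter, measure_ipc and spin are made up for illustration and are not part of any library. The cycle event used here is generally mapped to core cycles on Intel, so the ratio should not depend on the clock frequency, but treat that as an assumption to verify on your hardware.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* IPC from two raw counts; returns 0.0 when no cycles were counted. */
double compute_ipc(uint64_t instructions, uint64_t cycles)
{
    return cycles ? (double)instructions / (double)cycles : 0.0;
}

/* Open one user-mode hardware counter for the calling thread.
   group_fd == -1 creates a group leader, which starts disabled. */
static int open_counter(uint64_t config, int group_fd)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = config;
    attr.disabled = (group_fd == -1);
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    /* pid = 0, cpu = -1: this thread, on whichever CPU it runs. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0UL);
}

/* Hypothetical workload, used only for illustration. */
void spin(void *unused)
{
    (void)unused;
    volatile uint64_t x = 0;
    for (uint64_t i = 0; i < 1000000; i++)
        x += i;
}

/* Run work(arg) with both counters enabled.
   Returns 0 and stores the IPC on success, -1 if counters are unavailable. */
int measure_ipc(void (*work)(void *), void *arg, double *ipc_out)
{
    uint64_t cycles = 0, instructions = 0;
    int rc = -1;
    int cyc_fd = open_counter(PERF_COUNT_HW_CPU_CYCLES, -1);
    if (cyc_fd < 0)
        return -1;
    int ins_fd = open_counter(PERF_COUNT_HW_INSTRUCTIONS, cyc_fd);
    if (ins_fd >= 0) {
        /* Reset and start the whole group so both counters cover the same span. */
        ioctl(cyc_fd, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
        ioctl(cyc_fd, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
        work(arg);
        ioctl(cyc_fd, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);
        if (read(cyc_fd, &cycles, sizeof cycles) == (ssize_t)sizeof cycles &&
            read(ins_fd, &instructions, sizeof instructions) == (ssize_t)sizeof instructions &&
            cycles > 0) {
            *ipc_out = compute_ipc(instructions, cycles);
            rc = 0;
        }
        close(ins_fd);
    }
    close(cyc_fd);
    return rc;
}
```

Calling measure_ipc(spin, NULL, &ipc) should fill in ipc for the spin loop. A return value of -1 most likely means the counters could not be opened, for example because of permissions or because the machine (often a VM) exposes no performance monitoring unit.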