views:

139

answers:

2

Hi everyone,

I was wondering where would one go about getting CPU opcode cycle counts for various machines. An example of what I'm talking about can be seen at this link:

http://www.obelisk.demon.co.uk/6502/reference.html

If you examine the MAME source code, especially under src\emu\cpu, you'll see that most of the CPU models keep a track of the cycle count in a similar way. My question is where does one go about getting this information, or reverse engineering it if its not available? I've never seen any 'official' ASM programmer's guide contain cycle count info. My initial guess is that a small program is thrown into the real hardware's bootrom, and if it contains an opcode equivalent to RDTSC, something like this is done:

RDTSC

//opcode of choosing

RDTSC

But what would you do if such support wasn't available? I know for older hardware the MAME team has no access to anything but the roms, and scattered documentation.

Thanks in advance!

+3  A: 

Up through about the Pentium, cycle counts were easy to find for Intel and AMD processors (and most competitors). Starting with the Pentium Pro and AMD K5, however, the CPU went to a dynamic execution model, in which instructions can be executed out of order. In this case, the time taken to execute an instruction depends heavily upon the data it uses, and whether (for example) it depends on data from a previous instruction (in which case, it has to wait for that instruction to complete before it can execute).

There are also constraints on things like how many instructions can be decoded per cycle (e.g. at least one, plus two more as long as they're "simple") and how many can be retired per cycle (usually around three or four).

As a result, on a modern CPU it's almost meaningless to talk about the cycles for a given instruction in isolation. Meaningful results require a stream of instructions, so you look not only at that instruction, but what comes before and after it. An instruction that's a serious bottleneck in one instruction stream might be essentially free in another stream (e.g. if you have one multiplication mixed in with a lot of adds, the multiplication might be almost free -- but if it's surrounded by a lot of other multiplications, it might be relatively expensive).

Jerry Coffin
A: 

The accepted RDTSC count should have a serializing instruction to ensure that all previous instructions have retired before getting the count. This adds overhead to the count, but you can simply "count" zero instructions and subtract that value from the measured instructions.

Some pdf manuals that cover this very well.

http://www.agner.org/optimize/#manuals

Arthur Kalliokoski