71 views, 3 answers
I want to write a set of optimizations for gcc using genetic algorithms. For the statistics and fitness functions I need to measure the execution time of assembly functions. Ordinary time measurement can't be used, because it is influenced by caching.
So I need a table where I can see something like this:

command | operands | operand sizes | execution cycles

Am I misunderstanding something? Sorry for my bad English.

+1  A: 

With modern CPUs there are no simple tables to look up how long an instruction will take to complete (although such tables exist for some old processors, e.g. the 486). Your best information on what each instruction does and how long it might take comes from the chip manufacturer; e.g., Intel's documentation manuals are quite good (there is also an optimisation manual on that page).

Pretty much all modern x86 CPUs also have the RDTSC instruction, which reads the time stamp counter of the processor the code is running on into EDX:EAX. There are pitfalls with this too, but essentially, if the code you are profiling is representative of a real use situation and its execution doesn't get interrupted or migrated to another CPU core, you can use this instruction to get the timings you want: surround the code you are optimising with two RDTSC instructions and take the difference in TSC as the timing. (Variance in timings across different tests/situations can be large; statistics is your friend.)

PhiS
+1  A: 

You can instrument your code using assembly (rdtsc and friends) or using an instrumentation API like PAPI. Accurately measuring the clock cycles spent executing a single instruction is not possible, however - refer to your architecture's developer manuals for the best estimates.

In both cases, you should be careful to take into account the effects of running in an SMP environment.

Michael Foukarakis