views: 1237 · answers: 10

I want to calculate time elapsed during a function call in C, to the precision of 1 nanosecond.

Is there a timer function available in C to do this?

If yes, please provide a sample code snippet.

Pseudo code

Timer.Start()
foo();
Timer.Stop()
Display time elapsed in execution of foo()


Environment details: gcc 3.4 on a RHEL machine

+1  A: 

Any timer functionality is going to have to be platform-specific, especially with that precision requirement.

The standard solution in POSIX systems is gettimeofday(), but it has only microsecond precision.

If this is for performance benchmarking, the standard way is to make the code under test take enough time to make the precision requirement less severe. In other words, run your test code for a whole second (or more).

unwind
+1  A: 

There is no timer in C with guaranteed 1-nanosecond precision. You may want to look into clock(), or better yet, the POSIX gettimeofday().

Evan Teran
There is no timer in ANSI, yes, but POSIX2001 specifies `clock_gettime()`
ebencooke
So what you're saying is that there was nothing incorrect about my answer, but that it could have mentioned something in addition? That's a very lame reason to downvote.
Evan Teran
A: 

I don't know if you'll find any timers that provide resolution to a single nanosecond -- it would depend on the resolution of the system clock -- but you might want to look at http://code.google.com/p/high-resolution-timer/. They indicate they can provide resolution to the microsecond level on most Linux systems and in the nanoseconds on Sun systems.

tvanfosson
+4  A: 

On Intel and compatible processors you can use the rdtsc instruction, which can easily be wrapped in an asm() block of C code. It returns the value of a built-in processor cycle counter that increments on each cycle. You get high resolution, and such timing is extremely fast.

To find out how fast this counter increments you'll need to calibrate: call the instruction twice across a fixed time period, like five seconds. If you do this on a processor that shifts frequency to lower power consumption, you may have problems calibrating.

sharptooth
A: 

Making benchmarks on this scale is not a good idea. At the very least you have the overhead of getting the time itself, which can render your results unreliable when you are working in nanoseconds. You can either use your platform's system calls, or (preferably) boost::Date_Time at a larger scale.

soulmerge
+3  A: 

May I ask what kind of processor you're using? If you're using an x86 processor, you can look at the time stamp counter (TSC). This code snippet:

#define rdtsc(low,high) \
     __asm__ __volatile__("rdtsc" : "=a" (low), "=d" (high))

will put the low and high 32 bits of the cycle count into low and high respectively (it expects two 32-bit unsigned integers; you can combine the result into a long long int) as follows:

inline void getcycles (long long int * cycles)
{
  unsigned int low, high;   /* rdtsc writes 32 bits each into eax and edx */
  rdtsc(low, high);
  /* combine as unsigned to avoid sign-extending the high word */
  *cycles = ((unsigned long long) high << 32) | low;
}

Note that this returns the number of cycles your CPU has performed. You'll need to get your CPU speed and then figure out how many cycles per ns in order to get the number of ns elapsed.

To do the above, I've parsed the "cpu MHz" string out of /proc/cpuinfo, and converted it to a decimal. After that, it's just a bit of math, and remember that 1MHz = 1,000,000 cycles per second, and that there are 1 billion ns / sec.

FreeMemory
This is what we use where I work for that purpose.
T.E.D.
+1  A: 

Can you just run it 10^9 times and stopwatch it?

Mike Dunlavey
I knew that would get a drive-by downvote. It's too simple.
Mike Dunlavey
Actually, it's not a bad idea in a pinch. The problem is that your answer may come out much quicker than what you'd get timing a single run, because repeated runs don't have to refetch things into cache every time.
T.E.D.
@T.E.D.: Yeah, there is that issue, that confuses all kinds of timing measurements - i.e. do you want to time it before or after it warms up?
Mike Dunlavey
A: 

Use clock_gettime(3). For more info, type man 3 clock_gettime. That being said, nanosecond precision is rarely necessary.

ebencooke
A: 

You can use standard system calls like gettimeofday(), if you are certain that your process gets 100% of the CPU time. I can think of many situations in which, while you are executing foo(), other threads and processes might steal CPU time.

Alphaneo
A: 

You are asking for something that is not possible this way. You would need HW level support to get to that level of precision and even then control the variables very carefully. What happens if you get an interrupt while running your code? What if the OS decides to run some other piece of code?

And what does your code do? Does it use RAM memory? What if your code and/or data is or is not in the cache?

In some environments you can use HW level counters for this job provided you control those variables. But how do you prevent context switches in Linux?

For instance, in Texas Instruments' DSP tools (Code Composer Studio) you can profile code very precisely, because the whole debugging environment is set up so that the emulator (e.g. a Blackhawk) receives information about every operation run. You can also set watchpoints, which on some processors are coded directly into a HW block inside the chip. This works because the memory lanes are also routed to this debugging block.

They do offer functions in their CSLs (Chip Support Library) that do what you are asking for, with a timing overhead of only a few cycles. But this is only available for their processors and is completely dependent on reading the timer values from the HW registers.

Makis