views:

197

answers:

6

How do you determine what the cycle cost is for a given C library function?

Specifically, how expensive is it to call sqrt()?

I should clarify: I'm just looking for a general idea of the cost associated with calling certain C library functions. I'm assuming that professional C programmers have a basic sense of what these functions cost. For example, sqrt() might be understood to degrade performance to O(n^2).

A: 

Remember that you can benchmark these functions. This means running it a million (or whatever) times and timing how long it takes.

Alexander Rafferty
How? What's the best way to get at the actual time (besides holding a stopwatch in my hand)? Can you give me some code that will do this for me? But more to the point, I don't care so much about actual time cost. What I'm interested in is cycle counts: how expensive is a call like this? How do you determine the expense?
dvanaria
Try looking into the time.h header file. Note: I do not know if it is a C or C++ header.
Alexander Rafferty
see this [page](http://www.cs.cf.ac.uk/Dave/C/node21.html) (Example 1).
Femaref
+1  A: 

I'm afraid the days when you could determine the cost of a function call in clock cycles are long gone (for most platforms).

It might still be possible for some simple microcontrollers, but desktop processors are way too advanced for that.

The cost depends on many factors. The processor can reorder instructions and execute the instructions of your function in parallel, both with each other and with other instructions, even ones from a different thread (if the core supports simultaneous multithreading). Targets of branches are "guessed" and executed before the condition is fully evaluated; if the guess turns out to be wrong, the speculative work is discarded. The guess is based on statistics the processor gathers at run time.

The difference in execution time between the case where the code is in cache and the case where it isn't can be enormous. Even if you determined that one execution of the function took x cycles, the next time you run the code you may get y, and y can be very different from x.

If you have a simple microcontroller whose documentation says explicitly how many clock cycles each instruction takes, you can take the disassembly of your function and add up the costs of all the instructions it consists of.

Maciej Hehl
True enough, but at the same time, I think it's missing the point. Basically, the OP wants to know "how guilty do I have to feel if I'm using sqrt inside an inner loop in my code?", and THAT we can give a reasonable answer to with naive methods.
leo grrr
@leo grrr Sure, but I think making the OP realise why the question about cycle cost can't get a precise answer has some value too. I'm tired of seeing "measure" everywhere, so I decided not to write it once more :)
Maciej Hehl
Based on other comments the OP has left on some answers, I agree that you have correctly identified a need for information!
leo grrr
+2  A: 

With a good inlining compiler, the cost of calling sqrt() is a single coprocessor instruction: FSQRT. Cycle counts are indeed hard to determine on superscalar architectures.

alxx
+1  A: 

Square root converges quadratically, when it converges at all. The number of iterations depends on the initial guess fed to the Newton-Raphson algorithm that computes it.

Here's a nice page that says it takes four iterations for four-digit accuracy in 32 bits, and five iterations in 64 bits.

This is how it's done for situations where it's not done in hardware.

duffymo
+2  A: 

If you don't need anything really fancy, which you don't, you can just call time() in a loop.

There's actually code for exactly what you want right here! http://beige.ucs.indiana.edu/B673/node104.html

The basic idea is:

#include <math.h>
#include <stdio.h>
#include <time.h>

#define HUGE_N 100000000L /* some huge number, to hopefully drown out statistical noise */

int main(void) {
    time_t t0, t1;
    t0 = time(NULL);
    for (long i = 1; i <= HUGE_N; i++)
        sqrt((double)i);
    t1 = time(NULL);
    printf("%ld calls took about %ld seconds\n", HUGE_N, (long)(t1 - t0));
    return 0;
}
leo grrr
A smart compiler will remove the `sqrt(i)` call altogether; you should do something with the result. See this question asked earlier today: http://stackoverflow.com/questions/3793410/benchmarking-math-h-square-root-and-quake-square-root
st0le
Actually, I just built it with `gcc -O2 -S` and it bottomed out in an fsqrt instruction. Maybe it's not smart enough? Actually, since the compiler doesn't know what happens inside a function (it might have side effects), I don't expect it would remove that call. If I just did some math and then didn't do anything with the result, I'd agree!
leo grrr
+3  A: 

First, you need to clearly define what you mean by "cycle cost". It could be interpreted as either algorithmic cycles or processor cycles. Once you have done that, the next step is determining your approach: theoretical or test.

First, theoretical.

If it is algorithmic cycles, I would make this determination via a code review. The steps would be:

  1. Determine the C library that you are using
  2. Obtain the source code for that library's specific sqrt implementation
  3. Analyze the source code to estimate its complexity (the cycle cost of the algorithm)

But as others have posted, most instruction sets in use today have a sqrt instruction, so I'd expect the C implementation to just invoke that instruction; the algorithmic cycle count should be constant.

If it is processor cycles, you would make this determination in a similar fashion, but some of the details change:

  1. Determine the machine code that will run on the processor by doing steps 1, 2, and 3 above
  2. Learn more about processors than I know
  3. Probably realize this is futile and just test it (as other useful SO answers suggest)

For a test approach you want to:

  1. Determine all the environments you want this information for
    1. Processors
    2. Operating systems
  2. Author a test application
    1. It should perform all the operations you want to test, including anything outside of sqrt itself that you care about (such as casting inputs and outputs)
    2. It should perform a set number of iterations over a long period of time
    3. It should also be able to perform a single call and exit, invoked a set number of times over a long period of time
    4. Really, give it some thought: identify why you want this information and how you will use it, then design a test app that removes as many variables as possible
  3. Run the application
    1. Run in an environment as close as possible to the one you care about
    2. If the end use of this information is performance prediction for users, test in a representative environment (if users may be running 10 different web browser windows while watching a Netflix movie stream, test under those conditions)
  4. Do some sort of analysis on the data set to get a reasonable prediction
PatrickV