Yes, it's slow. As for why in detail, someone who feels more confident can try to explain.
Want to speed it up? See: http://martin.ankerl.com/2007/10/04/optimized-pow-approximation-for-java-and-c-c/
If your code involves some heavy number-crunching, I wouldn't be too surprised that std::pow is consuming 5% of the running time. Many numeric operations are very fast, so a slightly slower operation like std::pow will appear to take more time relative to the other already-fast operations. (That would also account for why you didn't see much improvement switching to std::powf.)
The cache misses are somewhat more puzzling, and it's hard to offer an explanation without more data. One possibility: if your other code is so memory-intensive that it fills the whole cache, it wouldn't be surprising for std::pow to take the hit on the cache misses.
If you replace std::pow(var) with another function, like std::max(var, var), does it still take up 5%? Do you still get all the cache misses?
I'm guessing no on time and yes on cache misses. Calculating powers is slower than many other operations (which are you using?). Calling out to code that's not in the cache will cause a cache miss no matter which function it is.
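One way to run that swap experiment outside the real program is a tiny timing harness like the sketch below. Everything in it is hypothetical: the names (Timing, time_pass, make_inputs, time_pow, time_max), the exponent 2.4 and the threshold 1.5 are placeholders, since the question doesn't show the actual call. It only measures time. For cache misses you would still need to run it under a profiler.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of one timed pass: elapsed seconds plus the accumulated sum.
// Returning the sum keeps the compiler from discarding the loop.
struct Timing {
    double seconds;
    double checksum;
};

// Apply `op` to every element of `xs` and time the whole pass.
template <typename Op>
Timing time_pass(const std::vector<double>& xs, Op op) {
    const auto start = std::chrono::steady_clock::now();
    double acc = 0.0;
    for (double x : xs) {
        acc += op(x);
    }
    const auto stop = std::chrono::steady_clock::now();
    return {std::chrono::duration<double>(stop - start).count(), acc};
}

// Hypothetical stand-in for the real input data: n values in [1, 2).
std::vector<double> make_inputs(std::size_t n) {
    std::vector<double> xs(n);
    for (std::size_t i = 0; i < n; ++i) {
        xs[i] = 1.0 + static_cast<double>(i) / static_cast<double>(n);
    }
    return xs;
}

// Same loop, same data; only the per-element function differs.
Timing time_pow(const std::vector<double>& xs) {
    return time_pass(xs, [](double x) { return std::pow(x, 2.4); });
}

Timing time_max(const std::vector<double>& xs) {
    return time_pass(xs, [](double x) { return std::max(x, 1.5); });
}
```

Call make_inputs with a size similar to your real data, then compare time_pow(xs).seconds with time_max(xs).seconds over several runs. If the gap is small, the 5% is probably not about pow itself.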
Can you give more information on the 'x' as well as the environment where pow is evaluated?
What you are seeing might be the hardware prefetchers at work. Depending on the profiler, the attribution of 'cost' to individual assembly instructions can be incorrect, and that kind of misattribution tends to be even more frequent around long-latency instructions like the ones needed to evaluate pow.
Added to that, I would use a real profiler like VTune/PTU rather than the one available in any Visual Studio version.