views:

279

answers:

2

I run the Visual Studio 2008 profiler on a "RelDebug" build of my app. Optimizations are on, but inlining is only moderate, stack frames are present, and symbols are emitted. In other words, RelDebug is a somewhat optimized build that can be debugged (although the usual Release caveats about inspecting variables applies).

I run both the Sampling, and the Instrumented profiler on separate runs.

Result? The Sampling profiler produces a result that looks reasonable. However when I look at the Instrumented profiler results, I see functions that should not even be near the top of the list, coming out up to.

For example, a function like "SetFont" that consists of only 1 line assigning the height to a class member. Or "SetClipRect" that merely assigns a rectangle.

Of course I am looking at "Exclusive" stats (i.e. minus children).

This happen to anyone else? It always seems to happen once my application has grown to a certain size. It makes the instrumented profiler useless at that point.

+1  A: 

You say "of course you are looking at Exclusive". Look at inclusive stats. In all but the simplest programs or algorithms, nearly all the time is spent in subroutines and functions, so if you've got a performance problem, it most likely consists of calls you didn't know were time-hogs.

The method I rely on is this. Assuming you are trying to find out what you could fix to make the code faster, it will find it, while not wasting your time with high-precision statistics about things that are not problems.

Mike Dunlavey
What I am saying is that functions that should under no circumstances be coming to the top, appear at the top. It seems like a bug in the Instrumented profiler. The Sampling profiler produces more accurate results that make sense. I've already optimized most of what can be optimized without making algorithmic changes (my app flies). I want to use the instrumenting profiler to figure out what is causing my app to slow down when using OpenMP and a debugger (without a debugger attached, the app does not slow down), but the incorrect result from the Instrumented profiler are getting in the way.
TheVinn
@TheVinn: If you think there's not much left to gain, maybe there's not, and maybe there is: http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773IMHO 1) it doesn't matter if your "profiling" technique slows down the app, as long as it accurately points out the statements that cost you the most, because when you fix them and stop diagnosing the speed will go back up, and 2) instrumentation can only tell you, at best, what % of time each routine is on the stack (and that is its exact cost). Sampling the stack ...
Mike Dunlavey
... tells you that at the statement or even instruction level so it is more informative, and it even makes it easier to figure out _why_ the statement is being called, which is how you know if you can replace it.
Mike Dunlavey
... Sorry to blather on. Accuracy of timing isn't important either, because what you really need to know is which statements are costing you and are they costing you enough to bother fixing. So whether it's 10% or 99.9% you still want to fix it. What you really need to know is what "it" is, and stack samples tell you that.
Mike Dunlavey
A: 

There's no bug. Sampling cannot tell you how much time you spent per call. Profiler is just counting how many times timer ended up in that specific function. Since SetFont is not frequently called, you don't get many hits in that function and you get impression that that function is not time consuming.

On the other hand, when you run instrumentation, profiler counts every call and measures execution time of every function. That is why you get accurate information about functions CPU consumption.

When examining instrumentation results you must always look at number of calls as well. Since SetFont is more-less API it doesn't matter if it's exclusive or inclusive. The only thing that matters is its overall time and how frequently it's called.

Aleksandar
Actually SetFont is my function and it just copies a small lightweight structure. It is not getting called very many times at all. And its not always the same function, often the routine changes. Its never a routine that makes sense (i.e. one of my workhorse functions) its always something almost unrelated to the performance profile.
TheVinn
Here's an example where none of this is any help. Suppose you have Main calling B, calling C, calling ... calling Z, which actually does some useful work, which happens to consist of file I/O so it's cpu time is negligible. The exclusive cpu time in all routines is negligible. Now, suppose where B calls C, it just happens to call it twice, once in a necessary way, and once in an unnecessary way. Wall-clock time doubles. The system is so big you have no idea how many times each function should be called, or how long they should take. The PC-sampler is useless.
Mike Dunlavey
... Instrumentation is useless. The call graph says nothing different after than it did before. However, if you just pause it a few times and each time examine the stack, half the time you will see the call (not the function, the _call_) to C that is not necessary. That's how you will know what to fix and about how much it will save. This may seem contrived, but in my long experience it is representative.
Mike Dunlavey