All the .NET profilers I know of don't take the effect of the CPU cache into account.

Given that reading a field from the CPU cache can be 100x faster than reading it from main memory, it can be a big factor. (I just had to explain this in an answer.)
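For example, a small micro-benchmark along these lines (purely illustrative, not taken from any profiler, and the exact ratio varies by machine) makes the gap visible: both loops read the same number of ints, but the strided one has to pull far more cache lines from main memory, so on most machines it runs several times slower.

    using System;
    using System.Diagnostics;

    class CacheDemo
    {
        static void Main()
        {
            var data = new int[64 * 1024 * 1024];   // ~256 MB, far bigger than any CPU cache
            const int stride = 16;                  // 16 ints * 4 bytes = one 64-byte cache line

            long sum = 0;
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < data.Length; i++)   // sequential: ~16 reads per cache-line fetch
                sum += data[i];
            Console.WriteLine($"sequential: {sw.ElapsedMilliseconds} ms");

            sw.Restart();
            for (int offset = 0; offset < stride; offset++)
                for (int i = offset; i < data.Length; i += stride)  // strided: each pass re-fetches
                    sum += data[i];                                 // every cache line from memory
            Console.WriteLine($"strided:    {sw.ElapsedMilliseconds} ms  (checksum {sum})");
        }
    }

A per-function profiler will tell you both loops spend their time on the same add instruction; it won't tell you one of them is waiting on memory.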

I have seen too many people spend a long time speeding up loops that a profiler says are slow, when in real life the CPU cache makes them fast.


E.g. I wish to be able to see whether a data access is missing the CPU cache a lot, as well as getting basic profiling results I can trust more.

In the past I have found that making my data more compact so it all fits in the CPU cache, or changing the order the data is accessed in, can have a big effect. E.g.

AccessArrayFromStartAndDoSomething()  
AccessArrayFromEndAndDoSomethingElse()

is better than

AccessArrayFromStartAndDoSomething()  
AccessArrayFromStartAndDoSomethingElse()

if the array will not fit in the CPU cache, but it is very hard to find that type of improvement.
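In C# terms the idea is roughly this (the method names are hypothetical placeholders, and it assumes the array is much larger than the CPU cache): after the forward pass, the end of the array is the only part still resident in cache, so the second pass should start there instead of going back to index 0.

    static double DoSomething(double x) => x * 1.0001;   // placeholder work
    static double DoSomethingElse(double x) => x + 1.0;  // placeholder work

    static void ProcessTwice(double[] data)
    {
        // First pass: front to back.
        for (int i = 0; i < data.Length; i++)
            data[i] = DoSomething(data[i]);

        // Second pass: back to front, so it starts on the elements the first
        // pass touched last, i.e. the only part of a cache-overflowing array
        // that is still resident.
        for (int i = data.Length - 1; i >= 0; i--)
            data[i] = DoSomethingElse(data[i]);
    }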


Spending more CPU cycles to make the data smaller so it fits in the CPU cache better can speed up a lot of systems, but most profilers will point you in the other direction.
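As a sketch of what I mean (the types here are made up for illustration): packing a small counter and a flag into one 16-bit field costs a mask on every read, but shrinks each record so that several times as many fit in every cache line.

    // About 8 bytes per record once the runtime pads it for alignment.
    struct WideRecord
    {
        public int Count;      // in practice never exceeds ~30,000
        public bool IsActive;
    }

    // 2 bytes per record: roughly four times as many fit in each cache line,
    // at the cost of a mask and a compare to unpack the fields.
    struct PackedRecord
    {
        private ushort _packed;   // low 15 bits = Count, top bit = IsActive

        public PackedRecord(int count, bool isActive)
        {
            _packed = (ushort)((count & 0x7FFF) | (isActive ? 0x8000 : 0));
        }

        public int Count => _packed & 0x7FFF;
        public bool IsActive => (_packed & 0x8000) != 0;
    }

A profiler that only counts CPU time sees the extra unpacking instructions and calls the packed version slower, even when the smaller working set makes the whole system faster.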

A: 

I may be misunderstanding your question, but I think the answer is simply to switch your profiler into a high-accuracy, low-detail mode. An example would be using ANTS Performance Profiler's new Sampling Mode:

http://www.simple-talk.com/community/blogs/andrewh/archive/2009/11/13/76420.aspx

Mel Harbour
Thanks, Sampling Mode has not been in most .net profilers until now.
Ian Ringrose
Yeah, see, that's where I would go the other way.
Mike Dunlavey
A: 

I have seen too many people spend a long time speeding up loops that a profiler says are slow, when in real life the cpu cache makes them fast.

Some profilers are really good at nonsense like that.

What's your overall goal? Do you want the computations to complete in less wall-clock time?

If not, ignore this answer.

If so, you need to know what's causing wall-clock time to be spent that you can get rid of.

It's not about accuracy of timing. It's about accuracy of location. I suggest what you really need to know is which lines of code are both 1) responsible for a significant fraction of the time being spent, and 2) could be done better or not at all. That's what you need to know, because if there are no such lines of code, what are you going to optimize?

An excellent way to find such lines of code is any profiler that 1) takes samples of the call stack on wall-clock time (not CPU time), and 2) tells you, for each line of code (not function) that appears on those call stacks, the percentage of stacks it appears on. Your candidate lines for optimization are among the lines with a large percentage. (A couple of non-.net examples: Zoom and LTProf.)
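This isn't any particular profiler's API, just a sketch of the counting step, assuming you have already captured a handful of stack-trace samples as plain text (one frame per line): for each frame, report the fraction of samples it appears on, counting it at most once per sample.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class StackSampleStats
    {
        // samples: one captured stack trace per entry, frames separated by newlines.
        public static Dictionary<string, double> FrameFractions(IReadOnlyList<string> samples)
        {
            var counts = new Dictionary<string, int>();
            foreach (var sample in samples)
            {
                // A HashSet so a frame that shows up twice in one stack
                // (recursion) is still only counted once for that sample.
                var frames = new HashSet<string>(
                    sample.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries)
                          .Select(f => f.Trim()));
                foreach (var frame in frames)
                    counts[frame] = counts.TryGetValue(frame, out var c) ? c + 1 : 1;
            }
            return counts.ToDictionary(kv => kv.Key,
                                       kv => (double)kv.Value / samples.Count);
        }
    }

A line that turns up on, say, 30% of the samples is costing you roughly 30% of wall-clock time, whatever a timing-oriented profiler says about it.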

Frankly, the profiler I use is one you already have. I just pause the program while it's being slow and look at the stack. I don't need a lot of samples. In fact, if there's a line of code I could do without and it appears on as few as two samples, I know it's worth fixing, and the fewer samples it took to get to that point, the bigger it is. Here's a more thorough explanation.

There are almost always multiple "bottlenecks", so I find a big one, fix it, and do it all again. What fixing a bottleneck does to the remaining bottlenecks is make them bigger as a fraction of the now-shorter run time. (For example, if two problems each take 30% of a 10-second run, fixing the first leaves a 7-second run in which the second is now about 43%.) This "magnification effect" lets you keep going until there is simply no more speed to squeeze out.

Mike Dunlavey