views:

552

answers:

7

While working on a university project, I used a project-internal profiler written by an older student. It was very basic, but good enough, since all it did was measure the time between two points in the code and report statistics.

Now, how does a professional profiler work? Does it preprocess the code to insert checkpoints, or something like that? Does it read the binary together with the debug data to catch where a function is called?

Thanks.

+2  A: 

It depends on the type of code being analyzed; for example, the .NET CLR provides a facility for code profilers. When dealing with managed code, it is possible to rewrite the intermediate code to inject custom hooks. You can also analyze the application's stack traces. The operating system can provide means for profiling; for example, Windows has performance counters. When dealing with embedded code, you can emulate or substitute the underlying hardware to monitor system performance effectively.
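As a rough illustration of the performance-counter point, here is a minimal sketch in C using the Windows PDH API (the counter path is just an example, error handling is omitted, and you need to link against pdh.lib):

    /* Sketch: reading a Windows performance counter via PDH. */
    #include <windows.h>
    #include <pdh.h>
    #include <stdio.h>

    int main(void)
    {
        PDH_HQUERY query;
        PDH_HCOUNTER counter;
        PDH_FMT_COUNTERVALUE value;

        /* Open a query and add the system-wide CPU usage counter. */
        PdhOpenQuery(NULL, 0, &query);
        PdhAddCounter(query, TEXT("\\Processor(_Total)\\% Processor Time"), 0, &counter);

        /* Rate counters need two samples, so collect twice with a pause. */
        PdhCollectQueryData(query);
        Sleep(1000);
        PdhCollectQueryData(query);

        PdhGetFormattedCounterValue(counter, PDH_FMT_DOUBLE, NULL, &value);
        printf("CPU usage: %.1f%%\n", value.doubleValue);

        PdhCloseQuery(query);
        return 0;
    }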

aku
What do you mean by "managed code"?
tunnuz
http://en.wikipedia.org/wiki/Managed_code
aku
+5  A: 

There are two common profiling strategies (for VM-based languages anyway): instrumentation and sampling.

Instrumentation inserts checkpoints and informs the profiler every time a method starts and finishes. This can be done by the JIT/interpreter, or by a post-normal-compile but pre-execution phase which just changes the executable. This can have a very significant effect on performance (thus skewing any timing results). It's good for getting accurate counts, though.
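Conceptually, the injected checkpoints boil down to something like the sketch below (shown in C for concreteness; the profiler_enter/profiler_exit names and the per-function record are hypothetical, and a real instrumenting profiler injects these calls itself rather than having you write them):

    /* Sketch of the enter/exit hooks an instrumenting profiler injects. */
    #include <stdio.h>
    #include <time.h>

    typedef struct {
        const char *name;
        long        calls;
        double      total_seconds;
    } func_stats;

    static double now(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec + ts.tv_nsec / 1e9;
    }

    static void profiler_enter(func_stats *f, double *start)
    {
        f->calls++;            /* this is where the accurate call counts come from */
        *start = now();
    }

    static void profiler_exit(func_stats *f, double start)
    {
        f->total_seconds += now() - start;
    }

    static func_stats work_stats = { "do_work", 0, 0.0 };

    static void do_work(void)
    {
        double t;
        profiler_enter(&work_stats, &t);   /* injected at method entry */
        for (volatile int i = 0; i < 1000000; i++) { }
        profiler_exit(&work_stats, t);     /* injected at method exit */
    }

    int main(void)
    {
        for (int i = 0; i < 100; i++)
            do_work();
        printf("%s: %ld calls, %.3f s total\n",
               work_stats.name, work_stats.calls, work_stats.total_seconds);
        return 0;
    }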

Sampling asks the VM periodically what the stack trace looks like for all threads, and updates its statistics that way. This typically affects performance less, but produces less accurate call counts.

Jon Skeet
IMO, the best method is to capture a smaller # of stack traces. Then, for each stmt/instr on them report the % of samples containing it. The best points to examine are in that list, even if the time estimates are coarse. This is more useful than function timing.
Mike Dunlavey
+1  A: 

See the Wikipedia article on performance analysis.

Jared
A: 

For gprof on *nix: compiling and linking with the -pg option injects some extra code into the object code. When the program runs, the injected code writes profiling data to a file (gmon.out), and running gprof on that file generates the report.
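For example, the typical workflow looks like this (a minimal sketch; the function is made up just to burn CPU time):

    /* example.c -- minimal gprof example.
     *
     *   gcc -pg example.c -o example   # -pg at compile *and* link time
     *   ./example                      # running it writes gmon.out
     *   gprof example gmon.out         # turn gmon.out into a report
     */
    #include <stdio.h>

    static double burn(long n)
    {
        double s = 0.0;
        for (long i = 0; i < n; i++)
            s += (double)i / (i + 1);
        return s;
    }

    int main(void)
    {
        printf("%f\n", burn(50 * 1000 * 1000L));
        return 0;
    }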

jscoot
+12  A: 

There are lots of different profilers which work in different ways.

Commonly used profilers simply examine the running program regularly to see which assembly instruction is currently being executed (the program counter) and which routines called the current function (the call stack). This kind of sampling profiler can work with standard binaries, but is more useful if you have debugging symbols to map addresses in the program back to lines of code.
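A toy version of that idea on a POSIX system is sketched below (assumptions: glibc's <execinfo.h> is available, and backtrace is called from a signal handler, which is common in practice but not strictly guaranteed to be async-signal-safe; a real profiler would bucket the sampled addresses and resolve them against debug symbols afterwards):

    /* Sketch of a sampling profiler: a CPU-time interval timer delivers
       SIGPROF periodically, and the handler records the current call stack.
       Build with: gcc -g -rdynamic sampler.c -o sampler                     */
    #include <execinfo.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    static void on_sample(int sig)
    {
        void *frames[32];
        int n = backtrace(frames, 32);
        backtrace_symbols_fd(frames, n, STDERR_FILENO);  /* just dump the stack */
        (void)sig;
    }

    static double busy(long n)
    {
        double s = 0;
        for (long i = 1; i < n; i++)
            s += 1.0 / i;
        return s;
    }

    int main(void)
    {
        struct sigaction sa;
        sa.sa_handler = on_sample;
        sa.sa_flags = SA_RESTART;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGPROF, &sa, NULL);

        /* Fire SIGPROF every 10 ms of CPU time consumed by this process. */
        struct itimerval it = { { 0, 10000 }, { 0, 10000 } };
        setitimer(ITIMER_PROF, &it, NULL);

        printf("%f\n", busy(200 * 1000 * 1000L));
        return 0;
    }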

As well as sampling regularly, you can also use processor performance counters to sample after a certain number of events such as cache misses, which will help you see which parts of your program are slowing down due to memory accesses.

Other profilers involve recompiling the program to insert instructions (known as instrumentation) to count how often each straight-line sequence of instructions (basic block) is executed, or even to record the sequence in which basic blocks are executed, or to record the contents of variables at certain places.
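GCC can do a crude, function-level version of this with -finstrument-functions, which makes the compiler insert calls to two user-supplied hooks at the entry and exit of every function (a sketch; real tools count basic blocks and resolve the addresses to names):

    /* instrument.c -- sketch of compiler-inserted instrumentation.
     * Build: gcc -finstrument-functions instrument.c -o instrument
     */
    #include <stdio.h>

    /* The attribute keeps the hooks themselves from being instrumented. */
    __attribute__((no_instrument_function))
    void __cyg_profile_func_enter(void *fn, void *call_site)
    {
        fprintf(stderr, "enter %p (called from %p)\n", fn, call_site);
    }

    __attribute__((no_instrument_function))
    void __cyg_profile_func_exit(void *fn, void *call_site)
    {
        (void)call_site;
        fprintf(stderr, "exit  %p\n", fn);
    }

    static int square(int x) { return x * x; }

    int main(void)
    {
        printf("%d\n", square(7));
        return 0;
    }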

The instrumentation approach can give you all the precision and data you might want, but it will slow the program down, and that will change its performance characteristics. By contrast, with sampling-based approaches you can trade off the performance impact, the length of time you need to run the program, and the accuracy of the profile data you obtain.

Dickon Reed
+1  A: 

As Jon Skeet wrote above, there are two strategies: instrumentation and sampling.

Instrumentation can be done both manually and automatically. In the manual case, the developer inserts code to track the start and end of a region of interest, for example a simple "StartTimer" and "EndTimer" pair (a sketch follows below). Some profiler tools can also do this automatically; for that, the profiler needs to do a static analysis of the code, i.e. it parses the code and identifies important checkpoints such as the start or end of particular methods. This is easiest with languages that support reflection (e.g. any .NET language), since reflection lets the profiler rebuild the entire source code tree (along with call graphs).
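A minimal sketch of the manual pattern, in C for illustration (the StartTimer/EndTimer names mirror the ones above; a real implementation would accumulate the results somewhere rather than print them):

    /* Sketch of manual instrumentation around a region of interest. */
    #include <stdio.h>
    #include <time.h>

    static struct timespec timer_start;

    static void StartTimer(void)
    {
        clock_gettime(CLOCK_MONOTONIC, &timer_start);
    }

    static void EndTimer(const char *label)
    {
        struct timespec end;
        clock_gettime(CLOCK_MONOTONIC, &end);
        double ms = (end.tv_sec - timer_start.tv_sec) * 1e3
                  + (end.tv_nsec - timer_start.tv_nsec) / 1e6;
        fprintf(stderr, "%s took %.3f ms\n", label, ms);
    }

    int main(void)
    {
        StartTimer();
        /* ... region of code being investigated ... */
        for (volatile int i = 0; i < 10 * 1000 * 1000; i++) { }
        EndTimer("hot loop");
        return 0;
    }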

Sampling is done by the profiler itself, which looks into the running binary. The profiler can also use techniques like hooks, or trap Windows events/messages, for the purpose of profiling.

Both instrumentation and sampling have their own overheads. The amount of overhead varies; e.g. if the sampling frequency is set too high, the profiling itself can contribute significantly to the performance being reported.

Instrumentation vs. sampling: it is not that one approach is better than the other. Both have their place.

The best approach is to start with a sampling-based profiler and look at the whole system. That is, run the sampler and look at system-wide resource usage: memory, hard disk, network, CPU.

From that, identify the resources that are getting choked.

With that information, you can add instrumentation to your code to pinpoint the culprit. For example, if memory is the most heavily used resource, it helps to instrument your memory-allocation code (a sketch follows below). Note that with instrumentation you are deliberately concentrating on a particular area of your code.
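For instance, a counting wrapper is often enough to start with (a sketch; the counted_malloc name is made up, and a real tool would also track frees, call sites, peak usage, and so on):

    /* Sketch of instrumenting memory allocation with a counting wrapper. */
    #include <stdio.h>
    #include <stdlib.h>

    static size_t alloc_calls;
    static size_t alloc_bytes;

    /* Call this instead of malloc in the code under investigation,
       or hide it behind a macro. */
    static void *counted_malloc(size_t size)
    {
        alloc_calls++;
        alloc_bytes += size;
        return malloc(size);
    }

    static void report_allocations(void)
    {
        fprintf(stderr, "%zu allocations, %zu bytes requested\n",
                alloc_calls, alloc_bytes);
    }

    int main(void)
    {
        for (int i = 0; i < 1000; i++)
            free(counted_malloc(64));
        report_allocations();
        return 0;
    }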

Sesh
A: 

Can anyone recommend a good sampling-based profiler? I'm tasked with finding the reasons behind many-minute delays when a user clicks button X in an old legacy program at my office. It's a big honking thing that can't easily be replaced, so we have to see what we can do to improve its performance. Ideally I want something that can run over a series of months as users use the program and collect very detailed usage data. Commercial is fine; cost is not the main criterion as long as it functions well.