Hi, I would like to know how to write a profiler. What books and/or articles would you recommend? Can anyone help me, please?

Has anyone already done something like this?

+1  A: 

Why would you like to write a profiler? I'm just curious.

Do you want the experience of learning how it works?

Do you want to make money selling it?

Can you not afford to buy one (just kidding with this one)?

Edit: Sorry if this is unconstructive, but I am actually genuinely curious about why anyone would go through the trouble to reinvent the wheel for the umpteenth time.

Nailer
I wrote a web application and now I have a problem: I have to be able to measure the efficiency of each method.
Agusti-N
NIH is a cruel mistress.
BraveSirFoobar
This is not about reinventing the wheel but about understanding how a wheel works ;)
Agusti-N
I thought with that you're a user, not a programmer
Agusti-N
Some languages don't have profilers invented yet.
jevon
+2  A: 

JVMPI spec: http://java.sun.com/j2se/1.5.0/docs/guide/jvmpi/jvmpi.html

I salute your courage and bravery

EDIT: And as noted by user Boune, JVMTI: http://java.sun.com/developer/technicalArticles/Programming/jvmti/

Yoni Roit
+7  A: 

Encouraging lot, aren't we :)

Profilers aren't too hard if you're just trying to get a reasonable idea of where the program's spending most of its time. If you're bothered about high accuracy and minimum disruption, things get difficult.

So if you just want the answers a profiler would give you, go for one someone else has written. If you're looking for the intellectual challenge, why not have a go at writing one?

I've written a couple, for run time environments that the years have rendered irrelevant.

There are two approaches:

  • adding something to each function or other significant point that logs the time and where it is.

  • having a timer going off regularly and taking a peek where the program currently is.

The JVMPI version seems to be the first kind - the link provided by uzhin shows that it can report on quite a number of things (see section 1.3). What gets executed changes to do this, so the profiling can affect the performance (and if you're profiling what was otherwise a very lightweight but often called function, it can mislead).
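A minimal sketch of the first approach (instrumenting each function), written in Python purely for brevity. It is not JVMPI, and the names `timings`, `instrumented`, `report`, `parse`, and `handle_request` are made up for illustration. Each wrapped function logs how often it was called and how long it took, including time spent in its callees.

```python
import functools
import time
from collections import defaultdict

# Hypothetical bookkeeping: function name -> [call count, total seconds].
timings = defaultdict(lambda: [0, 0.0])

def instrumented(func):
    """Wrap func so that every call records its elapsed wall-clock time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = timings[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start  # inclusive of callees
    return wrapper

# Two toy functions standing in for the methods being measured.
@instrumented
def parse(text):
    return text.split()

@instrumented
def handle_request(text):
    return len(parse(text))

def report():
    """Return (name, calls, total_seconds) rows, slowest first."""
    rows = [(name, calls, total) for name, (calls, total) in timings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Note that the wrapper itself costs time on every call, which is exactly the distortion described above: for a tiny, frequently called function such as `parse`, the bookkeeping can easily outweigh the work being measured.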

If you can get a timer/interrupt telling you where the program counter was at the time of the interrupt, you can use the symbol table/debugging information to work out which function it was in at the time. This provides less information but can be less disruptive. A bit more information can be obtained from walking the call stack to identify callers etc. I've no idea if this is even possible in Java...
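A rough sketch of the second approach (periodic sampling), again in Python rather than Java, so it says nothing about what the JVM allows. It assumes CPython, whose `sys._current_frames()` exposes the current frame of every thread. A background thread wakes up at a fixed interval and records the whole call stack of the profiled thread. The names `sample_stacks`, `profile`, and `busy_work` are invented for this example.

```python
import sys
import threading
import time
import traceback

def sample_stacks(thread_id, interval, stop_event, samples):
    """Every `interval` seconds, record the call stack of one thread."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            stack = traceback.extract_stack(frame)  # outermost frame first
            samples.append(tuple((f.filename, f.lineno, f.name) for f in stack))
        time.sleep(interval)

def profile(func, *args, interval=0.005):
    """Run func(*args) in the calling thread while a sampler thread watches it."""
    samples = []
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval, stop_event, samples),
    )
    sampler.start()
    try:
        func(*args)
    finally:
        stop_event.set()
        sampler.join()
    return samples  # raw stacks, kept unsummarized

def busy_work(duration):
    """Toy workload: spin for roughly `duration` seconds."""
    end = time.perf_counter() + duration
    count = 0
    while time.perf_counter() < end:
        count += 1
    return count
```

Calling `profile(busy_work, 0.3)` returns a list of raw stacks. The profiled code itself is untouched, which is why this style tends to be less disruptive than instrumentation.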

Paul.

Paul
+4  A: 

I would look at those open-source projects first:

Then I would look at JVMTI (not JVMPI)

Boune
+4  A: 
Mike Dunlavey
This sounds like a great approach for finding the bottleneck function in a program. But I don't see it obviating the need for a traditional profiler when you're doing low-level tuning. You do need precise measurements to validate your changes. If `Foo` took 50% before my change and 40% after, have I made a significant improvement or is it sampling error? And sometimes you need to nail down the bottleneck to the level of an if-branch that's triggering far more often than you'd expect. If it's inline code, a callstack won't give you that.
Adrian McCarthy
Mike Dunlavey
@Adrian McCarthy: ... If it's not a good "why", i.e. you can find a way not to spend it, then the % on that line doesn't just drop a little bit, it often disappears, and 1) the entire time decreases by that %, and 2) the distribution of what statements take time shifts. (There's no harm in measuring with a traditional profiler, but I've never needed to, because the overall time drops by the percent that the line or instruction was responsible for.)
Mike Dunlavey
@Adrian McCarthy: Not to belabor, but maybe you can see how precision of measurement never has to enter the process, because if a line of code (or a whole function) is on the stack 50% and is optimized to 40%, then it's easy to see the change because that 10% comes directly off of the overall time, where a simple stopwatch can measure it, without any question of accuracy.
Mike Dunlavey
A: 

As another answer, I just looked at LukeStackwalker on sourceforge. It is a nice, small example of a stack sampler, and a nice place to start if you want to write a profiler.

Here, in my opinion, is what it does right:

  • It samples the entire call stack.

Sigh ... so near yet so far. Here, IMO, is what it (and other stack samplers like xPerf) should do:

  • It should retain the raw stack samples. As it is, it summarizes at the function level as it samples. This loses the key line-number information locating the problematic call sites.

  • It need not take so many samples, if storage to hold them is an issue. Since typical performance problems cost from 10% to 90%, 20-40 samples will show them quite reliably. Hundreds of samples give more measurement precision, but they do not increase the probability of locating the problems.

  • The UI should summarize in terms of statements, not functions. This is easy to do if the raw samples are kept. The key measure to attach to a statement is the fraction of samples containing it. For example:

    5/20 MyFile.cpp:326 for (i = 0; i < strlen(s); ++i)

This says that line 326 in MyFile.cpp showed up on 5 out of 20 samples, in the process of calling strlen. This is very significant, because you can instantly see the problem, and you know how much speedup you can expect from fixing it. If you replace the i < strlen(s) test with s[i], it will no longer be spending time in that call, so these samples will not occur, and the speedup will be approximately 1/(1-5/20) = 20/(20-5) = 4/3, i.e. about a 33% speedup. (Thanks to David Thornley for this sample code.)

  • The UI should have a "butterfly" view showing statements. (If it shows functions too, that's OK, but the statements are what really matter.) For example:

    3/20 MyFile.cpp:502 MyFunction(myArgs)
    2/20 HisFile.cpp:113 MyFunction(hisArgs)

    5/20 MyFile.cpp:326 for (i = 0; i < strlen(s); ++i)

    5/20 strlen.asm:23 ... some assembly code ...

In this example, the line containing the for statement is the "focus of attention". It occurred on 5 samples. The two lines above it say that on 3 of those samples, it was called from MyFile.cpp:502, and on 2 of those samples, it was called from HisFile.cpp:113. The line below it says that on all 5 of those samples, it was in strlen (no surprise there). In general, the focus line will have a tree of "parents" and a tree of "children". If for some reason, the focus line is not something you can fix, you can go up or down. The goal is to find lines that you can fix that are on as many samples as possible.
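The statement-level summary and the butterfly view above can be sketched in a few lines once the raw samples are kept. This is only an illustration in Python. The sample stacks are fabricated to reproduce the 3/20, 2/20 and 5/20 figures from the example, and `main.cpp:10` and `Other.cpp:77` are invented filler frames.

```python
from collections import Counter

# Each raw sample is one call stack, outermost caller first, as "file:line" strings.
SAMPLES = (
    [("main.cpp:10", "MyFile.cpp:502", "MyFile.cpp:326", "strlen.asm:23")] * 3
    + [("main.cpp:10", "HisFile.cpp:113", "MyFile.cpp:326", "strlen.asm:23")] * 2
    + [("main.cpp:10", "Other.cpp:77")] * 15
)

def statement_fractions(samples):
    """For each statement, count the samples that contain it at least once."""
    counts = Counter()
    for stack in samples:
        counts.update(set(stack))  # set(): a recursive stack still counts once
    return counts

def butterfly(samples, focus):
    """Return (callers, hits, callees) for the focus statement."""
    callers, callees = Counter(), Counter()
    hits = 0
    for stack in samples:
        positions = [i for i, frame in enumerate(stack) if frame == focus]
        if not positions:
            continue
        hits += 1
        # Sets again, so each neighbour is counted once per sample.
        callers.update({stack[i - 1] for i in positions if i > 0})
        callees.update({stack[i + 1] for i in positions if i + 1 < len(stack)})
    return callers, hits, callees

def format_line(count, total, frame):
    """Render one row in the 'count/total file:line' style used above."""
    return f"{count}/{total} {frame}"
```

With these samples, `butterfly(SAMPLES, "MyFile.cpp:326")` gives the two call sites as callers, 5 hits for the focus line, and `strlen.asm:23` as the only callee. Moving the focus up or down is just another call with a different `focus` string.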

IMPORTANT: Profiling should not be looked at as something you do once. For example, in the sample above, we got a 4/3 speedup by fixing one line of code. When the process is repeated, other problematic lines of code should show up at 4/3 the frequency they did before, and thus be easier to find. I never hear of people talking about iterating the profiling process, but it is crucial to getting overall large compounded speedups.
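To make the iteration argument concrete, here is a small Python sketch of the arithmetic. It assumes that samples are independent and that the remaining problem's time does not overlap with the time that was removed. Both are simplifications, and the function names are invented for the example.

```python
from math import comb

def rescaled_fraction(fraction, removed):
    """Fraction of samples a line appears on after `removed` of the run time is gone."""
    return fraction / (1.0 - removed)

def prob_seen_at_least(k, n, p):
    """Chance that a line costing fraction p appears on at least k of n samples."""
    miss = sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))
    return 1.0 - miss

# A second problem that cost 10% before the 5/20 fix above.
before = 0.10
after = rescaled_fraction(before, 5 / 20)  # 0.10 / 0.75

print(f"fraction before fix: {before:.3f}, after fix: {after:.3f}")
print(f"P(seen on >= 2 of 20 samples) before: {prob_seen_at_least(2, 20, before):.2f}")
print(f"P(seen on >= 2 of 20 samples) after:  {prob_seen_at_least(2, 20, after):.2f}")
```

The same `prob_seen_at_least` function can be used to check the earlier sample-count claim: small problems are the ones most likely to be missed in a 20-sample run, and they become more visible after each round of fixes.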

P.S. If a statement occurs more than once in a single sample, that means there is recursion taking place. It is not a problem. It still only counts as one sample containing the statement. It is still the case that the cost of the statement is approximated by the fraction of samples containing it.

Mike Dunlavey