views: 165
answers: 2

I've been using Python's built-in cProfile tool with pretty good success. But I'd like to be able to access more information, such as how long I'm waiting on I/O (and what kind of I/O I'm waiting on) or how many cache misses I have. Are there any Linux tools that can help with this, beyond the basic time command?
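For context, cProfile can also be driven programmatically and its output sorted by cumulative time; a minimal sketch (the `slow_sum` workload here is just a made-up stand-in for the real program):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Placeholder workload standing in for the code being profiled.
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and print the top entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
print(stream.getvalue())
```

Note that cProfile only measures CPU-side call times; it won't attribute I/O waits or cache misses, which is what the question is asking about.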

+2  A: 

I'm not sure Python will provide the low-level information you are looking for. You might want to look at oprofile and latencytop, though.

sigjuice
+1  A: 

If you want to know exactly what you are waiting for, and approximately what percentage of the time, this will tell you. It won't tell you other things though, like cache misses or memory leaks.
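The "random pausing" idea can be sketched in pure Python with `sys._current_frames()`. This is a toy illustration I'm adding, not the answerer's exact method; the `busy_work` function is a made-up stand-in for the program being tuned:

```python
import collections
import sys
import threading
import time
import traceback

def busy_work(stop):
    # Stand-in for the code being tuned (a hot loop, or blocking I/O).
    while not stop.is_set():
        sum(i * i for i in range(10_000))

stop = threading.Event()
worker = threading.Thread(target=busy_work, args=(stop,))
worker.start()

# Take periodic stack samples of the worker thread; the lines that keep
# showing up are where the time is going (including time spent blocked).
counts = collections.Counter()
for _ in range(50):
    time.sleep(0.01)
    frame = sys._current_frames().get(worker.ident)
    if frame is not None:
        top = traceback.extract_stack(frame)[-1]
        counts[(top.filename, top.lineno, top.name)] += 1

stop.set()
worker.join()

for (filename, lineno, func), n in counts.most_common(3):
    print(f"{n:3d} samples  {func} ({filename}:{lineno})")
```

A real session would sample the whole stack, not just the innermost line, so you can see which callers are responsible for the time.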

Mike Dunlavey
Interesting approach! Kind of like a do-it-yourself sampling profiler.
Jason Baker
@Jason: Yeah, and in my opinion, better. The proof is in the pudding: http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773 Profilers could do a better job if they not only sampled the stack, but also 1) retained the samples, 2) showed you statements (not functions) sorted by the % of samples containing them (ignoring recursion), and 3) let you explore representative samples rather than just summarizing.
Mike Dunlavey
... I have yet to see a case, story, or scenario where a profiler was used to solve a series of performance problems, resulting in dramatic speedup. People seem happy to mull over numbers, explore graphs, rummage around in a time-consuming function, and find and fix maybe one problem.
Mike Dunlavey
@Mike Dunlavey: So you mean basically operate like every good commercial profiler does. (Quantify and vTune both show you individual statements, and can retain samples if you don't mind eating gobs of memory)
Nick Bastin
@Nick: Look at it from the other end. Suppose you've got an infinite loop, or nearly so. How many random-time stack samples do you need in order to find it? One, right? Because the object is to catch it while it's doing what it shouldn't, and find out what that is. Measuring isn't the object. Suppose it's only taking twice as long as it should; how many samples does that take? Basically, if you see it doing something it doesn't really have to do, on as few as *two* samples, you can fix that for a good speedup. (There is a small category of problems this will not reveal. Most, it does.)
Mike Dunlavey
@Nick: the problem with every good commercial profiler is this: they need to be sold, which means they have a product manager, whose job is to eliminate barriers to sales. That means if there's a potential customer who thinks it should do *foo*, by golly that becomes a requirement. Since programmers in general (not to mention managers) have some silly ideas about performance tuning, the commercial products simply reflect that. Here are some of the silly ideas: http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343
Mike Dunlavey
@Mike: You're massively generalizing over all software processes to point out things that are bad about how some of them are implemented. Not all of them are bad, or think they should change their entire app to get $500 (or even $50k) from a single client. Also, a lot of the "disadvantages" or "problems" listed for gprof arise because people are trying to use it for things it wasn't meant to do. If you ACTUALLY want to PROFILE, the tools out there are very good at what they do; if you didn't really want to profile, that's hardly the tool's problem.
Nick Bastin
@Nick: I did generalize, maybe a bit too much. There's one profiler that looks like it actually grew out of real experience - Zoom. I'm told OProfile can do the job if you have enough experience to know what to have it do and not do. LTProf has the right idea underneath, but doesn't know what to display. I'd love a tool that would assist the manual method (I built one for DOS ages ago), but the manual method, while maybe clumsy-looking, tells you what the problems are before you can finish puzzling over measurement output.
Mike Dunlavey