views:

121

answers:

2

I've been using cProfile to profile my code, and it's been working great. I also use gprof2dot.py to visualize the results (makes it a little clearer).

However, cProfile (and most other python profilers I've seen so far) seem to only profile at the function-call level. This causes confusion when certain functions are called from different places - I have no idea if call #1 or call #2 is taking up the majority of the time. This gets even worse when the function in question is 6 levels deep, called from 7 other places.

So my question is: how do I get a line-by-line profiling? Instead of this:

function #12, total time: 2.0s

I'd like to see something like this:

function #12 (called from somefile.py:102) 0.5s
function #12 (called from main.py:12) 1.5s

cProfile does show how much of the total time "transfers" to the parent, but again this connection is lost when you have a bunch of layers and interconnected calls.

Ideally, I'd love to have a GUI that would parse through the data, then show me my source file with a total time given to each line. Something like this:

main.py:

a = 1 # 0.0s
result = func(a) # 0.4s
c = 1000 # 0.0s
result = func(c) # 5.0s

Then I'd be able to click on the second "func(c)" call to see what's taking up time in that call, separate from the "func(a)" call.

Does that make sense? Is there any profiling library that collects this type of info? Is there some awesome tool I've missed? Any feedback is appreciated. Thanks!!

+5  A: 

I believe that's what Robert Kern's line_profiler is intended for. From the link:

File: pystone.py
Function: Proc2 at line 149
Total time: 0.606656 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   149                                           @profile
   150                                           def Proc2(IntParIO):
   151     50000        82003      1.6     13.5      IntLoc = IntParIO + 10
   152     50000        63162      1.3     10.4      while 1:
   153     50000        69065      1.4     11.4          if Char1Glob == 'A':
   154     50000        66354      1.3     10.9              IntLoc = IntLoc - 1
   155     50000        67263      1.3     11.1              IntParIO = IntLoc - IntGlob
   156     50000        65494      1.3     10.8              EnumLoc = Ident1
   157     50000        68001      1.4     11.2          if EnumLoc == Ident1:
   158     50000        63739      1.3     10.5              break
   159     50000        61575      1.2     10.1      return IntParIO

Hope that helps!

Joe Kington
Joe, this is *exactly* what I was looking for. I can just use a decorator, attach a LineProfiler() object to a few functions, and it'll spit out a line-by-line profile of the function. I really wish there was a graphical way to see the results, but this is a great start! Thanks!
Rocketmonkeys
+2  A: 

If you had an infinite loop, would you need a tool to find it?

You could just hit Ctrl-C and look at the stack trace. The problem is somewhere on the stack. Since the loop is infinite, the chance is 100% that you will catch it in the act.

Suppose the loop isn't infinite. Suppose it terminates, but it makes the program take 100 times longer than it should. Now how would you find it? The same way, right? When you hit Ctrl-C, the chance is 99/100 that you will catch it in the act.

Suppose the problem is only wasting 20% of the program's time. If you hit Ctrl-C, the chance is 20% that you will catch it in the act. In fact, if you hit Ctrl-C 10 or 20 times, you will catch it about that percent of the time.

You want line-level info? It's right there, on the stackshots. You want to know who's calling whom? Right there. You want to know why it's being done? Just read the stack.

So all you have to do is interrupt it a bunch of times. Anything that shows up on more than one stackshot, that you can get rid of or improve, will give you a nice speedup, guaranteed.

If you're worried about accuracy of measuring, that's a different issue. In the meantime, you found the problems, fixed them, and enjoyed the speedup. You can do this over and over, until you can't anymore, and then your program will be almost as fast as possible.

More to say on this.

Mike Dunlavey
My question is about profiling, not debugging. I haven't had any infinite loops I need to find. My goal is to identify performance slow-spots in fine detail, so I can see if there is low-hanging fruit to optimize. Breaking into a program repeatedly would be a very slow way to do this :)
Rocketmonkeys
@Rocketmonkeys: I assumed your concern, ultimately, was about finding "bottlenecks" so the code would run in less time. If not, sorry. As for the technique being very slow, your keyword is "would". Sometimes debating is preferred to just hitting the key.
Mike Dunlavey
@Rocketmonkeys: e.g. 10 samples of @Joe's code would show about 3 in lines 156-158. That draws your attention to those lines. Obviously line 157 does nothing, so can be deleted, giving about 10% speedup.
Mike Dunlavey