I'm adapting an application that makes heavy use of generators to produce its results, so that it can be served through a web.py web interface.

So far, I have been able to wrap the for-loop and its output-producing statements in a function and profile that with cProfile.run() or runctx(). Conceptually:

import cProfile

def output():
    for value in generator():
        print(value)

cProfile.run('output()')

In web.py, I have to wrap it the following way, since I want to immediately produce output from the potentially long-running computation in each iteration step using yield:

class index:
    def GET(self):
        for value in generator():
            yield make_pretty_html(value)

Is there a way to profile all calls to the generator like in the first example when it's used like in the second one?

A: 

Can you just use time.time() to profile the parts you are interested in? Just get the current time and subtract from the last time you made a measurement.
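
For example, a minimal sketch of that approach (the generator body here is just a stand-in for your real one):

import time

def generator():
    # stand-in for the real, long-running generator from the question
    for i in range(3):
        time.sleep(0.1)
        yield i

last = time.time()
for value in generator():
    now = time.time()
    print("got value %r after %.3f seconds" % (value, now - last))
    last = time.time()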

karpathy
I already get the total timing, but "Processing took 5.382 seconds" isn't specific enough to find performance bottlenecks. I use a rather large and branching chain of generators internally, and I intend to store user input and the resulting performance data for later analysis. I have several functions that take an average of 0.000 seconds per call but might be called tens of thousands of times.
Daniel Beck
In that case, you could keep an integer counter for each of those functions and only take a measurement on every 1000th execution (see the sketch below). You could measure chunks of code this way and narrow down the bottlenecks. I realize it may be a bit tedious, depending on the code.
karpathy
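
A rough sketch of that sampling idea, with a hypothetical timed_sometimes() helper (not from the original code):

import time

call_count = 0

def timed_sometimes(func, *args):
    # hypothetical helper: time only every 1000th call to keep overhead low
    global call_count
    call_count += 1
    if call_count % 1000 == 0:
        start = time.time()
        result = func(*args)
        print("%s took %.6f seconds" % (func.__name__, time.time() - start))
        return result
    return func(*args)

total = sum(timed_sometimes(len, [0] * i) for i in range(5000))
print(total)
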
+1

A: 

It sounds like you're trying to profile each call to 'next' on the generator? If so, you could wrap your generator in a profiling generator. Something like this, where the commented-out part would send the results to a log or database.

import time

def iter_profiler(itr):
    itr = iter(itr)
    while True:
        start = time.time()
        try:
            value = next(itr)
        except StopIteration:
            break
        end = time.time()
        # do something with the (end - start) timings here,
        # e.g. send them to a log or database
        yield value

Then instead of instantiating your generator as `generator()` you would use `iter_profiler(generator())`.
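
Applied to the handler from the question, that would look like:

class index:
    def GET(self):
        for value in iter_profiler(generator()):
            yield make_pretty_html(value)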

wybiral
Alternatively, you could apply a modified version of it as a decorator at the definition of the generator, as sketched below.
aaronasterling
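
A sketch of that decorator variant (profile_iterations is a made-up name; it reuses iter_profiler from the answer above):

import functools

def profile_iterations(gen_func):
    # wrap the generator function so every generator it creates is profiled
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        return iter_profiler(gen_func(*args, **kwargs))
    return wrapper

@profile_iterations
def generator():
    for i in range(10):
        yield i * i
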
Basically, you're right. I already count the milliseconds before and after the `for` loop in my original post's second example, so I have this information (I measure slightly more than you would, but a few milliseconds don't matter). However, I care more about the "hot spots" in my code (where does computing time go to die?) than about exactly which of the dozens of results took longer than the others. Therefore I was hoping for a `profile`/`cProfile` based solution that does not generate one report per result. If there's a way to merge profiler reports, that would be the way to go.
Daniel Beck
A: 

I finally found a solution. The key is that `cProfile.Profile.runcall()` passes through the return value of the profiled call, so each call to `next()` can be profiled individually:

import cProfile
import math

def gen():
    for i in range(1, 10):
        yield math.factorial(i)

class index(object):
    def GET(self):
        p = cProfile.Profile()

        it = gen()
        while True:
            try:
                # runcall() profiles the call and passes through its return value
                nxt = p.runcall(next, it)
            except StopIteration:
                break
            print(nxt)

        p.print_stats()

index().GET()

I can also merge multiple such profiling results (once I start dumping them under unique file names) via `pstats.Stats.add()`, and store/analyze the combined data.
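
For example, a sketch of that merging step with pstats (the file names are made up):

import cProfile
import pstats

# profile two runs and dump each to its own uniquely named file
for name in ('run-1.prof', 'run-2.prof'):
    p = cProfile.Profile()
    p.runcall(sum, range(100000))
    p.dump_stats(name)

# merge the dumps into one combined report
stats = pstats.Stats('run-1.prof')
stats.add('run-2.prof')
stats.sort_stats('cumulative').print_stats()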

Daniel Beck