views:

58

answers:

2

I use numpy for numerical linear algebra. I suspect that I can get much better performance if I make small modifications in how I carry out certain computations so that they are more memory efficient, for example.

I was wondering if there is any form of instrumentation available in python to detect cache and TLB misses. There is a very nice api, PAPI, that I learned about in a recent class but it doesn't have a Python interface:

http://icl.cs.utk.edu/papi/overview/index.html

Also, is there a good way in general to profile numpy or other python numerical code? The timeit module is hard to integrate into code. mpi4py has a nice way to profile using the MPE library. A snippet from demo code (demo/mpe-logging/cpilog.py):

communication   = MPE.newLogState("Comunicate",  "red")
with communication:
    comm.Bcast([n, MPI.INT], root=0)

A log file is created that can be displayed graphically. But this is a bit MPI specific.

Thanks.

+1  A: 

Maybe one of the provided profilers might help you find the hotspots?

see profiling python

These will probably not give enough detail to trigger direct action, but should indicate where to look for improvement and help to determine the point of diminishing returns.

Peter Tillemans
+1  A: 

Robert Kern (one of the NumPy devs) wrote line_profiler for exactly this scenario. It is more suited to profiling NumPy-heavy code than hotspot/cProfile.

dwf
Thanks, that is very helpful.
Paul