I'd like to improve the performance of convolution in Python, and would appreciate some insight on how best to go about it.
I am currently using SciPy to perform the convolution, with code much like the snippet below:
import numpy
import scipy.signal
import timeit

# 1000x1000 test image and a 3x3 Laplacian kernel
# (note: reshape returns a new array, so its result must be assigned)
a = numpy.arange(1000000).reshape(1000, 1000)
filt = numpy.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])

def convolve():
    scipy.signal.convolve2d(a, filt, mode="same")

# average time over 10 passes
t = timeit.Timer("convolve()", "from __main__ import convolve")
print("%.2f sec/pass" % (t.timeit(number=10) / 10))
I am processing grayscale image data (integer values between 0 and 255), and each convolution currently takes about a quarter of a second. My thinking was to do one of the following:
- Use corepy, preferably with some optimizations.
- Recompile NumPy with icc & MKL.
- Use python-cuda.
I was wondering if anyone had experience with any of these approaches (what sort of gain would be typical, and whether it is worth the time), or if anyone knows of a better library for performing convolution with NumPy.
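To make clear what kind of drop-in replacement I mean: scipy.ndimage also provides a 2-D convolution, which I understand is geared toward image filtering. I haven't timed it myself, so the snippet below is just a sketch, not a known win:

import numpy
import scipy.ndimage

a = numpy.arange(1000000).reshape(1000, 1000)
filt = numpy.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])

# zero-padded boundaries, to match convolve2d's default
# boundary="fill", fillvalue=0 behavior
result = scipy.ndimage.convolve(a, filt, mode="constant", cval=0)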
Thanks!
EDIT:
I got a speedup of about 10x by rewriting the Python loop in C, rather than using NumPy.
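For anyone curious how the C rewrite plugs into Python, here is a minimal sketch of the calling side using ctypes. The library name (libconv.so) and function name (convolve3x3) are placeholders, not my actual code, which isn't shown here:

# Hypothetical C side, built with: cc -O3 -shared -fPIC conv.c -o libconv.so
#   void convolve3x3(const double *src, double *dst, int h, int w);
import ctypes
import numpy
import numpy.ctypeslib

lib = ctypes.CDLL("./libconv.so")  # placeholder library name
lib.convolve3x3.argtypes = [
    numpy.ctypeslib.ndpointer(dtype=numpy.float64, flags="C_CONTIGUOUS"),
    numpy.ctypeslib.ndpointer(dtype=numpy.float64, flags="C_CONTIGUOUS"),
    ctypes.c_int,
    ctypes.c_int,
]
lib.convolve3x3.restype = None

src = numpy.arange(1000000).reshape(1000, 1000).astype(numpy.float64)
dst = numpy.empty_like(src)
lib.convolve3x3(src, dst, int(src.shape[0]), int(src.shape[1]))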