I've been running some benchmarks to speed up a pet project I'm writing. It's a very number-crunching-intensive application, so I've been experimenting with NumPy as a way of improving computational performance.
However, the results of the following performance tests were quite surprising...
Test Source Code (Updated with test cases for hoisting and batch submission)
import timeit
numpySetup = """
import numpy
left = numpy.array([1.0,0.0,0.0])
right = numpy.array([0.0,1.0,0.0])
"""
hoistSetup = numpySetup + 'hoist = numpy.cross\n'
pythonSetup = """
left = [1.0,0.0,0.0]
right = [0.0,1.0,0.0]
"""
numpyBatchSetup = """
import numpy
l = numpy.array([1.0,0.0,0.0])
left = numpy.array([l]*10000)
r = numpy.array([0.0,1.0,0.0])
right = numpy.array([r]*10000)
"""
pythonCrossCode = """
x = ((left[1] * right[2]) - (left[2] * right[1]))
y = ((left[2] * right[0]) - (left[0] * right[2]))
z = ((left[0] * right[1]) - (left[1] * right[0]))
"""
pythonCross = timeit.Timer(pythonCrossCode, pythonSetup)
numpyCross = timeit.Timer('numpy.cross(left, right)', numpySetup)
hybridCross = timeit.Timer(pythonCrossCode, numpySetup)
hoistCross = timeit.Timer('hoist(left, right)', hoistSetup)
batchCross = timeit.Timer('numpy.cross(left, right)', numpyBatchSetup)
# timeit returns the total elapsed time in seconds for the given number of runs
print('Python Cross Product : %4.6f ' % pythonCross.timeit(1000000))
print('Numpy Cross Product : %4.6f ' % numpyCross.timeit(1000000))
print('Hybrid Cross Product : %4.6f ' % hybridCross.timeit(1000000))
print('Hoist Cross Product : %4.6f ' % hoistCross.timeit(1000000))
# 100 batches of 10000 each is equivalent to 1000000
print('Batch Cross Product : %4.6f ' % batchCross.timeit(100))
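As a sanity check on the batch case, here is a minimal sketch (separate from the timed code) confirming that numpy.cross applied to two (N, 3) arrays computes one cross product per row, so a single batch call does the same work as N individual calls. The helper name python_cross is hypothetical; it just wraps the arithmetic from pythonCrossCode in a function.

```python
import numpy

def python_cross(left, right):
    # Same arithmetic as pythonCrossCode, wrapped in a (hypothetical) helper.
    x = (left[1] * right[2]) - (left[2] * right[1])
    y = (left[2] * right[0]) - (left[0] * right[2])
    z = (left[0] * right[1]) - (left[1] * right[0])
    return [x, y, z]

n = 10000
left = numpy.array([[1.0, 0.0, 0.0]] * n)   # shape (n, 3)
right = numpy.array([[0.0, 1.0, 0.0]] * n)  # shape (n, 3)

# One call, n row-wise cross products.
batch = numpy.cross(left, right)

# One cross product computed with plain Python floats.
single = python_cross([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])

# Every row of the batch result should equal the single result.
rows_match = bool((batch == numpy.array(single)).all())
print(batch.shape, single, rows_match)
```

If the rows match, the batch timing and the per-call timings are measuring the same amount of arithmetic, only with a different number of calls into NumPy.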
Original Results
Python Cross Product : 0.754945
Numpy Cross Product : 20.752983
Hybrid Cross Product : 4.467417
Final Results
Python Cross Product : 0.894334
Numpy Cross Product : 21.099040
Hybrid Cross Product : 4.467194
Hoist Cross Product : 20.896225
Batch Cross Product : 0.262964
Needless to say, this wasn't the result I expected. The pure Python version runs roughly 25x faster than calling numpy.cross once per vector pair (about 0.9 s versus 21 s for one million cross products). In other tests NumPy has outperformed the Python equivalent, which is what I expected here too.
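To put the numbers on a common footing, this small sketch converts the Final Results above into a cost per cross product. It only does arithmetic on the timings already listed; the variable names are mine.

```python
# Timings in seconds, copied from the Final Results.
final_results = {
    'Python': 0.894334,
    'Numpy': 21.099040,
    'Hybrid': 4.467194,
    'Hoist': 20.896225,
    'Batch': 0.262964,
}

# Every test computes one million cross products in total
# (the batch test does so as 100 calls of 10000 each).
cross_products = 1000000

# Convert total seconds into microseconds per cross product.
per_product_us = {
    name: seconds / cross_products * 1e6
    for name, seconds in final_results.items()
}

# How much slower per-call numpy.cross is than the pure Python arithmetic.
slowdown = final_results['Numpy'] / final_results['Python']

for name in sorted(per_product_us, key=per_product_us.get):
    print('%-6s : %8.3f us per cross product' % (name, per_product_us[name]))
print('Numpy / Python : %.1fx' % slowdown)
```

Seen this way, the batch run is the cheapest per cross product, while the per-call and hoisted NumPy runs are nearly identical to each other.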
So, I've got two related questions:
- Can anyone explain why NumPy is performing so poorly in this case?
- Is there something I can do to fix it?