The following program loads two images with PyGame, converts them to Numpy arrays, and then performs some other Numpy operations (such as FFT) to emit a final result (of a few numbers). The inputs can be large, but at any moment only one or two large objects should be live.

A test image is about 10 million pixels, which translates to about 10 MB once it's greyscaled. It gets converted to a Numpy array of dtype uint8 which, after some processing (applying Hamming windows), becomes an array of dtype float64. Two images are loaded into arrays this way; the later FFT steps produce an array of dtype complex128. Prior to adding the excessive gc.collect calls, the program's memory size tended to increase with each step. Additionally, it seems most Numpy operations give their result in the highest precision available.
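
The promotion is easy to see with a small throwaway check (dummy sizes here, not the real images):

import numpy

a = numpy.zeros((1000, 1000), dtype=numpy.uint8)  # stand-in for the greyscaled image
hw = numpy.hamming(a.shape[1])                    # numpy.hamming returns float64
b = a * hw                                        # uint8 * float64 is promoted to float64
c = numpy.fft.fft2(b)                             # real float64 input gives complex128 output
print a.dtype, b.dtype, c.dtype                   # uint8 float64 complex128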

Running the test (sans the gc.collect calls) on my 1 GB Linux machine results in prolonged thrashing, which I have not waited out. I don't yet have detailed memory-use stats -- I tried some Python modules and the time command to no avail; now I'm looking into valgrind. Watching ps (and coping with the machine's unresponsiveness in the later stages of the test) suggests a maximum memory usage of about 800 MB.
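
(For the record, one standard-library option I may fall back on for a rough peak figure is the resource module -- an untested sketch; on Linux, ru_maxrss is reported in kilobytes.)

import resource

def report_peak_memory(label):
    # peak resident set size so far, in kilobytes on Linux
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print label, 'peak RSS: %.1f MB' % (peak_kb / 1024.0)

Calling something like that from check() would at least show how the peak grows between steps.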

A 10-million-cell array of complex128 should occupy 160 MB. Having (ideally) at most two of these live at one time, plus the not-insubstantial Python and Numpy libraries and other paraphernalia, probably means allowing for 500 MB.
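
The arithmetic is easy to confirm with nbytes on a throwaway array of the same size:

import numpy

a = numpy.zeros(10 * 1000 * 1000, dtype=numpy.complex128)
print a.nbytes    # 160000000 bytes: 16 bytes per complex128 cell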

I can think of two angles from which to attack the problem:

  • Discarding intermediate arrays as soon as possible. That's what the gc.collect calls are for -- they seem to have improved the situation, as the test now completes with only a few minutes of thrashing ;-). I think one can expect that memory-intensive programming in a language like Python will require some manual intervention. (A sketch combining this idea with the next one follows the list.)

  • Using less-precise Numpy arrays at each step. Unfortunately, the operations that return new arrays, like fft2, don't appear to let the output dtype be specified.
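
To illustrate both ideas on the windowing step, here is an untested sketch (get_image_data_f32 is just a hypothetical name) that works on a float32 copy and applies the Hamming windows in place, avoiding the transposed temporaries and halving the size of everything up to the FFT:

import numpy
import pygame

def get_image_data_f32(filename):
    # hypothetical lower-precision variant of get_image_data (full listing below)
    im = pygame.image.load(filename)
    im2 = im.convert(8)
    a = pygame.surfarray.array2d(im2).astype(numpy.float32)  # 4 bytes per cell instead of 8
    hw1 = numpy.hamming(a.shape[0]).astype(numpy.float32)
    hw2 = numpy.hamming(a.shape[1]).astype(numpy.float32)
    a *= hw1[:, numpy.newaxis]  # in place: window the first axis, no transposed temporary
    a *= hw2[numpy.newaxis, :]  # in place: window the second axis
    return a

As far as I can tell, though, numpy.fft.fft2 computes in double precision whatever the input dtype, so the FFT outputs would still be complex128 -- which is exactly the limitation behind my main question below.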

So my main question is: is there a way of specifying output precision in Numpy array operations?

More generally, are there other common memory-conserving techniques when using Numpy?

Additionally, does Numpy have a more idiomatic way of freeing array memory? (I imagine this would leave the array object live in Python, but in an unusable state.) Explicit deletion followed by immediate GC feels hacky.

import sys
import numpy
import pygame
import gc


def get_image_data(filename):
    # load the image, convert it to an 8-bit surface (the "greyscaled" form
    # described above), and apply a 2D Hamming window along both axes
    im = pygame.image.load(filename)
    im2 = im.convert(8)
    a = pygame.surfarray.array2d(im2)
    hw1 = numpy.hamming(a.shape[0])
    hw2 = numpy.hamming(a.shape[1])
    a = a.transpose()
    a = a*hw1          # each multiplication allocates a new float64 array
    a = a.transpose()
    a = a*hw2
    return a


def check():
    # force a collection between steps so intermediate arrays are freed promptly
    gc.collect()
    print 'check'


def main(args):
    pygame.init()

    pygame.surfarray.use_arraytype('numpy')  # have surfarray return Numpy arrays

    filename1 = args[1]
    filename2 = args[2]
    im1 = get_image_data(filename1)
    im2 = get_image_data(filename2)
    check()
    out1 = numpy.fft.fft2(im1)  # complex128 result, regardless of im1's dtype
    del im1                     # drop the windowed input as soon as it is no longer needed
    check()
    out2 = numpy.fft.fft2(im2)
    del im2
    check()
    out3 = out1.conjugate() * out2
    del out1, out2
    check()
    correl = numpy.fft.ifft2(out3)
    del out3
    check()
    # locate the correlation peak; unravel_index turns the flat argmax index
    # back into (row, column) coordinates for any array shape
    maxpt = numpy.unravel_index(correl.argmax(), correl.shape)
    print correl[maxpt], maxpt, (correl.shape[0] - maxpt[0], correl.shape[1] - maxpt[1])


if __name__ == '__main__':
    args = sys.argv
    sys.exit(main(args))
+1  A: 

If I understand correctly, you are calculating a convolution between two images. The Scipy package contains a dedicated module for that (ndimage), which might be more memory-efficient than the "manual" approach via Fourier transforms. It would be worth trying it instead of going through Numpy.
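
For example (a rough sketch with placeholder arrays -- a direct spatial correlation of two full 10-megapixel images would be far too slow, so this only makes sense if one input can be reduced to a small template), ndimage also lets you pick the output dtype up front:

import numpy
from scipy import ndimage

a = numpy.random.rand(512, 512).astype(numpy.float32)  # stand-in for image 1
b = numpy.random.rand(32, 32).astype(numpy.float32)    # small template taken from image 2
# unlike numpy.fft.fft2, ndimage lets you specify the output dtype directly
result = ndimage.correlate(a, b, output=numpy.float32, mode='wrap')
print result.shape, result.dtype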

EOL
I'll certainly investigate that for the larger project (one goal of which is *getting stuff done*; though there's also the goal of *trying new stuff*). But I imagine I'll be using Numpy a lot, so I'm still interested in techniques that can help with this problem.
Edmund
+1  A: 

This on SO says "Scipy 0.8 will have single precision support for almost all the fft code", and SciPy 0.8.0 beta 1 is just out.
(Haven't tried it myself, cowardly.)

Denis
A: 

As Denis has said, SciPy 0.8 supports single precision in the fftpack module. As implemented, the module decides between single and double precision based on the input array: if the input is a single-precision array, then the resulting FFT will be single-precision too.
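
A minimal sketch of what that looks like (assuming SciPy 0.8 or later; untested here):

import numpy
from scipy import fftpack

a = numpy.random.rand(1024, 1024).astype(numpy.float32)
out = fftpack.fft2(a)
print out.dtype    # complex64 with single-precision support, half the size of complex128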

Justin Peel