tags:

views:

2053

answers:

3

I am trying to implement a vision algorithm, which includes a prefiltering stage with a 9x9 Laplacian-of-Gaussian filter. Can you point to a document which explains fast filter implementations briefly? I think I should make use of FFT for most efficient filtering.

+2  A: 

Here is a theory link http://hebb.mit.edu/courses/9.29/2002/readings/c13-1.pdf

And here is a link to fftw, which is a pretty good FFT library that I've used in the past (check licenses to make sure it is suitable) http://www.fftw.org/

All you do is FFT your image and kernel (the 9x9 matrix). Multiply together, then back transform.

However, with a 9x9 matrix you may still be better doing it in real coordinates (just with a double loop over the image pixels and the matrix). Try both ways!

Greg Reynolds
+6  A: 

Are you sure you want to use FFT? That will be a whole-array transform, which will be expensive. If you've already decided on a 9x9 convolution filter, you don't need any FFT.

Generally, the cheapest way to do convolution in C is to set up a loop that moves a pointer over the array, summing the convolved values at each point and writing the data to a new array. This loop can then be parallelised using your favourite method (compiler vectorisation, MPI libraries, OpenMP, etc).

Regarding the boundaries:

  • If you assume the values to be 0 outside the boundaries, then add a 4 element border of 0 to your 2d array of points. This will avoid the need for `if` statements to handle the boundaries, which are expensive.
  • If your data wraps at the boundaries (ie it is periodic), then use a modulo or add a 4 element border which copies the opposite side of the grid (abcdefg -> fgabcdefgab for 2 points). **Note: this is what you are implicitly assuming with any kind of Fourier transform, including FFT**. If that is not the case, you would need to account for it before any FFT is done.

The 4 points are because the maximum boundary overlap of a 9x9 kernel is 4 points outside the main grid. Thus, n points of border needed for a 2n+1 x 2n+1 kernel.

If you need this convolution to be really fast, and/or your grid is large, consider partitioning it into smaller pieces that can be held in the processor's cache, and thus calculated far more quickly. This also goes for any GPU-offloading you might want to do (they are ideal for this type of floating-point calculation).

Phil H
Using a boundary of zeroes assumes the data is fairly white and zero-mean. Using a blur filter on non-zero mean data with zero boundary would lead to unwanted distortions on the edges.
Jukka Dahlbom
True enough. Using FFT would instead assume that the data wraps at the boundaries, which may also be wrong. The zeroes were to remove expensive ifs. I'll add something about boundaries.
Phil H
Jukka the boundary always suffers. You have to do something to account for it and Phil mentions a couple traditional methods. The only way to not suffer from the boundary is to do the 2d convolution and then crop by 4 pixels on all the sides of the image.
Trevor Boyd Smith
Trevor: First, the boundaries bit is an edit after Jukka's comment. Also, your cropping method will still cause problems if it is not true to set the boundary value to 0. If the background of an image was blue, for example, that would not work.
Phil H
A: 

Actually you don't need to use a FFT size large enough to hold the entire image. You can do a lot of smaller overlapping 2d ffts. You can search for "fast convolution" "overlap save" "overlap add".

However, for a 9x9 kernel. You may not see much advantage speedwise.

Mark Borgerding