3d convolution in c++

+2 Q:

Hello, I'm looking for some source code implementing 3d convolution. Ideally, I need C++ code or CUDA code. I'd appreciate it if anybody could point me to a nice, fast implementation :-)

Cheers

+3 A:

you understand that convolution is normally computed with an fft (transform both inputs, multiply them pointwise, transform back)? see, for example, http://en.wikipedia.org/wiki/Convolution

so you need an fft library.

http://stackoverflow.com/questions/1548809/fastest-method-to-compute-convolution suggests http://www.fftw.org/ (for a traditional cpu).

for cuda, use cufft - http://www.gsic.titech.ac.jp/~ccwww/tebiki/tesla%5Fe/tesla6%5Fe.html
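To make the FFT route concrete, here is a minimal 1D sketch of the convolution theorem in plain C++. It is only an illustration under stated assumptions: the naive O(n^2) `dft` function is a stand-in for a real FFT library such as FFTW or cuFFT, and the names `dft` and `dft_convolve` are made up for this example. A 3D version would apply the same pad, transform, multiply, inverse-transform steps along all three axes.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive O(n^2) discrete Fourier transform, standing in for a real FFT.
// sign = -1 is the forward transform, sign = +1 the unnormalised inverse.
std::vector<cd> dft(const std::vector<cd>& in, int sign) {
    assert(sign == 1 || sign == -1);
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cd> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        cd sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            // Reduce k*t modulo n to keep the angle small and accurate.
            const double angle = sign * 2.0 * pi *
                static_cast<double>((k * t) % n) / static_cast<double>(n);
            sum += in[t] * cd(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Linear convolution via the convolution theorem: zero-pad both inputs to
// length a.size() + b.size() - 1, transform, multiply pointwise, transform
// back, and divide by the length to normalise the inverse.
std::vector<double> dft_convolve(const std::vector<double>& a,
                                 const std::vector<double>& b) {
    if (a.empty() || b.empty()) return {};
    const std::size_t n = a.size() + b.size() - 1;
    std::vector<cd> fa(n), fb(n);  // value-initialised to zero
    for (std::size_t i = 0; i < a.size(); ++i) fa[i] = a[i];
    for (std::size_t i = 0; i < b.size(); ++i) fb[i] = b[i];

    std::vector<cd> A = dft(fa, -1);
    const std::vector<cd> B = dft(fb, -1);
    for (std::size_t i = 0; i < n; ++i) A[i] *= B[i];

    const std::vector<cd> c = dft(A, +1);
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = c[i].real() / static_cast<double>(n);
    return out;
}
```

The zero padding matters: without it the product of transforms gives a circular convolution, where the tail wraps around onto the start.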

andrew cooke 2009-12-23 00:24:30

For small kernels it can sometimes be faster to use matrix convolution, in cases where there is hardware to support it (e.g., a GPU for 4x4 or 8x8 kernels). For big kernels, Fourier is da man for sure.
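A rough way to see where the crossover between the two approaches might lie is to count floating-point operations. The sketch below is a back-of-the-envelope model, not a benchmark: the function names are hypothetical, the `5 * M * log2(M)` cost per complex FFT is a commonly quoted rule of thumb used here as an assumption, and memory traffic (which often dominates in practice) is ignored entirely.

```cpp
#include <cassert>
#include <cmath>

// Flops for direct convolution of an n^3 volume with a k^3 kernel:
// one multiply and one add per kernel tap per output voxel.
double direct_flops(double n, double k) {
    assert(n > 0 && k > 0);
    return 2.0 * (n * n * n) * (k * k * k);
}

// Rough flops for FFT-based convolution (ASSUMED cost model):
// each axis is padded to n + k - 1 samples (no rounding to a power of two),
// two forward transforms and one inverse at 5 * M * log2(M) flops each,
// plus one complex multiply (6 flops) per padded sample.
double fft_flops(double n, double k) {
    assert(n > 0 && k > 0);
    const double m = n + k - 1.0;
    const double M = m * m * m;
    return 3.0 * 5.0 * M * std::log2(M) + 6.0 * M;
}
```

With these assumed constants, a 3x3x3 kernel on a 128^3 volume comes out cheaper when done directly, while a 15x15x15 kernel comes out cheaper via the FFT. Real timings depend heavily on the hardware and on memory access patterns, so treat this only as a first guess before measuring.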

Crashworks 2009-12-23 01:00:20

FWIW, the original source for cufft docs is here: http://www.nvidia.com/object/cuda_develop.html

Steve Fallows 2009-12-23 01:03:05

Are you a registered developer? If so, you should download the 3.0 SDK and check out the FDTD3d sample, which shows a 3d convolution as applied to an explicit finite-difference application. In the 2.3 SDK there was a similar sample called 3dfd (it has now been replaced).

It may be more efficient to use this approach rather than FFT if your impulse response is short.
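For reference, direct (non-FFT) 3D convolution on the CPU can be sketched as below. This is not the code from the SDK samples; it is a plain, unoptimised C++ illustration, and the `Volume` struct and `convolve3d` function are names invented for this example. It assumes a kernel with odd extents, zero padding at the borders, and an output the same size as the input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense 3D volume stored in one flat array, x varying fastest.
struct Volume {
    std::size_t nx, ny, nz;
    std::vector<float> data;

    Volume(std::size_t x, std::size_t y, std::size_t z)
        : nx(x), ny(y), nz(z), data(x * y * z, 0.0f) {}

    float& at(std::size_t x, std::size_t y, std::size_t z) {
        return data[(z * ny + y) * nx + x];
    }
    float at(std::size_t x, std::size_t y, std::size_t z) const {
        return data[(z * ny + y) * nx + x];
    }
};

// Direct "same-size" 3D convolution with zero padding at the borders.
// The kernel needs odd extents so that it has a well-defined centre.
Volume convolve3d(const Volume& in, const Volume& kernel) {
    assert(kernel.nx % 2 == 1 && kernel.ny % 2 == 1 && kernel.nz % 2 == 1);
    using idx = std::ptrdiff_t;
    auto u = [](idx v) { return static_cast<std::size_t>(v); };

    const idx nx = static_cast<idx>(in.nx), ny = static_cast<idx>(in.ny),
              nz = static_cast<idx>(in.nz);
    const idx kx_n = static_cast<idx>(kernel.nx), ky_n = static_cast<idx>(kernel.ny),
              kz_n = static_cast<idx>(kernel.nz);
    const idx rx = kx_n / 2, ry = ky_n / 2, rz = kz_n / 2;

    Volume out(in.nx, in.ny, in.nz);
    for (idx z = 0; z < nz; ++z)
    for (idx y = 0; y < ny; ++y)
    for (idx x = 0; x < nx; ++x) {
        float sum = 0.0f;
        for (idx kz = 0; kz < kz_n; ++kz) {
            // Convolution flips the kernel: the source index moves
            // opposite to the kernel offset from its centre.
            const idx sz = z - (kz - rz);
            if (sz < 0 || sz >= nz) continue;  // zero padding
            for (idx ky = 0; ky < ky_n; ++ky) {
                const idx sy = y - (ky - ry);
                if (sy < 0 || sy >= ny) continue;
                for (idx kx = 0; kx < kx_n; ++kx) {
                    const idx sx = x - (kx - rx);
                    if (sx < 0 || sx >= nx) continue;
                    sum += kernel.at(u(kx), u(ky), u(kz)) * in.at(u(sx), u(sy), u(sz));
                }
            }
        }
        out.at(u(x), u(y), u(z)) = sum;
    }
    return out;
}
```

A quick sanity check is to convolve a single-voxel impulse: the output should be a copy of the kernel centred on that voxel.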

Tom 2009-12-23 09:04:13

You can register at http://www.nvidia.com/object/cuda_get.html (click "Apply Now"). Alternatively, you can just look at the 3dfd sample in the current SDK; the concepts remain the same.

Tom 2009-12-24 09:49:29

Hello. Actually, I'm planning to use a kernel with small support (3x3 now, probably 7x7 later on), so direct convolution might be faster than using the FFT. Anyway, I can use either the FFTW library or the CUDA library. Does anyone know what speed-up the CUDA code gives?

I'm not a registered developer, so I don't have access to the 3.0 SDK. Can you point me to the web page to register?

Thanks

2009-12-23 15:15:30

This should probably have been two comments on the answers rather than a whole new answer! Apart from anything else, the original responders would then have been alerted to your follow-up question.

Tom 2009-12-24 09:48:16

crashworks suggested using a gpu directly for small kernels. i have no experience with that, but i think you'd need to use opengl or similar rather than opencl (because the latter exposes a generic c-like interface) and you're probably restricted to floats (also true of opencl on some devices).

andrew cooke 2009-12-24 22:09:00

this looks like a good article on explicit convolution using opencl - http://developer.amd.com/gpu/ATIStreamSDK/ImageConvolutionOpenCL/Pages/ImageConvolutionUsingOpenCL.aspx (it probably has timing measurements somewhere).

andrew cooke 2009-12-24 22:13:20

i was wrong above. at least, cuda supports convolution with texture memory, so i suspect opencl will too.

andrew cooke 2009-12-30 11:49:35

Intel has a very good example with both a serial version and a parallel version that uses SSE + OpenMP. The code is primarily meant for profiling the serial approach against the parallel one, but it is nicely done. http://software.intel.com/en-us/articles/16bit-3d-convolution-sse4openmp-implementation-on-penryn-cpu/
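As a rough idea of what the OpenMP half of such an approach can look like, here is a minimal sketch. It is not Intel's code and uses no SSE intrinsics. The function name `convolve3d_i16`, the flat x-fastest layout, the cubic kernel and the zero padding are all assumptions made for this example. If the compiler is not run with OpenMP enabled (for example `-fopenmp` on GCC or Clang), the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Direct 3D convolution of 16-bit samples with a cubic kernel of odd side
// `ks`, zero padding at the borders, 32-bit accumulation. Volumes are flat
// arrays with x varying fastest. When `parallel` is true and OpenMP is
// enabled, z slices are distributed across threads.
std::vector<std::int32_t> convolve3d_i16(const std::vector<std::int16_t>& in,
                                         int nx, int ny, int nz,
                                         const std::vector<std::int16_t>& kernel,
                                         int ks, bool parallel) {
    assert(nx > 0 && ny > 0 && nz > 0 && ks > 0 && ks % 2 == 1);
    assert(in.size() == static_cast<std::size_t>(nx) * ny * nz);
    assert(kernel.size() == static_cast<std::size_t>(ks) * ks * ks);
    const int r = ks / 2;
    std::vector<std::int32_t> out(in.size(), 0);

    // Each output voxel is written by exactly one iteration, so the
    // z slices are independent and need no synchronisation.
    #pragma omp parallel for if(parallel)
    for (int z = 0; z < nz; ++z) {
        for (int y = 0; y < ny; ++y) {
            for (int x = 0; x < nx; ++x) {
                std::int32_t sum = 0;
                for (int kz = 0; kz < ks; ++kz) {
                    const int sz = z - (kz - r);  // flipped kernel offset
                    if (sz < 0 || sz >= nz) continue;
                    for (int ky = 0; ky < ks; ++ky) {
                        const int sy = y - (ky - r);
                        if (sy < 0 || sy >= ny) continue;
                        for (int kx = 0; kx < ks; ++kx) {
                            const int sx = x - (kx - r);
                            if (sx < 0 || sx >= nx) continue;
                            const std::size_t ki =
                                (static_cast<std::size_t>(kz) * ks + ky) * ks + kx;
                            const std::size_t si =
                                (static_cast<std::size_t>(sz) * ny + sy) * nx + sx;
                            sum += static_cast<std::int32_t>(kernel[ki]) * in[si];
                        }
                    }
                }
                out[(static_cast<std::size_t>(z) * ny + y) * nx + x] = sum;
            }
        }
    }
    return out;
}
```

Calling it once with `parallel` set to false and once with true, and comparing both the results and the timings, gives the same kind of serial-versus-parallel comparison the article is built around.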

Sayan Ghosh 2010-04-14 17:09:43
