For a low-pass FIR filter, when should one use FFT and IFFT instead of time-domain convolution?
The goal is to minimize the CPU time needed for real-time processing. As far as I know, an FFT has about O(n log n) complexity, while direct convolution in the time domain is O(n²) when the signal and the filter both have length n (more generally, O(N·M) for N samples and M coefficients). To implement a low-pass filter in the frequency domain, one takes the FFT of the signal, multiplies each frequency bin by the filter coefficients (transformed into the frequency domain), and then applies the IFFT.
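To make the frequency-domain procedure concrete, here is a minimal sketch in pure Python that compares it with direct convolution. The function names (`fft`, `direct_convolve`, `fft_convolve`) are made up for illustration, and the recursive radix-2 FFT is only a stand-in for an optimized library. Note that both inputs are zero-padded to at least `len(x) + len(h) - 1` samples, because multiplying spectra otherwise yields circular rather than linear convolution.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (illustrative only).

    len(x) must be a power of two. The inverse transform is left
    unscaled; the caller divides by the length.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def direct_convolve(x, h):
    """Time-domain convolution: len(x) * len(h) multiply-adds."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def fft_convolve(x, h):
    """FFT, multiply bin by bin, IFFT, with zero-padding."""
    out_len = len(x) + len(h) - 1
    size = 1
    while size < out_len:          # next power of two >= out_len
        size *= 2
    X = fft(list(x) + [0.0] * (size - len(x)))
    H = fft(list(h) + [0.0] * (size - len(h)))  # filter in frequency domain
    Y = [a * b for a, b in zip(X, H)]
    y = fft(Y, inverse=True)
    # Scale the unscaled inverse and drop the padding.
    return [v.real / size for v in y[:out_len]]

signal = [1.0, 2.0, 3.0, 4.0, 5.0]
taps = [0.25, 0.5, 0.25]           # toy low-pass coefficients
y_direct = direct_convolve(signal, taps)
y_fft = fft_convolve(signal, taps)
```

Both paths should agree up to floating-point rounding. For a continuous real-time stream the same idea is normally applied block by block (overlap-add or overlap-save), with the filter's spectrum computed once in advance.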
So the question is: when is it justified to use frequency-domain (FFT+IFFT) filtering instead of a direct convolution-based FIR filter? Say one has 32 fixed-point coefficients: should FFT+IFFT be used or not? How about 128 coefficients? And so on...
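One way to frame the 32-versus-128 comparison is a back-of-the-envelope multiply count per output sample. The sketch below is only a crude model under stated assumptions: a complex radix-2 FFT of size N is charged 2·N·log2(N) real multiplies, each block needs one FFT, one IFFT and N complex products (overlap-add, with the filter spectrum precomputed), and memory traffic, additions, SIMD and real-input FFT savings are all ignored. The function names are hypothetical.

```python
import math

def direct_cost_per_output(num_taps):
    """Real multiplies per output sample for a direct-form FIR."""
    return num_taps

def fft_cost_per_output(num_taps, fft_size):
    """Rough real multiplies per output sample for overlap-add filtering.

    Assumptions (not measurements): 2*N*log2(N) real multiplies per
    complex radix-2 FFT, 4 real multiplies per complex product.
    """
    block_len = fft_size - num_taps + 1   # new samples consumed per block
    if block_len <= 0:
        raise ValueError("fft_size must be at least num_taps")
    one_fft = 2 * fft_size * math.log2(fft_size)
    total = 2 * one_fft + 4 * fft_size    # FFT + IFFT + bin-wise product
    return total / block_len

for taps, size in [(32, 256), (128, 1024)]:
    print(taps, direct_cost_per_output(taps), fft_cost_per_output(taps, size))
```

In this crude model the 32-tap case comes out slightly in favor of direct convolution and the 128-tap case in favor of the FFT, but the real break-even point depends on the FFT library, the fixed-point format, cache behavior and how well the direct loop vectorizes, so it is worth measuring on the target hardware.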
I am trying to optimize existing source code (a convolution-based FIR filter), and I am unsure whether I should switch to an FFT-based approach or just optimize the current implementation with SSE.