views:

86

answers:

1

I have a real vector time series x of length T and a filter h of length t << T. h is a filter in fourier space, real and symmetric. It is approximately 1/f.

I would like to filter x with h to get y.

Suppose t == T and FFT's of length T could fit into memory (neither of which are true). To get my filtered x in python, I would do:

import numpy as np
from scipy.signal import fft, ifft

y = np.real( np.ifft( np.fft(x) * h ) ) )

Since the conditions don't hold, I tried the following hack:

  1. Select a padding size P < t/2, select a block size B such that B + 2P is a good FFT size
  2. Scale h via spline interpolation to be of size B + 2P > t (h_scaled)
  3. y = []; Loop:
    • Take block of length B + 2P from x (called x_b)
    • Perform y_b = ifft(fft( x_b ) * h_scaled)
    • Drop padding P from either side of y_b and concatenate with y
    • Select next x_b overlapping with last by P

Is this a good strategy? How do I select the padding P in a good way? What is the proper way to do this? I don't know much signal processing. This is a good chance to learn.

I am using cuFFT to speed things up so it would be great if the bulk of the operations are FFTs. The actual problem is 3D. Also, I am not concerned about artifacts from an acausal filter.

Thanks, Paul.

+5  A: 

You're on the right track. The technique is called overlap-save processing. Is t short enough that FFTs of that length fit in memory? If so, you can pick your block size B such that B > 2*min(length(x),length(h)) and makes for a fast transform. Then when you process, you drop the first half of y_b, rather than dropping from both ends.

To see why you drop the first half, remember that the spectral multiplication is the same as circular convolution in the time domain. Convolving with the zero-padded h creates weird glitchy transients in the first half of the result, but by the second half all the transients are gone because the circular wrap point in x is aligned with the zero part of h. There's a good explanation of this, with pictures, in "Theory and Application of Digital Signal Processing" by Lawrence Rabiner and Bernard Gold.

It's important that your time domain filter taper to 0 at least on one end so that you don't get ringing artifacts. You mention that h is real in the frequency domain, which implies that it has all 0 phase. Usually, such a signal will be continuous only in a circular fashion, and when used as a filter will create distortion all through the frequency band. One easy way to create a reasonable filter is to design it in the frequency domain with 0 phase, inverse transform, and rotate. For instance:

def OneOverF(N):
    import numpy as np
    N2 = N/2; #N has to be even!
    x = np.hstack((1, np.arange(1, N2+1), np.arange(N2-1, 0, -1)))
    hf = 1/(2*np.pi*x/N2)
    ht = np.real(np.fft.ifft(hf)) # discard tiny imag part from numerical error
    htrot = np.roll(ht, N2)
    htwin = htrot * np.hamming(N)
    return ht, htrot, htwin

(I'm pretty new to Python, please let me know if there's a better way to code this).

If you compare the frequency responses of ht, htrot, and htwin, you see the following (x-axis is normalized frequency up to pi): 1/f filters with length 64

ht, at the top, has lots of ripple. This is due to the discontinuity at the edge. htrot, in the middle, is better, but still has ripple. htwin is nice and smooth, at the expense of flattening out at a slightly higher frequency. Note that you can extend the length of the straight-line section by using a bigger value for N.

I wrote about the discontinuity issue, and also wrote a Matlab/Octave example in another SO question if you want to see more detail.

mtrw
Thanks for the overlap-save reference. I had read about it in Press et al., Numerical Recipes, with regards to time domain filtering and I wasn't sure how to map this to frequency domain. I am not sure about dropping: 1) why the second half of y_b rather than the ends, 2) in your other SO post, you drop the first half.
Paul
My filter h is derived from an average over raw data, with h(f) ~ 1/f and phases set to 0. I am filtering a synthetic signal with this filter to give it a spectrum more like my raw data. I am not sure how to design this filter in the time domain. One thing you pointed out is that ifft(h) better be zero at one end to avoid ringing artifacts. I will check this as it is very likely not. Is there an analogue to applying a Hamming window in the time domain to some window method in the frequency domain (your first example in your other SO post)?
Paul
Yeesh - sorry about screwing up the 1st half/2nd half issue. I updated with that correction, and some thoughts about generating a well behaved `h`.
mtrw
Are you by any chance trying to create random noise with a `1/f` frequency distribution? There are some techniques to generate such signals directly - see http://local.wasp.uwa.edu.au/~pbourke/fractals/noise/ for instance.
mtrw
Thanks very much for the code and illustrations. That is very helpful. I will have to read up using your tips. I have borrowed the book by Proakis. Regarding generation of 1/f distributions, I am doing something similar to the example 'Frequency Synthesis Of Landscapes' in your link as a test case, except that I am making a movie. The deterministic generator was the most interesting.
Paul
I've never done any 2D processing - I'd be curious what new issues you run into. Good luck!
mtrw
Because you asked: x = [1] + [t for t in range(1, N2+1)] + [t for t in range(N2-1, 0, -1)] is better written as hstack((1,arange(1,n2+1),arange(n2-1,0,-1))); hf = [1/(2*np.pi*t/N2) for t in x] as hf=1/(2*np.pi*x/2n)
tillsten
@tillsten - Thanks. Took me a while to figure out that hstack needs a tuple - at first I just thought you were getting carried away with the parentheses. Are these constructs "better" because they're faster?
mtrw
happy to help! i love numpy and still wonder why the various stack commands need a tuple. yes, it is faster then python list processing. maybe they are cases where it is fast to concatenate list then arrays, because the arrays are alinged in memory for fast access.
tillsten