After your FFT and filter, you need to do an inverse FFT to get the data back into the time domain. Then you append the resulting samples to your .WAV file.
As far as producing the file itself goes, the format is widely documented (Googling for ".WAV format" should turn up more results than you have any use for), and pretty simple. It starts with a short header (called a "chunk") that says it's a "RIFF" file whose contents are of type "WAVE". Then there's a "fmt " chunk that describes the format of the samples (bits per sample, samples per second, number of channels, etc.). Then there's a "data" chunk that contains the samples themselves.
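To make the layout concrete, here's a minimal sketch in Python that builds those three chunks by hand. It assumes 16-bit PCM samples, and the function name `build_wav` and its parameters are just names I made up for illustration, not part of any library.

```python
import struct

def build_wav(samples, sample_rate=44100, channels=1):
    """Return the bytes of a 16-bit PCM .WAV file (hypothetical helper).

    `samples` is a list of ints in the range -32768..32767, interleaved
    by channel if there is more than one channel.
    """
    bits_per_sample = 16
    block_align = channels * bits_per_sample // 8   # bytes per sample frame
    byte_rate = sample_rate * block_align           # bytes per second

    # Raw sample data, little-endian signed 16-bit.
    data = struct.pack("<%dh" % len(samples), *samples)

    # "fmt " chunk: 16-byte body, format tag 1 means uncompressed PCM.
    fmt_chunk = struct.pack("<4sIHHIIHH", b"fmt ", 16, 1, channels,
                            sample_rate, byte_rate, block_align,
                            bits_per_sample)

    # "data" chunk: ID, byte count, then the samples themselves.
    data_chunk = struct.pack("<4sI", b"data", len(data)) + data

    # RIFF header: the size field counts everything after itself.
    riff_size = 4 + len(fmt_chunk) + len(data_chunk)
    header = struct.pack("<4sI4s", b"RIFF", riff_size, b"WAVE")

    return header + fmt_chunk + data_chunk
```

Writing the result is just `open("out.wav", "wb").write(build_wav(samples))`. If you're working in Python anyway, the standard library's `wave` module takes care of the header for you; the point here is only to show how little there is to it.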
Since it sounds like you're going to be doing this in real time, my advice would be to forget about doing your FFT, filter, and iFFT. An FIR filter will give essentially the same results, but generally a lot faster. The basic idea of the FIR filter is that instead of converting your data to frequency domain, filtering it, then converting back to time domain, you convert your filter coefficients to time domain, and apply them (fairly) directly to your input data. This is where DSPs earn their keep: nearly all of them have multiply-accumulate instructions, which can implement most of an FIR filter in one instruction. Even without that, however, getting an FIR filter to run in real time on a modern processor doesn't take any real tricks unless you're doing really fast sampling. In any case, it's a lot easier than getting an FFT/filter/iFFT to operate at the same speed.
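As a rough sketch of what that looks like, here's a plain-Python version. I'm assuming you want a low-pass filter and using a windowed-sinc design with a Hamming window to get the time-domain coefficients; that's one common choice, not the only one. The names `lowpass_taps` and `fir_filter`, and the default of 31 taps, are my own picks for illustration.

```python
import math

def lowpass_taps(cutoff_hz, sample_rate, num_taps=31):
    """Windowed-sinc low-pass coefficients, normalized to unity gain at DC."""
    fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
    mid = (num_taps - 1) / 2              # center of the impulse response
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (a sinc), with the x == 0 limit.
        if x == 0:
            ideal = 2 * fc
        else:
            ideal = math.sin(2 * math.pi * fc * x) / (math.pi * x)
        # Hamming window tapers the truncated sinc to reduce ripple.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def fir_filter(samples, taps):
    """Apply the taps to a stream of samples, one output per input."""
    history = [0.0] * len(taps)           # most recent sample first
    out = []
    for s in samples:
        history = [s] + history[:-1]
        acc = 0.0
        for h, x in zip(taps, history):
            acc += h * x                  # the multiply-accumulate step
        out.append(acc)
    return out
```

The inner loop is the part a DSP's multiply-accumulate instruction handles. For example, `fir_filter(block, lowpass_taps(1000, 8000))` would low-pass a block sampled at 8 kHz with a cutoff around 1 kHz. In a real-time setting you'd keep `history` between blocks instead of resetting it for each call.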