As in this example http://stackoverflow.com/questions/259451/how-to-extract-frequency-information-from-an-input-audio-stream-using-portaudio I'm curious about portaudio and numpy...

I'm not 100% sure about FFT. How can I pass numpy a chunk and get back three values from -1.0 to 1.0 for bass, mid and treble?

I don't mind if this is just for one channel, as I can make sense of the audio part of this; it's the maths that swims in front of me when I look at it :)

A: 

The Fourier Transform, mentioned in the selected answer to the SO question you point to, gives you the "spectrum" -- a large collection of values giving the sound intensity in each of various ranges/slices of frequencies (expressed, for example, in Hertz).
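
As a rough illustration of what "the spectrum" looks like in numpy, here is a minimal sketch. The sample rate, the chunk size and the 440 Hz test tone are assumptions made for the example; in practice the chunk would come from the audio callback, and the Hann window is just one common choice.

```python
import numpy as np

RATE = 44100   # assumed sample rate in Hz
CHUNK = 1024   # assumed number of samples per chunk

# Stand-in for one channel of captured audio: a 440 Hz sine wave.
t = np.arange(CHUNK) / RATE
chunk = np.sin(2 * np.pi * 440.0 * t)

# Window the chunk to reduce spectral leakage, then take the real-input FFT.
windowed = chunk * np.hanning(CHUNK)
magnitudes = np.abs(np.fft.rfft(windowed))

# Centre frequency (in Hz) of each FFT bin; bins are RATE / CHUNK Hz apart.
freqs = np.fft.rfftfreq(CHUNK, d=1.0 / RATE)

# The strongest bin should sit within one bin width of the test tone.
peak_hz = freqs[np.argmax(magnitudes)]
```

Each entry of `magnitudes` is the intensity of one frequency slice, and `freqs` says which slice it is.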

How to translate (say) a thousand intensities (one per 10-Hertz slice of the spectrum, say) into just three numbers, as you desire, is of course quite a heuristic issue -- for example you could just decide which ranges of frequencies correspond to "bass" and "treble", with everything in-between being "mid", and compute the average intensity in each. For what it's worth, I believe a common convention for "bass" is up to 250 Hz, for "treble" 6 kHz and above (in-between being the "midrange"), cf. e.g. this page -- but it's rather an arbitrary convention, so, "pick your poison"!-)
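
A minimal sketch of that averaging, assuming the 250 Hz and 6 kHz cut-offs mentioned above. The function name `band_levels` and its parameters are made up for this example, and the masking is done by numpy rather than in a Python loop.

```python
import numpy as np

def band_levels(chunk, rate, bass_max=250.0, treble_min=6000.0):
    """Return the mean FFT magnitude in the bass, mid and treble bands.

    The cut-off frequencies are an arbitrary convention, not a standard.
    """
    chunk = np.asarray(chunk, dtype=float)
    magnitudes = np.abs(np.fft.rfft(chunk * np.hanning(len(chunk))))
    freqs = np.fft.rfftfreq(len(chunk), d=1.0 / rate)

    # Boolean masks select the bins belonging to each band.
    bass = magnitudes[freqs < bass_max]
    mid = magnitudes[(freqs >= bass_max) & (freqs < treble_min)]
    treble = magnitudes[freqs >= treble_min]

    return np.array([bass.mean(), mid.mean(), treble.mean()])
```

Note that the chunk has to be long enough for the bass band to contain a few bins: at 44100 Hz a 1024-sample chunk has bins about 43 Hz apart, so only six of them fall below 250 Hz.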

Once you have the relative levels you'll want to normalize them with respect to each other and scale them appropriately to lie in your desired range (presumably on a logarithmic scale because that's how human hearing works;-).
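
One possible way to do that scaling, as a sketch only: express each level in decibels relative to the loudest band, clip at an assumed floor, and map the result linearly onto -1.0 to 1.0. The name `scale_levels` and the -60 dB floor are assumptions chosen for illustration.

```python
import numpy as np

def scale_levels(levels, floor_db=-60.0):
    """Map non-negative band levels to [-1.0, 1.0] on a logarithmic scale.

    The loudest band maps to 1.0; anything floor_db or more below it
    maps to -1.0. floor_db is an assumed, tunable value.
    """
    levels = np.asarray(levels, dtype=float)
    peak = levels.max()
    if peak <= 0.0:
        # Silence: report every band at the bottom of the range.
        return np.full(levels.shape, -1.0)

    # Decibels relative to the loudest band (always <= 0).
    eps = 1e-12  # avoids log10(0) for empty bands
    db = 20.0 * np.log10(np.maximum(levels, eps) / peak)
    db = np.clip(db, floor_db, 0.0)

    # Linear map: floor_db -> -1.0, 0 dB -> 1.0.
    return 2.0 * (db - floor_db) / (0.0 - floor_db) - 1.0
```

Because the levels are normalized against each other, this shows the balance between bands rather than overall loudness; scaling against a fixed reference level instead would keep quiet passages low in all three bands.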

Alex Martelli
Choosing the frequencies is a good point; for now it would be enough to use three equal-sized ranges. I'd rather get the 3 slices back from numpy than use Python to convert the spectrum, for speed. As this is for graphics this may be enough; if not, I can mess around with frequencies later. The main priority is performance and not doing too much processing in pure Python.
Stuart Axon
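
For the three equal-sized ranges asked for in the comment, a sketch that keeps all the per-sample work inside numpy could look like this. The function name `three_equal_bands` is hypothetical, and dropping the DC bin and applying a Hann window are choices made for the example.

```python
import numpy as np

def three_equal_bands(chunk):
    """Mean FFT magnitude in three (nearly) equal-width frequency ranges."""
    chunk = np.asarray(chunk, dtype=float)
    magnitudes = np.abs(np.fft.rfft(chunk * np.hanning(len(chunk))))

    # Drop the DC bin, then let numpy cut the rest into three slices.
    low, mid, high = np.array_split(magnitudes[1:], 3)
    return np.array([low.mean(), mid.mean(), high.mean()])
```

The only Python-level work is a handful of calls per chunk; the FFT, the split and the means all run in compiled numpy code.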