views: 691
answers: 4

OK, what I'm trying to do is a kind of audio-processing program that can detect the prevalent frequency in the input; if that frequency is held for long enough (a few ms), I know I have a positive match. I know I would need to use an FFT or something similar, but I'm weak in this area of math. I searched the internet but did not find code that does just this.

The goal I'm trying to achieve is a custom protocol for sending data through sound. I need only a very low bitrate (5-10 bps), but I'm also very limited on the transmitting end, so the receiving software will need to be custom (I can't use an actual hardware/software modem). I also want this to be software only (no additional hardware except the sound card).
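To make the idea concrete, here is a rough sketch of the kind of scheme I mean: one tone per bit, detected by the strongest FFT bin. The tone frequencies, symbol length, and sample rate below are arbitrary placeholders, not anything final.

```python
import numpy as np

RATE = 8000                 # sample rate in Hz (assumed)
SYMBOL_SEC = 0.1            # 100 ms per bit -> 10 bps
FREQ0, FREQ1 = 1000, 2000   # tones for bit 0 / bit 1 (arbitrary choices)

def encode(bits):
    """Concatenate one sine tone per bit."""
    t = np.arange(int(RATE * SYMBOL_SEC)) / RATE
    return np.concatenate(
        [np.sin(2 * np.pi * (FREQ1 if b else FREQ0) * t) for b in bits])

def decode(samples):
    """Split into symbol-sized chunks and pick the dominant FFT bin."""
    n = int(RATE * SYMBOL_SEC)
    bits = []
    for i in range(0, len(samples) - n + 1, n):
        spectrum = np.abs(np.fft.rfft(samples[i:i + n]))
        freq = spectrum.argmax() * RATE / n   # bin index -> Hz
        bits.append(1 if abs(freq - FREQ1) < abs(freq - FREQ0) else 0)
    return bits

print(decode(encode([1, 0, 1, 1, 0])))   # -> [1, 0, 1, 1, 0]
```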

Thanks a lot for the help.

+1  A: 

While I haven't tried audio processing with Python before, perhaps you could build something based on SciPy (or NumPy, which it builds on), a framework for efficient scientific/engineering numerical computation. You might start by looking at scipy.fftpack for your FFT.
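As a starting point, a minimal sketch of finding the dominant frequency of a chunk of samples. numpy.fft is used here for brevity (scipy.fftpack.rfft works similarly but has a different output layout), and the 440 Hz test tone is just a stand-in for real input:

```python
import numpy as np

RATE = 44100     # assumed sample rate (Hz)
N = 4096         # FFT size; bin spacing is RATE / N (about 10.8 Hz)

# Stand-in for one chunk of recorded audio: a 440 Hz test tone
t = np.arange(N) / RATE
samples = np.sin(2 * np.pi * 440.0 * t)

# Window, transform, and pick the strongest bin (skipping bin 0, the DC offset)
spectrum = np.abs(np.fft.rfft(samples * np.hanning(N)))
peak_bin = spectrum[1:].argmax() + 1
peak_freq = peak_bin * RATE / N    # bin index -> Hz
print("peak at %.1f Hz" % peak_freq)   # nearest bin to 440 Hz
```

The answer is only accurate to one FFT bin; interpolating around the peak (as in the other answer here) tightens that up considerably.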

Karmastan
OK, I found this: http://www.swharden.com/blog/2010-03-05-realtime-fft-graph-of-audio-wav-file-or-microphone-input-with-python-scipy-and-wckgraph/ though now I wonder how I will find the frequency range with the highest value. (The SciPy pointer helped a bit, thanks.)
Tsuki
+2  A: 

The aubio libraries have been wrapped with SWIG and can thus be used from Python. Among their many features are several methods for pitch detection/estimation, including the YIN algorithm and some harmonic comb algorithms.

However, if you want something simpler, I wrote some code for pitch estimation some time ago and you can take it or leave it. It won't be as accurate as using the algorithms in aubio, but it might be good enough for your needs. I basically just took the FFT of the data multiplied by a window (a Blackman window in this case), squared the FFT values, found the bin with the highest value, and used quadratic interpolation around the peak (using the log of the max value and its two neighboring values) to find the fundamental frequency. The quadratic interpolation I took from a paper that I found.

It works fairly well on test tones, but it will not be as robust or as accurate as the other methods mentioned above. The accuracy can be increased by increasing the chunk size (or reduced by decreasing it). The chunk size should be a power of 2 to make full use of the FFT. Also, I am only determining the fundamental pitch for each chunk, with no overlap. I used PyAudio to play the sound through while writing out the estimated pitch.

Source Code:

# Read in a WAV and find the freq's
import struct
import wave

import numpy as np
import pyaudio

chunk = 2048

# open up a wave (assumes a 16-bit mono file)
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()
# use a Blackman window
window = np.blackman(chunk)
# open stream
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(swidth),
                channels=wf.getnchannels(),
                rate=RATE,
                output=True)

# read some data
data = wf.readframes(chunk)
# play stream and find the frequency of each chunk
while len(data) == chunk * swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the data and multiply by the Blackman window
    indata = np.array(struct.unpack("%dh" % (len(data) // swidth),
                                    data)) * window
    # take the FFT and square each value
    fftData = abs(np.fft.rfft(indata)) ** 2
    # find the bin with the maximum value, skipping bin 0 (the DC offset)
    which = fftData[1:].argmax() + 1
    # use quadratic (parabolic) interpolation around the max
    if which != len(fftData) - 1:
        y0, y1, y2 = np.log(fftData[which - 1:which + 2:])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency and output it
        thefreq = (which + x1) * RATE / chunk
        print("The freq is %f Hz." % thefreq)
    else:
        thefreq = which * RATE / chunk
        print("The freq is %f Hz." % thefreq)
    # read some more data
    data = wf.readframes(chunk)
if data:
    stream.write(data)
stream.stop_stream()
stream.close()
p.terminate()
Justin Peel
Wow, great, thanks, this looks like it will do. Now I only have to figure out how to read the audio in real time from the audio input (microphone).
Tsuki
Go to the PyAudio site http://people.csail.mit.edu/hubert/pyaudio/ and scroll down the page to the examples. You'll see some that take input from the microphone.
Justin Peel
Hmm, can you help me figure out why this error is happening: "need more than 0 values to unpack" on the following line: "y0,y1,y2 = np.log(fftData[which-1:which+2:])"
Tsuki
Yeah, that was kind of buggy there. I've fixed it. The problem was that if which was equal to 0 or to the last index of fftData, the slice wouldn't return 3 values. We don't want the value in bin 0 of fftData anyway (it is the DC offset).
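A quick sketch (with made-up spectrum values) of why that slice could come back empty, and why skipping bin 0 helps:

```python
import numpy as np

# A made-up spectrum whose biggest value sits in bin 0 (the DC offset)
fftData = np.array([9.0, 1.0, 5.0, 2.0])

# Taking argmax over the whole array lands on bin 0, and the slice
# fftData[-1:2] is empty -- hence "need more than 0 values to unpack"
which = fftData.argmax()
print(len(fftData[which - 1:which + 2]))   # -> 0

# Skipping bin 0 and shifting the index gives a proper 3-value slice
which = fftData[1:].argmax() + 1
print(fftData[which - 1:which + 2])        # three values around the peak
```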
Justin Peel
A: 

Hey guys, I have a question. I'm a beginner at Python. I recently created a forum for a class project at university. I want to get into more of what I love, which is audio. I want to develop a noise-cancelling tool which takes the input from the mic, shifts the phase of all frequencies by 180 degrees, and plays it to the speaker in real time. Is this possible with Python?

Lochana
Hmm, try opening a new question. It is doable, maybe with a somewhat different approach, but yes. I'm interested in this too :)
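For what it's worth, a 180-degree phase shift of every frequency is just negation of the waveform in the time domain. A minimal sketch, with a synthetic tone standing in for the noise (all values assumed); real-time cancellation is much harder because of mic/speaker latency and frequency response:

```python
import numpy as np

# A toy "noise" signal: one second of a 440 Hz tone sampled at 8000 Hz
rate = 8000
t = np.arange(rate) / rate
noise = np.sin(2 * np.pi * 440 * t)

# Shifting the phase of all frequencies by 180 degrees is simply negation
anti_noise = -noise

# Summed together they cancel exactly (in software, at least)
residual = noise + anti_noise
print(np.max(np.abs(residual)))   # -> 0.0
```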
Tsuki
A: 

Looks nice. Is there a way of doing accurate frequency detection (let's say with 524288-point resolution) of the maximum peaks in a user-specified area? Say you have a WAV file (could be mono) for which you specify the lower and upper frequency limits and the peak detection level. Do you know how to automate such a process of peak detection across multiple files, and how to store the search results in a txt/csv file?

happy
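A sketch of what such a batch run could look like, numpy only. Synthesized tones stand in for the WAV files, and the file names, band limits, and detection level are all made-up placeholders:

```python
import csv
import numpy as np

RATE = 44100
N = 524288     # FFT length, roughly 0.084 Hz per bin at 44.1 kHz

def peaks_in_band(samples, f_lo, f_hi, level):
    """Frequencies/magnitudes of local spectral maxima above `level`
    between f_lo and f_hi (Hz)."""
    spectrum = np.abs(np.fft.rfft(samples * np.hanning(len(samples)), n=N))
    freqs = np.arange(len(spectrum)) * RATE / N
    mask = (freqs >= f_lo) & (freqs <= f_hi) & (spectrum > level)
    peaks = []
    for i in np.flatnonzero(mask):
        # keep only local maxima within the band
        if 0 < i < len(spectrum) - 1 and spectrum[i - 1] <= spectrum[i] > spectrum[i + 1]:
            peaks.append((freqs[i], spectrum[i]))
    return peaks

# Hypothetical batch run: synthesized tones stand in for reading WAV files
t = np.arange(N) / RATE
test_files = {"a.wav": 440.0, "b.wav": 1000.0}
with open("peaks.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["file", "freq_hz", "magnitude"])
    for name, tone in test_files.items():
        samples = np.sin(2 * np.pi * tone * t)
        for freq, mag in peaks_in_band(samples, 200.0, 2000.0, 10000.0):
            writer.writerow([name, "%.2f" % freq, "%.1f" % mag])
```

For real input, the synthesis step would be replaced by reading each file with the wave module (as in the answer above) and the detected frequency is still only as accurate as one FFT bin unless you interpolate around each peak.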