I want to compare two audio files (voice recordings) and determine whether they are identical, at least to some extent. I have settled on using an FFT (OouraFFT). I have integrated the code, passed my audio file as input, and called "calculateWelchPeriodogramWithNewSignalSegment". That method uses a term called spectrum data. What should I use to compare the two audio files? Could someone please explain the concept of using an FFT to compare two audio signals (speech signals)? What should I do next? Any information would be helpful. Thanks in advance.

EDIT:

MyAudioFile *audioFile = [[MyAudioFile alloc]init];
OSStatus result = [audioFile open:var ofType:@"wav"];
int numFrequencies=16384;
int kNumFFTWindows=10;

OouraFFT *myFFT = [[OouraFFT alloc] initForSignalsOfLength:numFrequencies*2 andNumWindows:kNumFFTWindows];
for(long i=0; i<myFFT.dataLength; i++)
{
    myFFT.inputData[i] = (double)audioFile.audioData[i];
} 
[myFFT calculateWelchPeriodogramWithNewSignalSegment];
NSLog(@"the spectrum data 1 is  %f ",myFFT.spectrumData[1]);
NSLog(@"the spectrum data 2 is  %f",myFFT.spectrumData[2]);
NSLog(@"the spectrum data 8192 is  %f ",myFFT.spectrumData[8192]);

I have created MyAudioFile class which contains

-(OSStatus)open:(NSString *)fileName ofType:(NSString *)fileType{
    OSStatus result = -1;

    CFStringRef filePath=fileName;

    CFURLRef audioFileURL = CFURLCreateWithFileSystemPath(kCFAllocatorDefault, (CFStringRef)filePath, kCFURLPOSIXPathStyle, false);
    //open audio file
    result = AudioFileOpenURL (audioFileURL, kAudioFileReadPermission, 0, &mAudioFile);
    if (result == noErr) {
        //get  format info
        UInt32 size = sizeof(mASBD);

        result = AudioFileGetProperty(mAudioFile, kAudioFilePropertyDataFormat, &size, &mASBD);

        UInt32 dataSize = sizeof packetCount;
        result = AudioFileGetProperty(mAudioFile, kAudioFilePropertyAudioDataPacketCount, &dataSize, &packetCount);
        //pass the format string to NSLog directly instead of a pre-built string
        NSLog(@"File Opened, packet Count: %llu", (unsigned long long)packetCount);

        UInt32 packetsRead = packetCount;
        UInt32 numBytesRead = -1;
        if (packetCount > 0) {
            //allocate  buffer (2 bytes per packet, i.e. this assumes 16-bit mono PCM)
            audioData = (SInt16*)malloc( 2 *packetCount);
            //read the packets
            result = AudioFileReadPackets (mAudioFile, false, &numBytesRead, NULL, 0, &packetsRead,  audioData);
            NSLog(@"Read %u  bytes,  %u packets", (unsigned)numBytesRead, (unsigned)packetsRead);
        }
    }
    else
        NSLog(@"Could not open file: %@", fileName);

    CFRelease (audioFileURL);
    return result;
}

I think I am now done with the FFT: myFFT.spectrumData[i] holds the output values for different values of i.

Should I stop here and switch to the Accelerate framework for the FFT? I am confused. Please tell me which one to use.

+2  A: 

I am not sure that an FFT is what you want to use in this scenario. The FFT (via the periodogram you are computing) will provide you with an estimate of the power spectral density (PSD) of the signal. This means that you will get a plot of signal power versus frequency. Notice there is no time in there. In other words, you would only be able to tell whether two signals have the same frequency distribution, not whether their time-domain signals are identical. For this I think you would want to use something more along the lines of a cross-correlation, which measures the similarity of two waveforms as a function of the time shift between them and gives you a value for how similar they are. There may be more sophisticated ways of doing this, but this is off the top of my head.

-Eric

Eric Seifert
+2  A: 

This is actually a pretty tough problem, but I would say that working in the frequency space is useful. Also, as the author of the OouraFFT library (the ObjC wrapper around Prof. Ooura's pretty old FFT implementation), I would recommend NOT using it if you can instead adopt Apple's Accelerate library. It's much faster, more accurate, and will be actively maintained. My library will not be; I've switched entirely to Accelerate for my own work.

Anyhoo, it's useful to work in frequency space, because any small offset in the time domain will cause you a lot of headaches when working with cross-correlations. If you instead do a short-time Fourier transform, you can apply the methods published by the engineers of the Shazam iPhone app, which, at first glance, seem to be robust to this problem. Best of luck, you've got a lot of work ahead of you.

alexbw
@alexbw --> I have edited my post; please give your valuable comments now.
Warrior
I really have no more valuable comments. It looks like you're able to compute the FFT, which is a good first step. You now have to use the frequency data in an algorithm (either of your own devising or the method that I linked to above) to tell whether the two signals are the same. You've got the tools, now build the house.
alexbw