tags:

views:

29

answers:

2

Hey guys,

I'm writing an iPhone app which should record the users voice, and feed the audio data into a library for modifications such as changing tempo and pitch. I started off with the SpeakHere example code from Apple:

http://developer.apple.com/library/ios/#samplecode/SpeakHere/Introduction/Intro.html

That project lays the groundwork for recording the user's voice and playing it back. It works well.

Now I'm diving into the code and I need to figure out how to feed the audio data into the SoundTouch library (http://www.surina.net/soundtouch/) to change the pitch. I became familiar with the Audio Queue framework while going through the code, and I found the place where I receive the audio data from the recording.

Essentially, you call AudioQueueNewInput to create a new input queue. You pass a callback function which is called every time a chunk of audio data is available. It is within this callback that I need to pass the chunks of data into SoundTouch.

I have it all setup, but the noise I play back from the SoundTouch library is very staticky (it barely resembles the original). If I don't pass it through SoundTouch and play the original audio it works fine.

Basically, I'm missing something about what the actual data I'm getting represents. I was assuming that I am getting a stream of shorts which are samples, 1 sample for each channel. That's how SoundTouch is expecting it, so it must not be right somehow.

Here is the code which sets up the audio queue so you can see how it is configured.

void AQRecorder::SetupAudioFormat(UInt32 inFormatID)
{
memset(&mRecordFormat, 0, sizeof(mRecordFormat));

UInt32 size = sizeof(mRecordFormat.mSampleRate);
XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareSampleRate,
                                          &size, 
                                          &mRecordFormat.mSampleRate), "couldn't get hardware sample rate");

size = sizeof(mRecordFormat.mChannelsPerFrame);
XThrowIfError(AudioSessionGetProperty(kAudioSessionProperty_CurrentHardwareInputNumberChannels, 
                                          &size, 
                                          &mRecordFormat.mChannelsPerFrame), "couldn't get input channel count");

mRecordFormat.mFormatID = inFormatID;
if (inFormatID == kAudioFormatLinearPCM)
{
    // if we want pcm, default to signed 16-bit little-endian
    mRecordFormat.mFormatFlags = kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked;
    mRecordFormat.mBitsPerChannel = 16;
    mRecordFormat.mBytesPerPacket = mRecordFormat.mBytesPerFrame = (mRecordFormat.mBitsPerChannel / 8) * mRecordFormat.mChannelsPerFrame;
    mRecordFormat.mFramesPerPacket = 1;
}
}

And here's part of the code which actually sets it up:

    SetupAudioFormat(kAudioFormatLinearPCM);

    // create the queue
    XThrowIfError(AudioQueueNewInput(
                                  &mRecordFormat,
                                  MyInputBufferHandler,
                                  this /* userData */,
                                  NULL /* run loop */, NULL /* run loop mode */,
                                  0 /* flags */, &mQueue), "AudioQueueNewInput failed");

And finally, here is the callback which handles new audio data:

void AQRecorder::MyInputBufferHandler(void *inUserData,
                                  AudioQueueRef inAQ,
                                  AudioQueueBufferRef inBuffer,
                                  const AudioTimeStamp *inStartTime,
                                  UInt32 inNumPackets,
                                  const AudioStreamPacketDescription *inPacketDesc) {
AQRecorder *aqr = (AQRecorder *)inUserData;
try {
        if (inNumPackets > 0) {
            CAStreamBasicDescription queueFormat = aqr->DataFormat();
            SoundTouch *soundTouch = aqr->getSoundTouch();

            soundTouch->putSamples((const SAMPLETYPE *)inBuffer->mAudioData,
                                   inBuffer->mAudioDataByteSize / 2 / queueFormat.NumberChannels());

            SAMPLETYPE *samples = (SAMPLETYPE *)malloc(sizeof(SAMPLETYPE) * 10000 * queueFormat.NumberChannels());
            UInt32 numSamples;
            while((numSamples = soundTouch->receiveSamples((SAMPLETYPE *)samples, 10000))) {
                // write packets to file
                XThrowIfError(AudioFileWritePackets(aqr->mRecordFile,
                                                    FALSE,
                                                    numSamples * 2 * queueFormat.NumberChannels(),
                                                    NULL,
                                                    aqr->mRecordPacket,
                                                    &numSamples,
                                                    samples),
                              "AudioFileWritePackets failed");
                aqr->mRecordPacket += numSamples;
            }
            free(samples);
        }

        // if we're not stopping, re-enqueue the buffe so that it gets filled again
        if (aqr->IsRunning())
            XThrowIfError(AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, NULL), "AudioQueueEnqueueBuffer failed");
} catch (CAXException e) {
    char buf[256];
    fprintf(stderr, "Error: %s (%s)\n", e.mOperation, e.FormatError(buf));
}
}

You can see that I'm passing the data in inBuffer->mAudioData to SoundTouch. In my callback, what exactly are the bytes representing, i.e. how do I extract samples from mAudioData?

A: 

You have to check that the endianess, the signedness, etc. of what you are getting match what the library expects. Use mFormatFlags of AudioStreamBasicDescription to determine the source format. Then you might have to convert the samples (e.g. newSample = sample + 0x8000)

Aleph
I finally discovered that SoundTouch was expecting the samples to be floats. Recompiling it with the int sample types works now. Thanks!
James Long
A: 

The default endianess of the Audio Queue may be the opposite of what you expect. You may have to swap upper and lower bytes of each 16-bit audio samples after record and before play.

sample_le = (0xff00 & (sample_be << 8)) | (0x00ff & (sample_be >> 8)) ;
hotpaw2
I thought of that too, and I found a flag I can pass in the format options to make it big endian instead of small endian so that I don't have to do it manually. That wasn't the issue, but thanks!
James Long