views:

252

answers:

1

Hi everyone. I'm making some progress on taking a compressed (MP3) sound and saving it as PCM. In addition, I want to split the original file into 2-second chunks within the same process. I seem to be succeeding, but I'm a little confused as to why.

As I read blocks of audio and write the files out, I check whether the chunk I'm about to write would push my file past the 2-second limit. If so, I write just enough to reach 2 seconds, close the file, open a new file, write the remainder into it, and then read more data. Something like this:

framesInTimedSegment += numFrames;
if ((framesInTimedSegment > (2.0 * sampleRate)) && (j < 5)) {
    UInt32 newNumFrames = numFrames;
    numFrames = framesInTimedSegment - (2.0 * sampleRate);
    newNumFrames -= numFrames;
// Question A
    UInt32 segmentOffset = newNumFrames * numChannels * 2;
    error = ExtAudioFileWrite(segmentFile, newNumFrames, &fillBufList);
// Question B
    // handle this error!  We might have an interruption
    if (segmentFile) ExtAudioFileDispose(segmentFile);
    XThrowIfError(ExtAudioFileCreateWithURL(urlArray[++j], kAudioFileCAFType, &dstFormat, NULL, kAudioFileFlags_EraseFile, &segmentFile), "ExtAudioFileCreateWithURL failed! - segmentFile");
    size = sizeof(clientFormat);
    XThrowIfError(ExtAudioFileSetProperty(segmentFile, kExtAudioFileProperty_ClientDataFormat, size, &clientFormat), "couldn't set destination client format");
    fillBufList.mBuffers[0].mData = srcBuffer + segmentOffset;
    fillBufList.mBuffers[0].mDataByteSize = numFrames * fillBufList.mBuffers[0].mNumberChannels * 2;
    framesInTimedSegment = numFrames;
}
error = ExtAudioFileWrite(segmentFile, numFrames, &fillBufList);

Here are my questions (I have tried to label the relevant line):

A: Is there a better way to find the offset into my buffer, so that I don't erroneously hard-code some value in there? For example, is there a blessed way to get the data offset from a frame number?

B: If ExtAudioFileWrite is doing the conversion from compressed to decompressed, then the data I am writing hasn't been decompressed yet (right?), so shouldn't I have to worry about frame counts and byte offsets differently while the data is still compressed? Should I instead convert the file first, either to a PCM file or into memory, and then split that PCM?

Thanks!

-mahboud

ps.

The clientFormat is defined as follows:

        clientFormat = dstFormat;

and dstFormat:

        dstFormat.mFormatID = outputFormat;
        dstFormat.mChannelsPerFrame = srcFormat.NumberChannels();
        dstFormat.mBitsPerChannel = 16;
        dstFormat.mBytesPerPacket = dstFormat.mBytesPerFrame = 2 * dstFormat.mChannelsPerFrame;
        dstFormat.mFramesPerPacket = 1;
        dstFormat.mFormatFlags = kLinearPCMFormatFlagIsPacked | kLinearPCMFormatFlagIsSignedInteger; // little-endian
+2  A: 

It's difficult to answer correctly without seeing a bit more code. But, assuming clientFormat is an interleaved PCM format:

B) ExtAudioFileWrite does not perform the conversion from compressed to decompressed; ExtAudioFileRead does, depending on what client format you have set. Assuming an MP3 source file and a "standard" 16-bit 44.1 kHz PCM client format, calls to ExtAudioFileRead will convert the MP3 bytes to PCM data. This is done under the hood using the AudioFile and AudioConverter APIs.

A) This is a bit hard to answer without seeing how srcBuffer is defined; the byte arithmetic only works out if it is a byte (char) pointer. If you are working with PCM data, what you are doing looks OK. Rather than hard-coding the 2, you could use newNumFrames * clientFormat.mBytesPerFrame: for this 16-bit interleaved format, mBytesPerFrame == mBytesPerPacket == 2 * mChannelsPerFrame, so mBytesPerFrame already covers every channel and you should not multiply by the channel count again. If you were working with non-CBR data you would need to concern yourself with packet descriptions, but that doesn't seem to be the case.

sbooth
Very good answer... The code above contains the only changes I made to ExtAudioFileConvert.cpp from the Apple sample "iPhoneExtAudiofileConvertTest"; perhaps you are familiar with it. Tell me if I have this right: if I were reading MP3 and writing PCM, the conversion happens in ExtAudioFileRead. If I were reading PCM and writing MP3, the conversion happens in ExtAudioFileWrite. Is that correct? I added the client format to the original question.
mahboudz
That's correct. ExtAudioFileRead will convert from the file's native format to the client format, and ExtAudioFileWrite from the client format to the file's output format.
sbooth
I hate to ask another question, but from Apple's sample it seems that interruptions during ExtAudioFileWrite are the ones to worry about, not during ExtAudioFileRead. Is that right? I may have to post this to the CoreAudio mailing list.
mahboudz
I don't have any Core Audio experience on the iPhone, sorry.
sbooth