I'm looking to change the speed of a sound file, but am at a loss as to how to accomplish it. I'm assuming that some type of interpolation has to take place when slowing it down, but am unsure how to accomplish a speed-up. Perhaps an average of several samples? Whether it changes the tempo or the pitch doesn't really matter at the moment. I'd like to learn how to accomplish both, but would like to get at least one or the other working to begin with.

If anyone has any references to the math behind these types of operations, they would be greatly appreciated!

Thanks, Ben

+4  A: 

There are two options to speed up the playback of a sound file:

  • Increase the sample rate
  • Reduce the number of samples per unit of time.

In either of these methods, the increase in playback speed will have a corresponding change in the pitch of the sound.

Increasing the sample rate

Increasing the sample rate will increase the playback speed of the sound. For example, going from a 22 kHz sampling rate to 44 kHz will make the sound play back twice as fast as the original (and one octave higher). In this method, the original sample data is unaltered; only the audio playback settings need to be changed.
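As a minimal sketch of this first option, assuming the sound is an uncompressed WAV file, Python's standard `wave` module can copy the sample data untouched and write a different frame rate into the header. The function name `change_playback_rate` and its `factor` parameter are made up for this illustration.

```python
import io
import wave

def change_playback_rate(wav_bytes, factor):
    """Return WAV data with the same samples but a scaled header sample rate."""
    # Read the original parameters and the raw, unmodified frames.
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)

    # Write the same frames back, changing only the declared frame rate.
    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(int(round(params.framerate * factor)))
        dst.writeframes(frames)
    return out.getvalue()
```

A factor of 2.0 doubles the speed and a factor of 0.5 halves it. Whether a player accepts the resulting rate depends on the player and the audio hardware.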

Reduce the number of samples per unit of time

In this method, the playback sampling rate is kept constant, but the number of samples is reduced: some of the samples are thrown out.

The naive approach to doubling the playback speed is to remove every other sample and play back the result at the original sampling rate.

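However, this approach loses information, and I would expect it to introduce artifacts into the audio: with no low-pass filtering beforehand, any content above half the new effective sample rate folds back (aliases) onto lower frequencies. So it's not the most desirable approach.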
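To make the artifact concrete, here is a small Python sketch of the naive approach applied to a worst-case input. The helper name `drop_samples` is made up for this illustration. An alternating +1/-1 signal is the fastest oscillation a sampled signal can hold, and dropping every other sample turns it into a constant offset.

```python
def drop_samples(samples, factor=2):
    """Keep every `factor`-th sample and discard the rest (no filtering)."""
    return samples[::factor]

# The fastest oscillation a sampled signal can represent: alternating +1 / -1.
alternating = [1.0, -1.0] * 4

halved = drop_samples(alternating)
print(halved)  # [1.0, 1.0, 1.0, 1.0]
```

The oscillation has not been removed; it has been replaced by a different signal that was never in the original, which is what aliasing means here.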

Although I haven't tried it myself, averaging adjacent samples to create each new sample seems like a good approach to start with. Rather than just throwing out audio information, the averaging process "preserves" it to an extent.

As a rough idea of the process, here's a piece of Python code to double the speed of playback:

original_samples = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]

def faster(samples):
    """Halve the number of samples by averaging each adjacent pair."""
    new_samples = []
    # Step two samples at a time; a trailing unpaired sample is dropped.
    for i in range(0, len(samples) - 1, 2):
        new_samples.append(0.5 * (samples[i] + samples[i + 1]))
    return new_samples

faster_samples = faster(original_samples)
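The question also asks about slowing down, where new samples have to be invented between the existing ones. One simple way to do that, sketched here in Python under the assumption that straight-line (linear) interpolation is good enough, is to step through the input at a fractional position. The name `change_speed` and its `speed` parameter are made up for this illustration, and for `speed` greater than 1 this sketch skips samples without filtering, so it has the same aliasing problem as the naive approach.

```python
def change_speed(samples, speed):
    """Resample by linear interpolation.

    speed > 1 gives fewer samples (faster playback);
    speed < 1 gives more samples (slower playback).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not samples:
        return []

    last = len(samples) - 1
    out = []
    k = 0
    while k * speed <= last:
        pos = k * speed          # fractional read position in the input
        i = int(pos)             # sample just before the position
        frac = pos - i           # how far we are towards the next sample
        nxt = samples[min(i + 1, last)]
        out.append((1.0 - frac) * samples[i] + frac * nxt)
        k += 1
    return out
```

For example, `change_speed([0.0, 1.0, 2.0], 0.5)` inserts one interpolated value between each pair of input samples, so playing the result at the original rate takes about twice as long.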

I've also posted an answer to the question "Getting started with programmatic audio", where I went into some detail about basic audio manipulation that can be performed, so that might be of interest as well.

coobird
The only advantage to averaging that I can think of is that you minimize bias. For example, if you had data like this: [1,-1,1,-1,1,-1] which has a zero bias, you can see that averaging would keep the zero bias but throwing out samples would not. Just about ANYTHING you do to a signal introduces some kind of artifact, as you'll see if you pick up a good DSP textbook.
Nosredna
@Nosredna: That's a really good example to illustrate the point about bias. Thank you for pointing it out.
coobird
+3  A: 

There is a good explanation of sample rate conversion on Wikipedia. Basically, you upsample your signal to the least common multiple of the two sample rates, filter out any frequencies that don't fit in the target sample rate (or didn't come from the source), and pick new samples at the target sample rate. There are mathematical tricks (polyphase decomposition) that make the computation take drastically fewer resources, but this should get you started.
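The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the low-pass filter is a Hann-windowed sinc (one common choice, not necessarily the one the Wikipedia article describes), and the names `resample_rational` and `taps_per_side` are made up. It skips the zero-valued samples of the upsampled signal instead of storing them, which is the basic idea behind the polyphase trick.

```python
import math

def resample_rational(samples, src_rate, dst_rate, taps_per_side=16):
    """Convert samples from src_rate to dst_rate (both integers, in Hz)."""
    g = math.gcd(src_rate, dst_rate)
    up = dst_rate // g      # src_rate * up is the least common multiple
    down = src_rate // g    # keep every `down`-th sample of the upsampled signal

    # Cutoff in cycles per upsampled sample: half of the lower of the two rates.
    cutoff = 0.5 / max(up, down)
    half = taps_per_side * max(up, down)

    # Hann-windowed sinc low-pass filter; the gain of `up` makes up for the
    # zeros that upsampling conceptually inserts between input samples.
    kernel = []
    for n in range(-half, half + 1):
        x = 2.0 * cutoff * n
        sinc = 1.0 if n == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.5 * (1.0 + math.cos(math.pi * n / half))
        kernel.append(up * 2.0 * cutoff * sinc * window)

    out = []
    n_out = len(samples) * up // down
    for m in range(n_out):
        center = m * down  # output position, measured in upsampled samples
        # Input sample i sits at upsampled index i * up; everything else is zero.
        i_min = max(0, -(-(center - half) // up))            # ceiling division
        i_max = min(len(samples) - 1, (center + half) // up)
        acc = 0.0
        for i in range(i_min, i_max + 1):
            acc += samples[i] * kernel[center - i * up + half]
        out.append(acc)
    return out
```

To use this for a speed change, convert from the file's rate to a lower rate (for example 44100 to 22050 to double the speed) and then play the result at the original rate. Samples near the start and end are attenuated because the filter runs off the edge of the data.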

Ants Aasma