I am writing a C# function for doing dynamic range compression (an audio effect that basically squashes transient peaks and amplifies everything else to produce an overall louder sound). I have written a function that does this (I think):

public static void Compress(ref short[] input, double thresholdDb, double ratio)
{
    double maxDb = thresholdDb - (thresholdDb / ratio);
    double maxGain = Math.Pow(10, -maxDb / 20.0);

    for (int i = 0; i < input.Length; i += 2)
    {
        // convert sample values to ABS gain and store original signs
        int signL = input[i] < 0 ? -1 : 1;
        double valL = (double)input[i] / 32768.0;
        if (valL < 0.0)
        {
            valL = -valL;
        }
        int signR = input[i + 1] < 0 ? -1 : 1;
        double valR = (double)input[i + 1] / 32768.0;
        if (valR < 0.0)
        {
            valR = -valR;
        }

        // calculate mono value and compress
        double val = (valL + valR) * 0.5;
        double posDb = -Math.Log10(val) * 20.0;
        if (posDb < thresholdDb)
        {
            posDb = thresholdDb - ((thresholdDb - posDb) / ratio);
        }

        // measure L and R sample values relative to mono value
        double multL = valL / val;
        double multR = valR / val;

        // convert compressed db value to gain and amplify
        val = Math.Pow(10, -posDb / 20.0);
        val = val / maxGain;

        // re-calculate L and R gain values relative to compressed/amplified
        // mono value
        valL = val * multL;
        valR = val * multR;

        double lim = 1.5; // determined by experimentation, with the goal
            // being that the lines below should never (or rarely) be hit
        if (valL > lim)
        {
            valL = lim;
        }
        if (valR > lim)
        {
            valR = lim;
        }

        double maxval = 32000.0 / lim; 

        // convert gain values back to sample values
        input[i] = (short)(valL * maxval); 
        input[i] *= (short)signL;
        input[i + 1] = (short)(valR * maxval); 
        input[i + 1] *= (short)signR;
    }
}

and I am calling it with threshold values between 10.0 dB and 30.0 dB and ratios between 1.5 and 4.0. This function definitely produces a louder overall sound, but with an unacceptable level of distortion, even at low threshold values and low ratios.

Can anybody see anything wrong with this function? Am I handling the stereo aspect correctly (the function assumes stereo input)? As I (dimly) understand things, I don't want to compress the two channels separately, so my code is attempting to compress a "virtual" mono sample value and then apply the same degree of compression to the L and R sample values separately. I'm not sure I'm doing it right, however.

I think part of the problem may be the "hard knee" of my function, which kicks in the compression abruptly when the threshold is crossed. I think I may need to use a "soft knee" like this:

[Image: soft-knee compression curve]

Can anybody suggest a modification to my function to produce the soft knee curve?

+1  A: 

I think your basic understanding of how to do compression is wrong (sorry ;)). It's not about "compressing" individual sample values; that will radically change the waveform and produce severe harmonic distortions. You need to assess the input signal volume over many samples (I would have to Google for the correct formula), and use this to apply a much-more-gradually-changing multiplier to the input samples to generate output.
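
A minimal sketch of that idea, assuming interleaved 16-bit stereo as in the question: follow the signal level with a smoothed envelope and apply one slowly changing gain to both channels. The class name `EnvelopeCompressor`, the `sampleRate`, `attackMs` and `releaseMs` parameters, and their default values are assumptions made for illustration; this is one common approach, not a definitive implementation. Note that `thresholdDb` is expressed here as negative dBFS (for example -20.0).

```csharp
using System;

public static class EnvelopeCompressor
{
    // Hypothetical sketch: thresholdDb is in dBFS (negative, e.g. -20.0), ratio > 1.
    // attackMs and releaseMs (assumed defaults) set how fast the level estimate moves.
    public static void Compress(short[] input, int sampleRate, double thresholdDb,
                                double ratio, double attackMs = 10.0, double releaseMs = 200.0)
    {
        // One-pole smoothing coefficients for the level detector.
        double attack = Math.Exp(-1.0 / (sampleRate * attackMs / 1000.0));
        double release = Math.Exp(-1.0 / (sampleRate * releaseMs / 1000.0));
        double envelope = 0.0;

        for (int i = 0; i + 1 < input.Length; i += 2)
        {
            // Link the channels: the louder of L and R drives the detector.
            double peak = Math.Max(Math.Abs(input[i] / 32768.0),
                                   Math.Abs(input[i + 1] / 32768.0));

            // Rise quickly on attacks, fall slowly afterwards.
            double coeff = peak > envelope ? attack : release;
            envelope = coeff * envelope + (1.0 - coeff) * peak;

            // Hard-knee curve applied to the smoothed level, not to the raw sample.
            double levelDb = 20.0 * Math.Log10(Math.Max(envelope, 1e-9));
            double gainDb = levelDb > thresholdDb
                ? (thresholdDb - levelDb) * (1.0 - 1.0 / ratio)
                : 0.0;
            double gain = Math.Pow(10.0, gainDb / 20.0);

            // The same gain goes to both channels, so the stereo balance is kept.
            input[i] = (short)Math.Round(input[i] * gain);
            input[i + 1] = (short)Math.Round(input[i + 1] * gain);
        }
    }
}
```

Because the gain is never above 1.0 in this sketch, the output cannot clip; any make-up gain would be applied afterwards as a separate constant factor.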

The DSP forum at kvraudio.com/forum might point you in the right direction if you have a hard time finding the usual techniques.

Conrad Albrecht
My understanding of range compression is somewhere between 0 and 100%. :) In this specific case, I can't do averaging (easily, at least) because I'm attempting to apply compression to separate chunks of a larger piece of audio and then fit the compressed chunks back together seamlessly. I think you're right, though, that I need to apply the multiplier with more gradual changes, or else compress the larger piece all at once.
MusiGenesis
I've checked out the KVR forums before. The main problem I face in finding code examples for this sort of thing is that nearly all the stuff I find is for real-time effects, whereas I'm trying to process existing WAV files.
MusiGenesis
I just thought of a way I can do averaging with my separate chunks. I'd really appreciate links to any good resources/code for this.
MusiGenesis
Well, it turns out the primary problem in my code (including the above sample) is that I got the negativity of the decibel values wrong, so I was actually *amplifying* the peaks instead of compressing them. I implemented the averaging before identifying the real problem, but it does seem to be evening out the effects - the smaller I make the averaging window, the more erratic the volume changes seem to be.
MusiGenesis
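
Sign mix-ups of this kind are easy to make because levels below full scale are negative in dBFS. The following is a small sketch of a hard-knee gain computation that keeps every level in negative dBFS; the helper names `ToDb`, `FromDb` and `GainReductionDb` are made up for illustration and are not from the code discussed above.

```csharp
using System;

public static class Decibels
{
    // Linear amplitude (0..1] to dBFS: 1.0 -> 0 dB, 0.1 -> -20 dB.
    public static double ToDb(double amplitude) =>
        20.0 * Math.Log10(Math.Max(amplitude, 1e-9));

    // dB back to a linear factor: -20 dB -> 0.1.
    public static double FromDb(double db) => Math.Pow(10.0, db / 20.0);

    // Gain change in dB for a hard-knee compressor; thresholdDb is negative dBFS.
    // The result is <= 0: levels above the threshold are turned down, never up.
    public static double GainReductionDb(double levelDb, double thresholdDb, double ratio)
    {
        if (levelDb <= thresholdDb) return 0.0;
        double compressedDb = thresholdDb + (levelDb - thresholdDb) / ratio;
        return compressedDb - levelDb;
    }
}
```

With a threshold of -20 dB and a ratio of 4, a full-scale level (0 dB) is mapped to -15 dB, so `GainReductionDb(0.0, -20.0, 4.0)` returns -15.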
Now it's working fairly well, although it's still producing a small bit of distortion (which may be clipping). It's not compressing the sharp attack sounds as much as I want, however, which I think is because my averaging only incorporates prior samples. Since I can read ahead in my data source, I'm going to modify my method to incorporate "future" samples in the running average, so that the signal can be attenuated in advance of a sharp attack.
MusiGenesis
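
A rough sketch of the look-ahead idea for offline processing, assuming interleaved 16-bit stereo: for each frame, take the largest absolute sample in that frame and the next few frames, so the detector "sees" a transient before it arrives. The names `LookAhead`, `WindowedPeaks` and `lookAheadFrames` are hypothetical, and this is only one way to include future samples.

```csharp
using System;

public static class LookAhead
{
    // Hypothetical helper: for each stereo frame, returns the largest absolute
    // sample value (0..1) in that frame and the next lookAheadFrames frames.
    public static double[] WindowedPeaks(short[] input, int lookAheadFrames)
    {
        int frames = input.Length / 2;
        var peaks = new double[frames];
        for (int n = 0; n < frames; n++)
        {
            // Clamp the window at the end of the buffer.
            int end = Math.Min(frames - 1, n + lookAheadFrames);
            double peak = 0.0;
            for (int k = n; k <= end; k++)
            {
                double l = Math.Abs(input[2 * k] / 32768.0);
                double r = Math.Abs(input[2 * k + 1] / 32768.0);
                peak = Math.Max(peak, Math.Max(l, r));
            }
            peaks[n] = peak;
        }
        return peaks;
    }
}
```

Using these values as the detector input lets the gain start dropping `lookAheadFrames` frames before a sharp attack. The nested loop costs time proportional to frames times window length, which is acceptable for a sketch but worth optimizing for long windows.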
Interesting point. You can beat the response time of any real-time compressor by looking ahead.
Conrad Albrecht
Not sure what the "averaging" you keep referring to is, but a *mean* of sample values is not the correct formula for signal "energy" or "volume", although it might be a crude approximation in many cases.
Conrad Albrecht
Good point about averaging. I was actually using RMS with half of a Hanning window originally (so that closer samples are weighted more heavily than more distant ones), but this was relatively slow. A simpler running average (no RMS, no windowing) seems to work about as well, although I'm definitely not sure of that yet.
MusiGenesis
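
For illustration, an RMS estimate weighted by the falling half of a Hann window could look roughly like the sketch below. How the original code did it is not shown in this thread, so the class `WeightedRms` and its methods `HalfHann` and `At` are assumptions, and samples are assumed to be already scaled to the range -1..1.

```csharp
using System;

public static class WeightedRms
{
    // Falling half of a Hann window: weight 1.0 for the newest sample (k = 0),
    // decreasing towards 0 for the oldest one.
    public static double[] HalfHann(int length)
    {
        var w = new double[length];
        for (int k = 0; k < length; k++)
            w[k] = 0.5 * (1.0 + Math.Cos(Math.PI * k / length));
        return w;
    }

    // Weighted RMS of the samples ending at index `end` (inclusive).
    // Near the start of the buffer the window is simply truncated.
    public static double At(double[] samples, int end, double[] weights)
    {
        double sum = 0.0, weightSum = 0.0;
        for (int k = 0; k < weights.Length && end - k >= 0; k++)
        {
            double s = samples[end - k];
            sum += weights[k] * s * s;
            weightSum += weights[k];
        }
        return weightSum > 0.0 ? Math.Sqrt(sum / weightSum) : 0.0;
    }
}
```

Evaluating this for every sample costs one pass over the window each time, which is consistent with the windowed RMS being noticeably slower than a plain running average.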
Sounds like your volume assessment was more sophisticated than I realized; I just couldn't tell from the comments.
Conrad Albrecht
+1  A: 

The open source Skype Voice Changer project includes a port to C# of a number of nice compressors written by Scott Stillwell, all with configurable parameters:

The first one looks like it has the capability to do soft-knee, although the parameter to do so is not exposed.
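
Independently of that project, a soft knee can be sketched with the widely used quadratic blend between the uncompressed line and the compressed line. This is not taken from the Skype Voice Changer code; the names `SoftKnee`, `OutputDb` and `kneeWidthDb` are illustrative, levels are in negative dBFS, and `kneeWidthDb` is assumed to be zero or positive.

```csharp
using System;

public static class SoftKnee
{
    // Output level in dB for a given input level in dB.
    // kneeWidthDb is the full width of the knee, centered on thresholdDb.
    public static double OutputDb(double levelDb, double thresholdDb,
                                  double ratio, double kneeWidthDb)
    {
        double over = levelDb - thresholdDb;
        double half = kneeWidthDb / 2.0;

        // Below the knee: signal is left unchanged.
        if (over <= -half) return levelDb;

        // Above the knee: full compression ratio.
        if (over >= half) return thresholdDb + over / ratio;

        // Inside the knee: quadratic transition that meets both lines smoothly.
        // Only reachable when kneeWidthDb > 0, so the division is safe.
        double t = over + half;
        return levelDb + (1.0 / ratio - 1.0) * t * t / (2.0 * kneeWidthDb);
    }
}
```

Setting `kneeWidthDb` to 0 gives the hard-knee behavior; the gain to apply to the samples is the difference between the returned level and `levelDb`, converted from dB to a linear factor.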

Mark Heath