The Levelator is a program that takes an audio file and generates another one with a more constant volume, correcting recording problems such as a person sounding too loud or being barely audible.

Do you know of any libraries that I could use from .NET on Windows to perform the same task? A command-line program would be good enough too.

A: 

The technique you're looking for is called audio normalization. This third-party code, Mp3SoundCapture, provides a way to do it, but it's a separate app, not a library.

John Feminella
I haven't downloaded that project so I'm not sure about the API - but it looks like a capture program, not one that massages an already recorded file.
Egor
It is, but my point was that you could look at how they did it and adapt it for your needs.
John Feminella
A: 

There are two main ways to approach this problem:

  1. Normalization: this simply involves searching for the loudest part of the audio, then amplifying the whole file so that the loudest part goes to maximum volume. This technique is only useful if the loudest part is 50% volume or less. If you have a single spike somewhere in your input file that hits max volume, then normalization does nothing for you.

  2. Compression / Limiting: this takes a slightly different approach and is used extensively in music recording. The basic idea is that any sound over a certain volume (called the 'threshold') gets made quieter (or, in the case of a limiter, no sound is allowed through over a certain volume). This has the effect of evening out the volume of the whole recording (the quiet bits stay the same, and the loud bits get quieter). Then you are able to amplify the whole signal without distorting it (this is called make-up gain). See this article on dynamic range compression for more info.
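To make the threshold and make-up gain idea concrete, here is a minimal sketch of a compressor in Python. It assumes samples are floats in the range -1.0 to 1.0, and the `threshold`, `ratio` and `makeup_gain` values are only illustrative. A real compressor also smooths its gain changes over time (attack and release), which this sketch leaves out.

```python
def compress(samples, threshold=0.5, ratio=4.0, makeup_gain=1.0):
    """Hard-knee, instantaneous compressor for floats in [-1.0, 1.0].

    threshold, ratio and makeup_gain are illustrative parameters, not
    values taken from any particular product.
    """
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # Only the part above the threshold is reduced, by `ratio`.
            level = threshold + (level - threshold) / ratio
        # Restore the sign, apply make-up gain, and clip as a safety net.
        value = (level if s >= 0 else -level) * makeup_gain
        out.append(max(-1.0, min(1.0, value)))
    return out


quiet_and_loud = [0.1, -0.2, 0.9, -1.0]
print(compress(quiet_and_loud, threshold=0.5, ratio=4.0, makeup_gain=1.6))
```

With these settings the quiet samples are only scaled by the make-up gain, while the loud ones are pulled down first, so the gap between quiet and loud shrinks. A limiter is roughly the same thing with a very high ratio.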

As for implementing this in .NET, NAudio will let you read the samples in an input WAV file, allowing you to create your own normalization effect. I have also demonstrated, in Skype Voice Recorder, how you can implement a compressor in .NET.

The final thing you should be aware of is that these algorithms only work if you have access to the sample values. So if, for example, your file is MP3, you need to first convert to PCM, then apply the normalization / compression, and finally convert back to MP3.
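As a rough illustration of what "access to the sample values" means, the sketch below uses Python's standard `wave` and `array` modules to read 16-bit PCM samples from a WAV file, run each one through a function, and write the result. The name `process_wav` and its `transform` argument are made up for this example, and it only handles 16-bit PCM. Decoding from and re-encoding to MP3 would need an external tool and is not shown.

```python
import array
import sys
import wave


def process_wav(in_path, out_path, transform):
    """Read 16-bit PCM samples, pass each through `transform`, write a new WAV."""
    with wave.open(in_path, "rb") as src:
        params = src.getparams()
        if params.sampwidth != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        samples = array.array("h", src.readframes(params.nframes))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Apply the effect per sample, clipping to the 16-bit range.
    processed = array.array(
        "h",
        (max(-32768, min(32767, int(round(transform(s))))) for s in samples),
    )

    if sys.byteorder == "big":
        processed.byteswap()
    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(processed.tobytes())


# Example (hypothetical file names): halve the volume of a recording.
# process_wav("input.wav", "output.wav", lambda s: s * 0.5)
```

For stereo files the samples are interleaved, which makes no difference to a simple per-sample gain like the one in the example.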

Mark Heath
+1  A: 

A command line program that does this is sox.

The general idea with the algorithm is to find the highest absolute value sample (the audio should be centered around zero, whatever the format of the sampled data).

You divide your maximum possible value by this number (which is guaranteed to be equal to or smaller than the maximum), and then you multiply that by your desired peak level (i.e., do you want it to reach 0.95 of max? The full 1.0?). If the result is not one, it becomes your scale value. Then you iterate through your file and multiply every sample by that number.

For example, with CD quality audio your highest possible absolute value for a sample is 32767 (fudging this to make the example easier: the real range is -32768 to 32767, but treating 32767 as your max makes things much simpler here). So if you scanned through and the highest absolute value you found was 18000, then your amplification factor will be 1.8203888..., and if you want your max volume to be 0.9887997070223 * the max available, that gives you a new scale factor of 1.8. So you loop through the array holding the audio file, and replace the previous value for each sample with the value * 1.8.
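The arithmetic above can be sketched in a few lines of Python. The function names are made up for this example, and it assumes the samples are already available as a list of signed 16-bit integers.

```python
def peak_scale_factor(samples, max_value=32767, target=1.0):
    """Return the gain that brings the loudest sample to target * max_value."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 1.0  # pure silence: nothing to scale
    return (max_value / peak) * target


def normalize(samples, max_value=32767, target=1.0):
    """Multiply every sample by the peak scale factor, clipping as a safety net."""
    factor = peak_scale_factor(samples, max_value, target)
    return [
        max(-max_value - 1, min(max_value, round(s * factor)))
        for s in samples
    ]


# Same numbers as the worked example: the peak is 18000 and the desired
# peak level is 32400 / 32767 of the maximum, so the factor is about 1.8.
samples = [9000, -18000, 4500]
print(peak_scale_factor(samples, target=32400 / 32767))
print(normalize(samples, target=32400 / 32767))
```

The clipping in `normalize` only matters if `target` is above 1.0 or a sample sits at -32768. Otherwise the scaled peak lands at or below the maximum by construction.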

This can be improved by running a click filter first, to eliminate spurious transients, and also by removing any DC offset, which makes sure the waveform is evenly centered around the median value by removing low frequency components that cannot be produced by speakers or heard by the human ear. (This is sometimes confused with de-essing, which is a different process that tames harsh sibilance.) The click filter is a lowpass, and the DC-offset filter is a highpass. Once these filters are run, there will be more room for amplifying the sound without introducing distortion.
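For the highpass step that centers the waveform, one common option is a one-pole "DC blocker". This is only a sketch of that general technique, not what sox or the Levelator actually does, and the coefficient `r` is an illustrative value (closer to 1.0 means a lower cutoff frequency).

```python
def dc_block(samples, r=0.995):
    """One-pole DC-blocking highpass: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        y = x - prev_in + r * prev_out
        out.append(y)
        prev_in, prev_out = x, y
    return out


# A constant offset decays towards zero, while a fast-changing signal
# passes through almost unchanged.
offset_only = dc_block([1000.0] * 5000)
alternating = dc_block([1.0 if n % 2 == 0 else -1.0 for n in range(2000)])
print(abs(offset_only[-1]) < 1e-3, abs(alternating[-1]) > 0.9)
```

Once the offset is gone, the positive and negative peaks sit more evenly around zero, which is what leaves extra headroom for amplification.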

Justin Smith
Thanks for the explanation, Justin; I'll look into Sox. Something Levelator does is take the audio to an absolute volume. If I take two sound files, one with very high volume and one with very low, and I levelate them independently, they both end up sounding at the same level. Is that something Sox can do?
J. Pablo Fernández
Yes. You have to call it twice. First you tell it to analyze, and it returns an amplitude scale that would bring the file to maximum volume. Then you adjust the volume based on that return value (and of course you can adjust that return value if you are mixing with another file and want it a bit quieter or whatever). First you would do "sox <input-file> -n stat -v", and then, based on the output of that, "sox -v <number> <input-file> <output-file>".
Justin Smith
A: 

In fact, the Levelator is neither a compressor nor a normalizer. Yes, it normalizes, but it does much more and has a lot more smarts than what you can do with sox, etc. Think of it as the hand on a fader that knows in advance what will happen and will even know when to leave well enough alone. Check out the algorithm discussion here: http://www.conversationsnetwork.org/levelatorAlgorithm

...doug (Levelator co-creator)

Doug Kaye
kinda like replaygain, but done as a compression step
Spudd86
We would pay money to license Levelator as a library (if we can afford it); it's a pity you guys don't offer that.
J. Pablo Fernández