views:

3165

answers:

5

I'm tasked with building a .NET client app to detect silence in WAV files.

Is this possible with the built-in Windows APIs? Alternatively, are there any good libraries out there to help with this?

+4  A: 

http://www.codeproject.com/KB/cs/WAVE_Processor_In_C_.aspx

This has all the code necessary to strip silence, and mix wave files.

Enjoy.

FlySwat
+1  A: 

I don't think you'll find any built-in APIs for detecting silence. But you can always use good ol' math/discrete signal processing to measure loudness. Here's a small example: http://msdn.microsoft.com/en-us/magazine/cc163341.aspx

chitza
+4  A: 

Audio analysis is a difficult thing requiring a lot of complex math (think Fourier transforms). The question you have to ask is "what is silence?" If the audio you are trying to edit was captured from an analog source, chances are that there isn't any true silence; there will only be areas of soft noise (line hum, ambient background noise, etc.).

All that said, an algorithm that should work is to pick a maximum volume (amplitude) threshold and a minimum duration (say, below 10 dBA for more than 2 seconds) and then simply do a volume analysis of the waveform, looking for areas that meet these criteria (perhaps with some filtering to ignore millisecond spikes). I've never written this in C#, but this CodeProject article looks interesting; it describes C# code to draw a waveform, which is the same kind of code that could be used to do other amplitude analysis.
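A rough sketch of that threshold-plus-duration scan, in Python only to keep it short (the loop ports directly to C#). Everything named here is an assumption for illustration: the function names, the raw 16-bit amplitude threshold of 500 standing in for a calibrated dB level, the 2-second minimum, and the 5 ms spike tolerance.

```python
import array
import sys
import wave


def read_mono_16bit(path):
    """Read a 16-bit PCM WAV file; returns (samples, sample_rate).

    Only the first channel is kept. Other sample widths are rejected.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        data = array.array("h")
        data.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()  # WAV sample data is little-endian
    return list(data[::channels]), rate


def find_silence(samples, rate, threshold=500, min_seconds=2.0, max_spike=0.005):
    """Return (start_sec, end_sec) spans where |sample| stays below threshold.

    Loud bursts no longer than max_spike seconds do not break a span.
    """
    min_len = int(min_seconds * rate)
    spike_len = int(max_spike * rate)
    spans = []
    start = None       # index where the current quiet run began
    last_quiet = None  # index of the most recent quiet sample
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if start is None:
                start = i
            last_quiet = i
        elif start is not None and i - last_quiet > spike_len:
            # Loud for longer than the spike tolerance: close the run.
            if last_quiet + 1 - start >= min_len:
                spans.append((start / rate, (last_quiet + 1) / rate))
            start = None
    # Close a run that extends to the end of the audio.
    if start is not None and last_quiet + 1 - start >= min_len:
        spans.append((start / rate, (last_quiet + 1) / rate))
    return spans
```

Typical use would be `samples, rate = read_mono_16bit("input.wav")` followed by `find_silence(samples, rate)`. For noisy analog recordings the threshold will need tuning by ear or by inspecting the quietest passages.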

Simon Gillbee
+3  A: 

If you want to efficiently calculate the average power over a sliding window of N samples: square each new sample and add it to a running total, then subtract the squared value of the sample from N samples earlier. Dividing the total by N gives the average power; repeat for each new sample. This is the simplest form of a CIC filter. Parseval's theorem tells us that this power calculation applies equally in the time and frequency domains.
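The running-total trick can be sketched as follows (Python for brevity; the function name `sliding_power` is made up for this example). Each step costs one add and one subtract regardless of window size. While the window is still filling, this version divides by the number of samples seen so far, which is a choice of this sketch rather than part of the method.

```python
from collections import deque


def sliding_power(samples, window):
    """Yield the mean power over the most recent `window` samples."""
    total = 0
    recent = deque()  # squared values currently inside the window
    for s in samples:
        sq = s * s
        total += sq
        recent.append(sq)
        if len(recent) > window:
            # Drop the squared value from `window` samples ago.
            total -= recent.popleft()
        yield total / len(recent)
```

With integer PCM samples the running total stays exact. With floating-point samples, rounding error can accumulate over long files, so recomputing the total from scratch every so often may be worthwhile.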

You may also want to add hysteresis to the system, to avoid switching on and off rapidly when the power level is dancing around the threshold.
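One way to get hysteresis is to use two thresholds instead of one: the detector flips to "sound" only when the level rises above the higher one, and back to "silence" only when it falls below the lower one. A minimal Python sketch, where the function name and the two threshold parameters are assumptions for illustration:

```python
def gate_with_hysteresis(levels, open_above, close_below):
    """Return one bool per level: True while the gate considers sound present.

    The gate opens when a level exceeds open_above and closes only when a
    level drops below close_below, so levels in between keep the prior state.
    """
    if close_below > open_above:
        raise ValueError("close_below must not exceed open_above")
    is_open = False
    states = []
    for level in levels:
        if is_open and level < close_below:
            is_open = False
        elif not is_open and level > open_above:
            is_open = True
        states.append(is_open)
    return states
```

The wider the gap between the two thresholds, the less chatter you get, at the cost of reacting later to genuine transitions.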

Mark Borgerding
A: 

Use SoX. It can remove leading and trailing silence, but you'll have to call it as an external executable from your app.
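As a sketch of what shelling out might look like (shown in Python; from .NET you would launch the same command line as a child process): SoX's `silence` effect trims from the start of the file, and wrapping a second pass in `reverse` handles the end. The helper names, the 0.1 second duration, and the 1% threshold are assumptions to tune, and `sox` is assumed to be on the PATH.

```python
import subprocess


def sox_trim_args(src, dst, min_duration="0.1", threshold="1%"):
    """Build a SoX command line that trims leading and trailing silence."""
    # "silence 1 <duration> <threshold>" removes audio from the start until
    # the signal stays above the threshold for the given duration.
    trim = ["silence", "1", min_duration, threshold]
    # Reverse, trim again, and reverse back to strip the trailing silence.
    return ["sox", src, dst, *trim, "reverse", *trim, "reverse"]


def trim_silence(src, dst):
    """Run SoX; raises CalledProcessError if it exits with an error."""
    subprocess.run(sox_trim_args(src, dst), check=True)
```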

Manu