Hi,

I am using the Win32 waveform APIs in a C# app to build a VoIP system. All is going well, but I need some way of compressing the audio data on the fly.

Basically, the audio data comes into a 'record' buffer of 150 bytes, which is then sent over UDP; at the remote end, the 150 bytes are received and put into a 'play' buffer.

So I need some way of compressing the data just before the UDP send and decompressing it just after the UDP receive. General-purpose compression algorithms, including the .NET GZip class, don't work well on audio.

Does anyone know of a library that I can use to do this?

Thanks in advance...

A: 

The component you're looking for is better known as a coder/decoder, or codec, and there are many options when it comes to picking one.

unwind
Would you care to suggest one?
+1  A: 

150 bytes is an unbelievably small buffer for audio data: less than 5 milliseconds at, e.g., 16 kHz 16-bit mono. I'm no expert, but I think that whatever compression scheme you choose, your compression ratio will suffer greatly with such a small buffer. Besides that, there is significant overhead for each packet you send.
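To put rough numbers on that, here is a minimal sketch in Python. It assumes 16-bit mono PCM at 16 kHz and counts only the minimum IPv4 header (20 bytes) plus the UDP header (8 bytes) as overhead; the function names are made up for illustration.

```python
SAMPLE_RATE_HZ = 16_000        # assumed: 16 kHz sampling
BYTES_PER_SAMPLE = 2           # assumed: 16-bit mono PCM
IP_UDP_HEADER_BYTES = 20 + 8   # minimum IPv4 header + UDP header


def buffer_duration_ms(buffer_bytes: int) -> float:
    """Milliseconds of audio held in an uncompressed buffer."""
    samples = buffer_bytes / BYTES_PER_SAMPLE
    return 1000.0 * samples / SAMPLE_RATE_HZ


def header_overhead(payload_bytes: int) -> float:
    """Fraction of each packet that is IP/UDP header rather than audio."""
    return IP_UDP_HEADER_BYTES / (payload_bytes + IP_UDP_HEADER_BYTES)


if __name__ == "__main__":
    print(f"{buffer_duration_ms(150):.2f} ms of audio per 150-byte buffer")
    print(f"{header_overhead(150):.1%} of each packet is header")
```

Under those assumptions a 150-byte buffer holds about 4.69 ms of audio, and roughly 15.7% of every packet on the wire is header.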

That said, if you are sending speech data, take a look at Speex for lossy compression. (I have found it very effective at compressing speech, but the sound quality is terrible for music.)

Qwertie
At 16 kHz, what buffer size would you suggest? It is set to 150 because that's what Skype does (watched with a UDP sniffer), although I would imagine Skype's buffers are larger than 150 bytes and only end up being 150 after compression.
+1 for Speex. It's what Flash is now using.
spender
I would suggest at least 20-30 milliseconds of audio, or up to 1 KB, before compression (if your compression is terrific you might be able to get to 150 bytes after compression, but I'm no expert). Larger blocks lead directly to higher latency, but 20 ms of extra latency isn't a big deal.
Qwertie
In summary, it's a tradeoff between compression and block size (= latency). You can have good compression or small blocks but it's hard to get both at once.
Qwertie
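The block sizes discussed in the comments above can be checked with a short sketch. It assumes uncompressed 16-bit mono PCM at 16 kHz; `block_bytes` is a hypothetical helper, not part of any library.

```python
def block_bytes(duration_ms: float,
                sample_rate_hz: int = 16_000,   # assumed: 16 kHz
                bytes_per_sample: int = 2) -> int:  # assumed: 16-bit mono
    """Uncompressed size in bytes of a PCM block of the given duration."""
    samples = round(sample_rate_hz * duration_ms / 1000)
    return samples * bytes_per_sample


if __name__ == "__main__":
    for ms in (5, 20, 30):
        print(f"{ms} ms -> {block_bytes(ms)} bytes")
```

With these assumptions, 20 ms is 640 bytes and 30 ms is 960 bytes, which is where the "up to 1 KB" figure comes from.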
+1  A: 

I would think you'd want to batch up those 150-byte chunks to get better compression.
Although, even at small buffer sizes like that, you can still get some compression.

If the built-in GZipStream isn't working you could try the GZipStream that is included in DotNetZip. There is also a ZlibCodec class available in DotNetZip that implements the Codec pattern - this may facilitate compressing in 150-byte blocks.
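As a rough illustration of why batching helps, the sketch below compares compressing twenty 150-byte blocks one at a time against compressing them as a single batch. It uses Python's standard `zlib` module rather than DotNetZip, and the input is a synthetic 400 Hz tone (an assumption made so the example is self-contained); real speech is far noisier and will compress much less well under a general-purpose algorithm.

```python
import math
import zlib

BLOCK_BYTES = 150   # block size from the question
NUM_BLOCKS = 20     # arbitrary batch size for the comparison


def synthetic_pcm(num_bytes: int) -> bytes:
    """16-bit little-endian mono samples of a 400 Hz tone at 16 kHz.

    The tone repeats exactly every 40 samples (80 bytes).
    """
    out = bytearray()
    for n in range(num_bytes // 2):
        value = int(8000 * math.sin(2 * math.pi * (n % 40) / 40))
        out += value.to_bytes(2, "little", signed=True)
    return bytes(out)


pcm = synthetic_pcm(BLOCK_BYTES * NUM_BLOCKS)
blocks = [pcm[i:i + BLOCK_BYTES] for i in range(0, len(pcm), BLOCK_BYTES)]

# Each block is compressed independently, so each pays the zlib
# header/checksum cost and starts with an empty history window.
per_block_total = sum(len(zlib.compress(b)) for b in blocks)

# One compression pass over the whole batch can reuse earlier data.
batched_total = len(zlib.compress(pcm))

if __name__ == "__main__":
    print("raw bytes:      ", len(pcm))
    print("per-block total:", per_block_total)
    print("batched total:  ", batched_total)
```

The batched total comes out smaller because the compressor carries its history across block boundaries, but the price is that nothing can be sent until the whole batch has been recorded.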

Cheeso
A: 

As suggested above, I'd look into Speex. It's well supported, and is now the de facto standard for Flash Player.

I assume from the size of your buffers that latency is an issue (the bigger the buffer, the bigger the latency), so don't go for a codec with a large decompressed frame size, because it introduces high latency. This more or less rules out MP3: for voice at a 5 kHz output sample rate (it wouldn't serve much purpose going higher), the minimum decompressed frame size is 576 samples, or roughly 115 ms of data that must be encoded before sending. This means a two-way latency of over 200 ms before you've even considered the network part of the problem.
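The frame-size arithmetic can be sketched as follows, taking the 576-sample frame and 5 kHz rate from the paragraph above as given; `frame_latency_ms` is a hypothetical helper name.

```python
def frame_latency_ms(frame_samples: int, sample_rate_hz: int) -> float:
    """Time needed to record one full frame before it can be encoded."""
    return 1000.0 * frame_samples / sample_rate_hz


one_way = frame_latency_ms(576, 5_000)

if __name__ == "__main__":
    print(f"one-way buffering delay: {one_way:.1f} ms")
    print(f"both directions:         {2 * one_way:.1f} ms")
```

This counts only the time spent filling one frame in each direction; encoding time, network delay and any jitter buffer come on top.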

spender