Hello, hope you can help. I am recording audio from a microphone and streaming it live across a network. The samples are 11025 Hz, 8-bit, mono. Although there is a small delay (1 second), it works great. What I need help with now is implementing noise reduction and compression, to make the audio less noisy and use less bandwidth. The audio samples are stored in a C# byte[] array, which I am sending/receiving using Socket.

Could anyone suggest how to implement compression and noise reduction in C#? I do not mind using a third-party library as long as it is free (LGPL-licensed, etc.) and can be used from C#. However, I would prefer actual working source code examples. Thanks in advance for any suggestion you have.

UPDATE:

I changed the bit size from 8-bit audio to 16-bit audio and the noise problem is fixed. Apparently 8-bit audio from the mic had too low a signal-to-noise ratio. Voice sounds great at 11 kHz, 16-bit, mono.

The requirements of this project have changed since I posted this, however. We are now trying to add video as well. I have a callback set up that receives live images every 100 ms from a webcam. I need to encode the audio and video, mux them, and transmit them over my socket to the server; the server re-transmits the stream to the other client, which demuxes it, decodes the audio and video, displays the video in a picture box, and outputs the audio to the speakers.

I am looking at ffmpeg to help out with the (de|en)coding/[de]muxing, and I am also looking at SharpFFmpeg as a C# interop library for ffmpeg.

I cannot find any good examples of doing this. I have scoured the Internet all week, with no real luck. Any help you can provide is much appreciated!

Here's some code, including my call back function for the mic recording:

        private const int AUDIO_FREQ = 11025;
        private const int CHANNELS = 1;
        private const int BITS = 16;
        private const int BYTES_PER_SEC = AUDIO_FREQ * CHANNELS * (BITS / 8);
        private const int BLOCKS_PER_SEC = 40;
        private const int BUFFER_SECS = 1;
        private const int BUF_SIZE = ((int)(BYTES_PER_SEC / BLOCKS_PER_SEC * BUFFER_SECS / 2)) * 2; // bytes per block, rounded down to an even number so 16-bit samples stay whole

        private WaveLib.WaveOutPlayer m_Player;
        private WaveLib.WaveInRecorder m_Recorder;
        private WaveLib.FifoStream m_Fifo;

        WebCam MyWebCam;

        public void OnPickupHeadset()
        {
            stopRingTone();
            m_Fifo = new WaveLib.FifoStream();

            WaveLib.WaveFormat fmt = new WaveLib.WaveFormat(AUDIO_FREQ, BITS, CHANNELS);
            m_Player = new WaveLib.WaveOutPlayer(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
                            new WaveLib.BufferFillEventHandler(PlayerCB));
            m_Recorder = new WaveLib.WaveInRecorder(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
                            new WaveLib.BufferDoneEventHandler(RecorderCB));

            MyWebCam = null;
            try
            {
                MyWebCam = new WebCam();                
                MyWebCam.InitializeWebCam(ref pbMyPhoto, pbPhoto.Width, pbPhoto.Height);
                MyWebCam.Start();
            }
            catch { }

        }

        private byte[] m_PlayBuffer;
        private void PlayerCB(IntPtr data, int size)
        {
            try
            {
                if (m_PlayBuffer == null || m_PlayBuffer.Length != size)
                    m_PlayBuffer = new byte[size];

                if (m_Fifo.Length >= size)
                {
                    m_Fifo.Read(m_PlayBuffer, 0, size);
                }
                else
                {
                    // Read what we can 
                    int fifoLength = (int)m_Fifo.Length;
                    m_Fifo.Read(m_PlayBuffer, 0, fifoLength);

                    // Zero out rest of buffer
                    for (int i = fifoLength; i < m_PlayBuffer.Length; i++)
                        m_PlayBuffer[i] = 0;                        
                }

                // Return the play buffer
                Marshal.Copy(m_PlayBuffer, 0, data, size);
            }
            catch { }
        }


        private byte[] m_RecBuffer;
        private void RecorderCB(IntPtr data, int size)
        {
            try
            {
                if (m_RecBuffer == null || m_RecBuffer.Length != size)
                    m_RecBuffer = new byte[size];
                Marshal.Copy(data, m_RecBuffer, 0, size);

                // HERE'S WHERE I WOULD ENCODE THE AUDIO IF I KNEW HOW

                // Send data to server
                if (theForm.CallClient != null)
                {
                    // SendAsync completes asynchronously, so hand it its own copy;
                    // m_RecBuffer is overwritten by the next recorder callback
                    byte[] sendBuffer = (byte[])m_RecBuffer.Clone();
                    SocketAsyncEventArgs args = new SocketAsyncEventArgs();
                    args.SetBuffer(sendBuffer, 0, sendBuffer.Length);
                    theForm.CallClient.SendAsync(args);
                }
            }
            catch { }
        }

        //Called from network stack when data received from server (other client)
        public void PlayBuffer(byte[] buffer, int length)
        {
            try
            {
                //HERE'S WHERE I WOULD DECODE THE AUDIO IF I KNEW HOW

                m_Fifo.Write(buffer, 0, length); 
            }
            catch { }
        }

So where should I go from here?

+1  A: 

Your goals here are kind of mutually exclusive. The reason your 11025 Hz/8-bit/mono audio sounds noisy (with a tremendous amount of "hiss") is its low sample rate and bit resolution (44100 Hz/16-bit/stereo is the standard for CD-quality audio). Each bit of resolution buys you roughly 6 dB of dynamic range, so 8-bit audio tops out around 48 dB while 16-bit gives about 96 dB.

If you continue recording and streaming at that rate, you are going to have noisy audio, period. The only way to eliminate (or really just attenuate) this noise would be to up-sample the audio to 44100 Hz/16-bit and then run a noise reduction algorithm of some sort over it. This up-sampling would have to be performed by the client application, since doing it on the server before streaming means you'd then be streaming audio 8X larger than the original (and doing it on the server would also be utterly pointless, since you'd be better off just recording in the denser format in the first place).
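
As an aside, the bit-depth half of that conversion is easy to do by hand. Here is a minimal sketch (mine, not from this answer) that widens 8-bit unsigned PCM, where silence sits at 128, into 16-bit signed PCM, where silence is 0:

    // Widen 8-bit unsigned PCM (silence = 128) to 16-bit signed PCM (silence = 0).
    // Sketch only: mono input assumed; the sample rate is left unchanged.
    public static byte[] PcmU8ToS16(byte[] input)
    {
        byte[] output = new byte[input.Length * 2];
        for (int i = 0; i < input.Length; i++)
        {
            // Re-center around zero, then scale into the 16-bit range
            short sample = (short)((input[i] - 128) << 8);
            output[i * 2] = (byte)(sample & 0xFF);            // low byte (little-endian)
            output[i * 2 + 1] = (byte)((sample >> 8) & 0xFF); // high byte
        }
        return output;
    }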

What you want to do is to record your original audio in a CD-quality format and then compress it to a standard format like MP3 or Ogg Vorbis. See this earlier question:

http://stackoverflow.com/questions/203254/whats-the-best-audio-compression-library-for-net

Update: I haven't used this, but:

http://www.ohloh.net/p/OggVorbisDecoder

I think you need an encoder, but I couldn't find one for Ogg Vorbis. You could also try encoding to the WMV format:

http://www.discussweb.com/c-programming/1728-encoding-wmv-file-c-net.html

Update 2: Sorry, my knowledge of streaming is pretty limited. If I were doing something like what you're doing, I would first create an (uncompressed) AVI file from the audio and the still images (using avifil32.dll methods via P/Invoke), then compress it to MPEG (or whatever format is standard; YouTube has a page where they talk about their preferred formats, and it's probably good to use one of those).

I'm not sure if this will do what you need, but this link:

http://csharpmagics.blogspot.com/

combined with this free player:

http://www.videolan.org/

might work.

MusiGenesis
Thanks for your answer. It makes sense to sample at a higher quality and then compress. I downloaded the source code for libogg and libvorbis and compiled them, so I have the DLLs. But I don't know how to use them from my C# app. Could you please point me to a usage example with [DllImport] from C# to encode/decode my live audio stream buffer?
Rodney Burton
I couldn't find a C# Ogg encoder either. If I go the Ogg route, I will need a solution in C# that can encode AND decode Ogg Vorbis and Theora, because I'm now doing audio + video. Tough order, eh?
Rodney Burton
If you're doing audio *and* video, I'd say don't worry about the audio as a separate thing. Use something that encodes/decodes both audio and video (which is pretty much everything, including MPEG, WMV etc.).
MusiGenesis
Some questions I need help with: What audio codec would you use? What video codec? What file format? What 3rd party libraries would you use? What C# wrappers to those libraries? What functions within those libraries would you call to do live streaming?
Rodney Burton
I got the audio compression working with G.729, and I am also trying out the Speex codec (because it is patent-free). I still don't have a codec picked out for video, nor a file format. I've even thought about sending the audio and video on separate ports and not worrying about a wrapper, but that could lead to a sync problem. The stream will be live conversations (webcam), so maybe it wouldn't be a problem; I am not trying to stream stored video files. I'll post more information when I have it. Thanks for your help so far.
Rodney Burton
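
For what it's worth, a hand-rolled mux over a single socket can be as simple as tagging each packet with a stream ID and a capture timestamp so the receiver can separate audio from video and re-align them. A minimal sketch, with a packet layout invented for this example:

    using System;
    using System.IO;

    // Minimal single-socket "mux" sketch: each payload is tagged with a
    // stream ID and a capture timestamp so the receiver can tell audio from
    // video and keep them in sync. The header layout (1-byte stream ID,
    // 8-byte timestamp in ms, 4-byte payload length) is invented for this example.
    public static class PacketFramer
    {
        public const byte AudioStream = 0;
        public const byte VideoStream = 1;

        public static byte[] Frame(byte streamId, long timestampMs, byte[] payload)
        {
            using (MemoryStream ms = new MemoryStream())
            using (BinaryWriter w = new BinaryWriter(ms))
            {
                w.Write(streamId);
                w.Write(timestampMs);
                w.Write(payload.Length);
                w.Write(payload);
                return ms.ToArray();
            }
        }

        // Reads one packet back out of a stream (e.g. a NetworkStream)
        public static byte[] ReadPacket(Stream s, out byte streamId, out long timestampMs)
        {
            BinaryReader r = new BinaryReader(s);
            streamId = r.ReadByte();
            timestampMs = r.ReadInt64();
            int length = r.ReadInt32();
            return r.ReadBytes(length);
        }
    }
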
Just to close this issue out, here's what I ended up doing. We said forget video for now; we'll add that later (two press releases are better than one anyway!). We used NAudio to capture the audio because we found it was more stable than the waveIn/waveOut code we were using, which had problems releasing unmanaged buffers on Vista and crashed intermittently. NAudio has not crashed! As far as the original issue goes, changing the bit size from 8-bit to 16-bit fixed the background noise. We are still looking at implementing the Speex codec (because it's free, with no patent restrictions). Thanks everyone for your help!
Rodney Burton
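
Since NAudio ended up being the capture path, a minimal capture loop with NAudio's WaveInEvent looks something like this (an illustrative sketch, not Rodney's code); the DataAvailable handler is where the encode-and-send step from RecorderCB above would go:

    using System;
    using NAudio.Wave;

    // Minimal NAudio capture sketch: 11025 Hz, 16-bit, mono, matching the
    // format discussed above. Assumes the NAudio library is referenced.
    class MicCapture
    {
        static void Main()
        {
            WaveInEvent waveIn = new WaveInEvent();
            waveIn.WaveFormat = new WaveFormat(11025, 16, 1);

            waveIn.DataAvailable += (sender, e) =>
            {
                // e.Buffer holds e.BytesRecorded bytes of raw PCM; this is
                // where you would encode (e.g. Speex) and send on the socket
                Console.WriteLine("Captured {0} bytes", e.BytesRecorded);
            };

            waveIn.StartRecording();
            Console.WriteLine("Recording; press Enter to stop.");
            Console.ReadLine();
            waveIn.StopRecording();
        }
    }
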
A: 

If you only want to compress the data to limit bandwidth usage, you can try using a GZipStream.
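
A minimal round-trip sketch of that suggestion; note that a general-purpose compressor like gzip usually achieves only a modest ratio on raw PCM, which is why speech codecs such as Speex (mentioned above) compress far better:

    using System.IO;
    using System.IO.Compression;

    public static class GZipHelper
    {
        // Compress a PCM buffer before sending
        public static byte[] Compress(byte[] data)
        {
            using (MemoryStream ms = new MemoryStream())
            {
                using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress))
                {
                    gz.Write(data, 0, data.Length);
                } // the GZipStream must be closed to flush its final block
                return ms.ToArray();
            }
        }

        // Decompress a received buffer back to raw PCM
        public static byte[] Decompress(byte[] data)
        {
            using (MemoryStream input = new MemoryStream(data))
            using (GZipStream gz = new GZipStream(input, CompressionMode.Decompress))
            using (MemoryStream output = new MemoryStream())
            {
                gz.CopyTo(output);
                return output.ToArray();
            }
        }
    }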

Bloodsplatter
Rodney Burton
I fear that Android is somewhat ill-equipped for multimedia :)
Bloodsplatter
A: 

If we use GZipStream for compression, how can we know the size of the byte array on the receiving side?

jMgmail.com
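
The usual answer to that (a sketch of mine, not from the thread) is to length-prefix each compressed chunk, so the receiving side knows exactly how many bytes to read before handing them to the decompressor:

    using System;
    using System.IO;

    // Length-prefix framing: write a 4-byte length before each compressed
    // chunk, then read exactly that many bytes on the other side.
    public static class LengthPrefix
    {
        public static void WriteChunk(Stream s, byte[] chunk)
        {
            byte[] lengthBytes = BitConverter.GetBytes(chunk.Length);
            s.Write(lengthBytes, 0, 4);
            s.Write(chunk, 0, chunk.Length);
        }

        public static byte[] ReadChunk(Stream s)
        {
            byte[] lengthBytes = ReadExactly(s, 4);
            int length = BitConverter.ToInt32(lengthBytes, 0);
            return ReadExactly(s, length);
        }

        // Loop until the requested number of bytes arrives; a single
        // Stream.Read may return fewer bytes than asked for
        private static byte[] ReadExactly(Stream s, int count)
        {
            byte[] buffer = new byte[count];
            int offset = 0;
            while (offset < count)
            {
                int read = s.Read(buffer, offset, count - offset);
                if (read == 0)
                    throw new EndOfStreamException();
                offset += read;
            }
            return buffer;
        }
    }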