I'm implementing a VoIP application in pure Java. Users who don't wear headsets (mostly on laptops with built-in microphones) get an echo.

What currently happens

The nuts and bolts of the VoIP application are just the plain datalines of Java's media framework. Essentially, I'd like to perform some digital signal processing on the audio data before I write it to the speaker for output.

  public synchronized void addAudioData(byte[] ayAudioData)
  {
    m_oBuffer.enqueue(ayAudioData);
    this.notify();
  }

As you can see, the audio data arrives and is enqueued in a buffer. This caters for dodgy connections and allows for different packet sizes. It also means I have access to as much audio data as I need for any fancy DSP operations before I play the audio to the speaker line.
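For readers who want to experiment with the buffering idea, here is a minimal sketch of the producer and consumer sides. It is not the code from my application: the type of m_oBuffer is not shown above, so AudioPacketBuffer, nextPacket and size are hypothetical names, and java.util.concurrent.LinkedBlockingQueue stands in for the real queue.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the class that owns m_oBuffer in the question.
final class AudioPacketBuffer {
    // Unbounded FIFO queue; it handles the locking that synchronized/notify did by hand.
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

    // Producer side: called by the network thread for each received packet.
    void addAudioData(byte[] audioData) {
        queue.add(audioData);
    }

    // Consumer side: waits up to timeoutMillis for the oldest packet.
    // Returns null if nothing arrived in time or the thread was interrupted.
    byte[] nextPacket(long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    // Number of packets currently waiting to be played.
    int size() {
        return queue.size();
    }
}
```

The playback thread would call nextPacket in a loop, run whatever DSP it needs on the returned bytes, and then write them to the speaker line.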

I've managed to build one echo canceller that works, but it requires a lot of interactive user input, and I'd like an automatic one.

Manual echo canceller

  public static byte[] removeEcho(int iDelaySamples, float fDecay, byte[] aySamples)
  {
    // iDelaySamples and fDecay are the values the user has to tune by hand
    m_awDelayBuffer = new short[iDelaySamples];
    m_aySamples = new byte[aySamples.length];
    m_fDecay = fDecay;
    System.out.println("Removing echo");
    m_iDelayIndex = 0;

    System.out.println("Sample length:\t" + aySamples.length);
    // two bytes per 16-bit sample; stop before a trailing odd byte
    for (int i = 0; i + 1 < aySamples.length; i += 2)
    {
      // read the current sample
      short wOldSample = getSample(aySamples, i);

      // remove the echo, clamping to the 16-bit range so overflow cannot wrap around
      float fCancelled = wOldSample - fDecay * m_awDelayBuffer[m_iDelayIndex];
      short wNewSample = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, fCancelled));
      setSample(m_aySamples, i, wNewSample);

      // update the circular delay buffer with the cleaned sample
      m_awDelayBuffer[m_iDelayIndex] = wNewSample;
      m_iDelayIndex++;

      if (m_iDelayIndex == m_awDelayBuffer.length)
      {
        m_iDelayIndex = 0;
      }
    }

    return m_aySamples;
  }
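The getSample and setSample helpers are not shown above. A minimal sketch of what they might look like follows, assuming 16-bit signed little-endian PCM (the two-byte stride of the loop implies 16-bit samples, but the byte order is my assumption here; swap the two bytes for big-endian audio). The class name PcmSamples is hypothetical.

```java
// Hypothetical helper class; assumes 16-bit signed little-endian PCM.
final class PcmSamples {
    private PcmSamples() {}

    // Reads the 16-bit sample whose low byte is at buffer[position].
    static short getSample(byte[] buffer, int position) {
        int low = buffer[position] & 0xff;       // mask to undo sign extension
        int high = buffer[position + 1] & 0xff;
        return (short) ((high << 8) | low);
    }

    // Writes a 16-bit sample, low byte first, starting at buffer[position].
    static void setSample(byte[] buffer, int position, short sample) {
        buffer[position] = (byte) (sample & 0xff);
        buffer[position + 1] = (byte) ((sample >> 8) & 0xff);
    }
}
```

The masking with 0xff matters: Java bytes are signed, so without it a low byte of 0x80 or above would corrupt the high byte when the two are combined.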

Adaptive filters

I've read that adaptive filters are the way to go, specifically a Least Mean Squares (LMS) filter. However, I'm stuck. Most sample code for these filters is in C or C++ and doesn't translate well into Java.

Does anyone have advice on how to implement them in Java? Any other ideas would also be greatly appreciated. Thanks in advance.

+1  A: 

This is a very complex area, and to get a usable AEC solution working you'll need to do quite a bit of R&D. All the good AECs are proprietary, and there's a lot more to echo cancellation than just implementing an adaptive filter such as LMS. I suggest you develop your echo cancellation algorithm initially in MATLAB (or Octave). When you have something that appears to work reasonably well with "real world" telecoms, you can implement the algorithm in C and test and evaluate it in real time. Once this is working, you can use JNI to call the C implementation from Java.

Paul R
Thanks for the reply. I've been trying to avoid using JNI, but I'm getting desperate enough to try anything.
Garg Unzola
You may well find that you need to do platform-specific stuff in your AEC, e.g. low level OS/audio API calls, so it might as well be C + JNI.
Paul R
A: 

In case anyone is interested, I managed to build a fair, working echo canceller by basically converting the above-mentioned Acoustic Echo Cancellation method, which uses a Normalised Least Mean Squares (NLMS) algorithm and a few filters, from C into Java. The JNI route is probably still a better way to go, but I like sticking to pure Java if at all possible. By seeing how their filters work and reading up a great deal on filters on DSP Tutor, I managed to gain some control over how much noise gets removed, how to remove high frequencies, and so on.

Some tips:

  1. Keep in mind what you remove from where. I had to switch this around a few times.
  2. The most important variable of this method is the convergence rate. This is the variable called Stepsize in the above link's code.
  3. I took the individual components one at a time, figured out what they did, then built and tested them separately. For example, I took the Double Talk Detector and tested it to ensure that it worked. Then I took the filters one by one and tested them on audio files, and finally I took the normalised least mean squares part and tested it before putting it all together.
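To make tips 1 and 2 concrete, here is a minimal sketch of a textbook NLMS update in Java. It is not the converted code from the link, and it leaves out the double talk detector and the extra filters entirely. The names NlmsEchoCanceller, process, taps and stepSize are hypothetical; stepSize plays the role of the Stepsize variable mentioned in tip 2.

```java
// Hypothetical textbook NLMS sketch; not the linked implementation.
final class NlmsEchoCanceller {
    private static final float EPSILON = 1e-6f; // avoids division by zero on silence

    private final float[] weights;  // adaptive estimate of the speaker-to-mic echo path
    private final float[] history;  // most recent far-end samples, newest at index 0
    private final float stepSize;   // convergence rate, typically between 0 and 1

    NlmsEchoCanceller(int taps, float stepSize) {
        if (taps < 1) {
            throw new IllegalArgumentException("taps must be at least 1");
        }
        this.weights = new float[taps];
        this.history = new float[taps];
        this.stepSize = stepSize;
    }

    // farEnd:  the sample about to be played on the speaker (the echo source).
    // nearEnd: the sample just captured by the microphone (speech plus echo).
    // Returns the microphone sample with the estimated echo subtracted.
    float process(float farEnd, float nearEnd) {
        // Shift the far-end history by one and store the newest sample.
        System.arraycopy(history, 0, history, 1, history.length - 1);
        history[0] = farEnd;

        // Estimate the echo and measure the far-end energy in the window.
        float estimate = 0f;
        float energy = 0f;
        for (int i = 0; i < weights.length; i++) {
            estimate += weights[i] * history[i];
            energy += history[i] * history[i];
        }

        // The error is also the echo-cancelled output.
        float error = nearEnd - estimate;

        // Normalised update: the step is scaled by the far-end energy.
        float gain = stepSize * error / (energy + EPSILON);
        for (int i = 0; i < weights.length; i++) {
            weights[i] += gain * history[i];
        }
        return error;
    }
}
```

Note what is removed from where: the echo estimate is built from the far-end (speaker) signal and subtracted from the near-end (microphone) signal, not the other way around. The number of taps has to cover the echo delay in samples, and in a real canceller the weight update would be frozen while the double talk detector reports that both parties are speaking.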

Hope this helps someone else!

Garg Unzola