tags:

views:

136

answers:

2

I have a ZyXEL USB Omni56K Duo modem and want to send and receive voice streams on it, but to reach adequate quality I probably need to implement some "ZyXEL ADPCM" encoding because plain PCM provides too small sampling rate to transmit even medium quality voice, and it doesn't work through USB either (probably because even this bitrate is too high for USB-Serial converter in it).

This mysterious codec figures in all Microsoft WAV-related libraries as one of many codecs theoretically supported by it, but I found no implementations.

Can someone offer an implementation in any language or maybe some documentation? Writing a custom mu-law decoding algorithm won't be a problem for me.

Thanks.

+2  A: 

I'm not sure how ZyXEL ADPCM varies from other flavors of ADPCM, but various ADPCM implementations can be found with some google searches.

However, the real reason for my post is why the choice of ADPCM. ADPCM is adaptive differential pulse-code modulation. This means that the data being passed is the difference in samples, not the current value (which is also why you see such great compression). In a clean environment with no bit loss (ie disk drive), this is fine. However, in a streaming environment, its generally assumed that bits may be periodically mangled. Any bit damage to the data and you'll be hearing static or other audio artifacts very quickly and usually, fairly badly.

ADPCM's reset mechanism isn't framed based, which means the audio problems can go on for an extended period of time depending on the encoder. The reset code is a usually a set of 0s (16 comes to mind, but its been years since I wrote my own ports).

ADPCM in the telephony environment usually converts a 12 bit PCM sample to a 4 bit ADPCM sample (not bad). As for audio quality...not bad for phone conversations and the spoken word, but most people, in a blind test, can easily detect the quality drop.

In your last sentence, you throw a curve ball into the question. You start mentioning muLaw. muLaw is a PCM implementation that takes a 12 bit sample and transforms it using a logarithmic scale to an 8 bit sample. This is the typical compression mechanism for TDM (phone) networkworks in North America (most of the rest of the world uses a similar algorithm called ALaw).

So, I'm confused what you are actually trying to find.

You also mentioned Microsft and WAV implementations. You probably know, but just in case, that WAV is just a wrapper around the audio data that provides format, sampling information, channel, size and other useful information. Without WAV, AU or other wrappers involved, muLaw and ADPCM are usually presented as raw data.

One other tip if you are implementing ADPCM. As I indicated, they use 4 bits to represent a 12 bit sample. They get away with this by both sides having a multiplier table. Your position in the table changes based on the 4 bit value (in other words, the value is both multiple against a step size and used to figure out the new step size). I've seen a variety of algorithms use slightly different tables (no idea why, but you typically see the sent and received signals slowly stray off the bias). One of the older, popular sound packages was different than what I typically saw from the telephony hardware vendors.

And, for more useless trivia, there are multiple flavors of ADPCM. The variances involve the table, source sample size and destination sample size, but I've never had a need to work with them. Just documented flavors that I've found when I did my internet search for specifications for the various audio formats used in telephony.

Jim Rush
Thanks for very nice description. So ADPCM is probably the wrong way... Question is closed.
whitequark
A: 

Piping your pcm through ffmpeg -f u16le -i - -f wav -acodec adpcm_ms - will likely work.

http://ffmpeg.org/

Will