tags:

views:

27

answers:

2

My code using NAudio to read one particular MP3 gets different results than several other commercial apps.

Specifically: My NAudio-based code finds ~1.4 sec of silence at the beginning of this MP3 before "audible audio" (a drum pickup) starts, whereas other apps (Windows Media Player, RealPlayer, WavePad) show ~2.5 sec of silence before that same drum pickup.

The particular MP3 is "Like A Rolling Stone" downloaded from Amazon.com. Tested several other MP3s and none show any similar difference between my code and other apps. Most MP3s don't start with such a long silence so I suspect that's the source of the difference.

Debugging problems:

  1. I can't actually find a way to even prove that the other apps are right and NAudio/me is wrong, i.e. to compare block-by-block my code's results to a "known good reference implementation"; therefore I can't even precisely define the "error" I need to debug.

  2. Since my code reads thousands of samples during those 1.4 sec with no obvious errors, I can't think how to narrow down where/when in the input stream to look for a bug.

  3. The heart of the NAudio code is a P/Invoke call to acmStreamConvert(), which is a Windows "black box" call which I can't think how to error-check.

Can anyone think of any tricks/techniques to debug this?

A: 

The NAudio ACM code was never originally intended for MP3s, but for decoding constant bit rate telephony codecs. One day I tried setting up the WaveFormat to specify MP3 as an experiment, and what came out sounded good enough. However, I have always felt a bit nervous about decoding MP3s (especially VBR) with ACM (e.g. what comes out if ID3 tags or album art get passed in - could that account for extra silence?), and I've never been 100% convinced that NAudio does it right - there is very little documentation on how exactly you are supposed to use the ACM codecs. Sadly there is no managed MP3 decoder with a license I can use in NAudio, so ACM remains the only option for the time being.

I'm not sure what approach other media players take to playing back MP3, but I suspect many of them have their own built-in MP3 decoders, rather than relying on the operating system.

Mark Heath
Thanks for the background. So I won't expect too much from NAudio MP3 decoding, but it's still put me miles ahead of figuring it all out myself from scratch, and I still don't know a better way to decode MP3 for a low-budget commercial app.
Conrad Albrecht
A: 

I've found some partial answers to my own Q:

  1. Since my problem boils down to consuming too much MP3 w/o producing enough PCM, I used conditional-on-hit-count breakpoints to find just where this was happening, then drilled into that.

  2. This showed me that some acmStreamConvert() calls are returning success, consuming 417 src bytes, but producing 0 "dest bytes used".

Next I plan to try acmStreamSize() to ask the codec how many src bytes it "wants" to consume, rather than "telling" it to consume 417.

Edit (followup): I fixed it!

It came down to passing acmStreamConvert() enough src bytes to make it happy. Giving it its acmStreamSize() requested size fixed the problem in some places but then it popped up in others; giving it its requested size times 3 seems to cure the "0 dest bytes used" result in all MP3s I've tested.

With this fix, acmStreamConvert() then sometimes returned much larger converted chunks (almost 32 KB), so I also had to modify some other NAudio code to pass in larger destination buffers to hold the results.

Conrad Albrecht
acmStreamSize is great for CBR, but cannot provide a meaningful answer for VBR. also, not everything in an MP3 file is compressed audio. there is other stuff like ID3 tags etc. I try to remove them, but there is always a chance that some non-audio stuff gets passed in to the ACM codec. I haven't seen any official documentation on how you are supposed to parse an MP3 file to work out what bits of it ought to be sent to the ACM codec.
Mark Heath