I have a situation where there is a corrupt WAV file from which I'm trying to recover data.

My colleagues have sliced up the large WAV file into smaller WAV files with proper headers. This has produced some interesting results.

Sliced into 1MB segments we get these results:

  • The first wave file segment is all noise.
  • The second wave file segment is distorted.
  • The third wave file segment is clear.

This pattern is repeated for the entire length of the file (after it's been broken into smaller files).

For 20MB slices:

  • The first wave file segment is all noise.
  • The second wave file segment is clear.
  • The third wave file segment is distorted.

Again, this pattern is repeated for the entire length of the file (after it's been broken into smaller files).

Would anyone know why this is occurring?

+2  A: 

Assuming the WAV contains uncompressed (raw) samples, recovery should be easy. You need to know the sample format, for example 16 bits, two channels, 44100 Hz (which is CD quality). Because one of the segments is okay, you can look at that segment to figure out what the right values are.

Then just open the WAV using these values in, e.g., Adobe Audition (formerly Cool Edit), or any other wave editor that supports import of raw data.
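If no wave editor with raw import is at hand, the same idea can be sketched in Python with the standard `wave` module: wrap the raw bytes in a fresh header using whatever format you are guessing at. The function name `wrap_raw_pcm` and its `skip` parameter are made up for this illustration, and the defaults are just the CD-quality example from above, not values known to match the file in question.

```python
import wave

def wrap_raw_pcm(raw_bytes, out_path, channels=2, sample_width=2, rate=44100, skip=0):
    """Write raw PCM bytes to out_path with a fresh WAV header.

    `skip` drops that many leading bytes, so different alignments can be tried.
    Returns the number of frames written.
    """
    frame_size = channels * sample_width
    data = raw_bytes[skip:]
    # Trim to a whole number of frames so the header matches the data.
    data = data[: len(data) - (len(data) % frame_size)]
    with wave.open(out_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(data)
    return len(data) // frame_size
```

Trying each `skip` from 0 up to `frame_size - 1` and listening to the results is a cheap way to check a guessed format.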

Edit: Okay, now to answer your question. Some segments are clear because their alignment happens to be right. Take the CD-quality format I described before. The bytes of one sample (both channels together) look like this:

left_channel_high | left_channel_low | right_channel_high | right_channel_low

(I'm not sure about the ordering here! WAV PCM data is normally little-endian, so in a real file the low byte would come first. But it's just an example.) So the first data byte had better be the first byte of the left channel's sample, or else you'll end up with fragments of two samples being interpreted as one whole sample:

left_channel_low | right_channel_high | right_channel_low || left_channel_high
-------------------part of first sample------------------ || --second sample--

You can see that everything "shifted" here, which happens because the size of your file slices is not a multiple of the sample size in bytes (counting all channels).

If you're lucky, this just causes the channels to be swapped. If you're unlucky, high and low bytes get swapped. Interestingly, this does lead to kind-of recognizable, but severely distorted audio.
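To make the shift concrete, here is a small sketch that decodes the same 16-bit stereo bytes starting at different offsets. It uses little-endian samples (`"<hh"`), and the sample values are invented for the demonstration.

```python
import struct

# Four stereo frames of 16-bit little-endian samples: (left, right).
frames = [(1000, -1000), (1001, -1001), (1002, -1002), (1003, -1003)]
raw = b"".join(struct.pack("<hh", left, right) for left, right in frames)

def decode(data, offset):
    """Decode 16-bit stereo frames starting at byte `offset`."""
    usable = (len(data) - offset) // 4 * 4  # whole 4-byte frames only
    return [struct.unpack_from("<hh", data, offset + i) for i in range(0, usable, 4)]

aligned = decode(raw, 0)   # the original frames
swapped = decode(raw, 2)   # off by one 16-bit sample: left and right trade places
mangled = decode(raw, 1)   # off by one byte: each value mixes bytes of two samples
```

With an offset of 2 the values are still intact but land in the wrong channel; with an offset of 1 every decoded value is stitched together from two different samples.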

What puzzles me is that the pattern you report repeats in blocks of three. From the above, I'd expect either two or four. Perhaps you are using an unusual sample format, such as 24-bit (3 bytes)?
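The period-of-three guess can be checked with a little arithmetic. The sketch below assumes the slices were cut at exact multiples of the slice size, that 1MB means 2^20 bytes, and that a frame is 3 bytes (24-bit mono). The helper `slice_offsets` and its `start` parameter (an unknown initial misalignment) are hypothetical names for this illustration.

```python
def slice_offsets(slice_size, frame_size, count, start=0):
    """Byte position within a frame at which each of the first `count` slices begins."""
    return [(start + i * slice_size) % frame_size for i in range(count)]

MB = 1024 * 1024

one_mb = slice_offsets(1 * MB, 3, 6)      # [0, 1, 2, 0, 1, 2]
twenty_mb = slice_offsets(20 * MB, 3, 6)  # [0, 2, 1, 0, 2, 1]
four_byte = slice_offsets(1 * MB, 4, 6)   # [0, 0, 0, 0, 0, 0]
```

With 3-byte frames the misalignment cycles through three values, and the 20MB slices visit them in a different order than the 1MB slices, which would fit the "clear" and "distorted" segments trading places. With 4-byte frames a power-of-two slice size never drifts at all.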

Thomas
I'll give that a try. We aren't 100% sure of the sample format because the initial header data is corrupt, so there's guesswork involved here.
ct
Again, if you can get clear audio on one of the slices, that must be the correct sample format.
Thomas
I got more detail from my coworkers, so here is the sampling format: 96000 Hz sample rate, 1 channel, 24 bit.
ct
So yes, it's a 24-bit (3-byte) sample format (they wanted DVD quality). Thomas, you are *the man*! Thank you so much. I suspected the slice sizes they used just happened to run into this issue. Getting the channel byte distribution figured out was my next question.
ct
Click the green check mark, make my day ;)
Thomas