views:

211

answers:

7

I have a binary stream that has a very high error rate. The error rate is 50% meaning each bit has a 50% chance of being flipped. The error does not occur in bursts and is completely random, so Reed–Solomon codes wouldn't work well.

Which scheme or algorithm should I apply to the stream? I don't care about the overhead at all.

This is all theoretical, so there's no point in asking if I could just reduce the error of the stream.

EDIT

Don't say its not possible, the very first answer it tells you it is possible with noisy channel coding theorem.

+3  A: 

Well, the whole point of Reed-Solomon error correction is that most real-world errors occur in bursts, so you interleave and de-interleave the data. If your errors are completely random, i.e. Poisson-distributed, then just adding redundancy to the stream in a straightforward, mathematically efficient way will work. One thing you could look at is some kind of hidden Markov model, such as trellis code. This is basically just a mathematically efficient way of adding redundancy.

Also, have a look at the noisy channel coding theorem. Strictly speaking, it doesn't apply to digital data, but if your source of these bits is some analog process, or if you could model your bits as if they were the result of some analog process, it could give you some insight into what the best you could do might be. This would prevent you from wasting time trying to do better than is mathematically possible.

dsimcha
That theorem applies very well to this problem. It'll just mean the data rate will be very limited
Pyrolistical
A: 

Have you looked into turbo codes?

-- MarkusQ

Doh! I misread that as 50% randomized, not 50% flipped.

MarkusQ
+8  A: 

If the error rate is 50%, then that's basically random noise isn't it? I mean, consider just trying to transmit a single bit. If you send an infinite stream of the right bit, with a 50% error rate you'll get half 1s and half 0s whether the right bit is 1 or 0.

If it's actually less than 50% (e.g. 50% of the bits will be "random" rather than "flipped") then you could just repeat the data - transmit each bit 128 times and work out which you get more of for each 100 bits received. That's the simple-to-code, hugely inefficient, not mathematical at all solution :)

Jon Skeet
Maybe by 'flipped' he means 'may be flipped'. That would make it possible.
Strilanc
@John: 50% and higher error rates can still be handled by the simple scheme you suggest. With high probability, sending 128 1-bits will result in 0-64 0-bits and 64-128 1-bits, with the most probable outcome being 32 0-bits and 96 1-bits. For better error rates, just use more repeats.
j_random_hacker
@j_random_hacker: That's the difference between "50% of bits will be flipped" and "50% of bits will be random".
Jon Skeet
Oops, I'm assuming that by "50% error rate" he means that 50% of the bits are replaced with random bits. Otherwise you're right, as there's no way to make the expected value of a received bit anything other than 0.5 (assuming there's no clustering of errors).
j_random_hacker
+2  A: 

As the channel approaches 50% real noise rate, it no longer becomes possible to transmit any information at all. To Jon Skeet's answer, if the error rate is anything less than 50% noise, then you can get data through by doing longer bursts of the intended data redundantly and statistically looking at the result to some level of confidence in the original value. The needed burst length and confidence levels for a given length would then be derived based on a characterization of the noise. Understand, however, what you are doing here is effectively lowering the data rate to improve the net Signal to Noise Ratio of the transmitted stream.

In your question, you might have ruled this out as an option, but a better encoding scheme might be based on the relative existence (or not) of the data stream itself. In other words, to transmit a binary one....send an alternating stream of 1/0. To send a zero, send nothing or perhaps send a constant level. The idea is that sending (and receiving) anything represents one state and sending (and receiving) nothing represents the other state. This would effectively resemble a type of bipolar encoding of the data.

Tall Jeff
+1  A: 

If your error rate is 50% the bit stream IS random and bears NO CORRELATION to the original bit stream. It is like you are XORing the stream with a completely random bit stream, and the result is completely random. And there is nothing you can do about it.

The flip rate must be below 50% in order for any scheme to work. Of course, it could be ABOVE 50%, but then you can first invert the stream and then process it like if the error rate was below 50%.

If the errors are completely random and very frequent (e.g. 25% of the bits are flipped), it is very difficult to come up with a robust error detection scheme. You need to add a significant amount of redundancy.

antti.huima
You made a contradiction. If 50% of the bits are random, you can't say the entire stream is random, since you stated only 50% of the bits are random.
Pyrolistical
Indeed, but how do you tell *which* of the bits are random?
Piskvor
I think there is some confusion over "random" vs "incorrect." If 50% of the bits are random, then the stream still has useful information. If 50% of the bits are incorrect, it does not.
mquander
If 50% of the bits are FLIPPED then the WHOLE stream is random. If 50% of the bits are random, this means that 25% of all the bits are flipped.
antti.huima
A: 

If exactly 50% of the bits are flipped in any given transmission, rather than each bit being flipped with 50% probability, you can send a bit of information by sending a transmission of two bits -- send a 0 as 00 and a 1 as 01. If the first bit of the received codeword is 1, then the other bit is unflipped.

Captain Segfault
+1  A: 

The noisy-channel coding theorem says you can actually achieve the Shannon capacity for the channel. It does not say the channel has nonzero capacity!

If you randomize 100% of the bits in the channel, 50% of them will be unchanged, so you only flip a random 50% of the bits. It should be obvious that you can't send any data over such a channel -- its Shannon capacity is zero.

Captain Segfault