So I want to work on this summer project to correct errors in a message transmission using Hamming Code, but I cannot figure out how it really works. I've read many articles online, but I don't really understand the algorithm. Can anybody explain it in simple terms?

Thanks.

A: 

The Wikipedia article explains it quite nicely.

If you don't understand a specific aspect of the algorithm, then you will need to rephrase (or detail) your question, so that someone can address your specific part of the problem.

lothar
A: 

I may come off as a bit of an ass here, but the explanation found on Wikipedia is very straightforward. If you can't follow it, you're probably in over your head. Make sure that you understand how parity bits work.

Chad Birch
+1  A: 

Specifically from Wikipedia, the algorithm is as follows:

  1. Number the bits starting from 1: bit 1, 2, 3, 4, 5, etc.
  2. Write the bit numbers in binary. 1, 10, 11, 100, 101, etc.
  3. All bit positions that are powers of two (have only one 1 bit in the binary form of their position) are parity bits.
  4. All other bit positions, with two or more 1 bits in the binary form of their position, are data bits.
  5. Each data bit is included in a unique set of 2 or more parity bits, as determined by the binary form of its bit position.
    1. Parity bit 1 covers all bit positions which have the least significant bit set: bit 1 (the parity bit itself), 3, 5, 7, 9, etc.
    2. Parity bit 2 covers all bit positions which have the second least significant bit set: bit 2 (the parity bit itself), 3, 6, 7, 10, 11, etc.
    3. Parity bit 4 covers all bit positions which have the third least significant bit set: bits 4–7, 12–15, 20–23, etc.
    4. Parity bit 8 covers all bit positions which have the fourth least significant bit set: bits 8–15, 24–31, 40–47, etc.
    5. In general each parity bit covers all bits where the binary AND of the parity position and the bit position is non-zero.
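
As a rough illustration of the steps above, here is a minimal Python sketch. The function names (`hamming_encode`, `hamming_syndrome`, `hamming_correct`) are made up for this example, and it assumes even parity and a list of 0/1 values where list index 0 holds bit position 1. The correction step relies on the fact that, with this layout, XOR-ing together the positions of all 1 bits gives 0 for a valid code word and otherwise gives the position of a single flipped bit.

```python
def hamming_encode(data_bits):
    """Put data bits in the non-power-of-two positions, then fill in even parity."""
    m = len(data_bits)
    # Smallest number of parity bits r such that 2**r >= m + r + 1.
    r = 0
    while 2 ** r < m + r + 1:
        r += 1
    n = m + r
    code = [0] * (n + 1)  # index 0 is unused so that positions are 1-based
    remaining = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1) != 0:  # more than one 1 bit in pos: a data position
            code[pos] = next(remaining)
    for i in range(r):
        p = 1 << i  # parity positions 1, 2, 4, 8, ...
        parity = 0
        for pos in range(1, n + 1):
            if pos & p and pos != p:  # every covered position except p itself
                parity ^= code[pos]
        code[p] = parity
    return code[1:]


def hamming_syndrome(code):
    """XOR of the 1-based positions of all 1 bits (0 for a valid code word)."""
    syndrome = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= pos
    return syndrome


def hamming_correct(code):
    """Return (corrected copy, syndrome), assuming at most one flipped bit."""
    syndrome = hamming_syndrome(code)
    fixed = list(code)
    if 0 < syndrome <= len(fixed):
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome
```

For example, `hamming_encode([1, 0, 1, 1])` gives `[0, 1, 1, 0, 0, 1, 1]`. If bit 5 of that is flipped in transit, `hamming_correct` reports syndrome 5 and flips it back.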
Andy Mikula
+2  A: 

It's all about Hamming distance.

The Hamming distance between two base-2 values is the number of bits at which they differ. So if you transmit A, but I receive B, then the number of bits which must have been switched in transmission is the Hamming distance between A and B.
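
To make that concrete, here is a small sketch in Python (the function name `hamming_distance` is just for illustration). It assumes the two values are non-negative integers of the same word size: XOR leaves a 1 wherever the bits differ, and counting those 1s gives the distance.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which a and b differ."""
    return bin(a ^ b).count("1")


# 1011101 vs 1001001 differ in two positions.
print(hamming_distance(0b1011101, 0b1001001))  # prints 2
```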

Hamming codes are useful when the bits in each code word are transmitted somehow separately. We don't care whether they're serial or parallel, but they aren't for instance combined into an analogue value representing several bits, or compressed/encrypted after encoding.

Thus each bit is, independently and with some fixed probability, either received correctly or flipped. Assuming the transmission is fairly reliable, most bits are received correctly. So errors in a small number of bits are more likely, and simultaneous errors in large numbers of bits are unlikely.

So, a Hamming code usually aims to correct 1-bit errors, and/or to detect 2-bit errors (see the Wikipedia article for details of the two main types). Codes which correct/detect bigger errors can be constructed, but AFAIK aren't used as much.

The code works by evenly spacing out the code points in "Hamming space", which in mathematical terms is the metric space consisting of all values of the relevant word size, with Hamming distance as the metric. Imagine that each code point is surrounded by a little "buffer zone" of invalid values. If a value is received that isn't a code point, then an error must have occurred, because only valid code points are ever transmitted.

If a value in the buffer zone is received, then on the assumption that a 1-bit error occurred, the value which was transmitted must be distance 1 from the value received. But because the code points are spread out, there is only one code point that close. So it's "corrected" to that code point, on grounds that a 1-bit error is more likely than the greater error that would be needed for any other code point to produce the value received. In probability terms, the conditional probability that you sent the nearby code point is greater than the conditional probability that you sent any other code point, given that I received the value I did. So I guess that you sent the nearby one, with a certain confidence based on the reliability of the transmission and the number of bits in each word.
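
Here is a sketch of that "correct to the nearest code point" idea for the 7-bit Hamming code, written as a brute-force search so the geometry is visible. Real decoders compute a syndrome rather than searching. The names (`syndrome`, `CODEBOOK`, `decode_nearest`) are made up, and storing bit position p in bit p-1 of a Python int is just a convention chosen for this example.

```python
def syndrome(word: int, n: int = 7) -> int:
    """XOR of the 1-based positions of the 1 bits in an n-bit word."""
    s = 0
    for pos in range(1, n + 1):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def distance(a: int, b: int) -> int:
    """Hamming distance between two words."""
    return bin(a ^ b).count("1")


# The valid code points are the 7-bit words with syndrome 0 (16 of them).
CODEBOOK = [w for w in range(1 << 7) if syndrome(w) == 0]


def decode_nearest(received: int) -> int:
    """Guess the transmitted code point: the one closest to what was received."""
    return min(CODEBOOK, key=lambda c: distance(c, received))


sent = CODEBOOK[5]
received = sent ^ (1 << 3)  # flip one bit in transit
print(decode_nearest(received) == sent)  # prints True
```

Any two distinct code points in `CODEBOOK` are at least distance 3 apart, which is why a value at distance 1 from one code point can't be that close to any other.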

If an invalid value is received which is equidistant from two code points, then I can't say that one is more likely to be the true value than the other. So I detect the error, but I can't correct it.

Obviously 3-bit errors are not corrected by a SECDED Hamming code. The received value is then closer to some other code point than to the value you actually sent, so I erroneously "correct" it to the wrong value. So you either need transmission reliable enough that you don't care about them, or else you need higher-level error detection as well (for example, a CRC over an entire message).
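
The three cases (1-bit, 2-bit, 3-bit errors) can be seen in a small brute-force sketch. It assumes the usual SECDED construction of adding one overall parity bit to the 7-bit Hamming code, giving 8-bit code points that are at least distance 4 apart. The names (`SECDED`, `decode_secded`) and the bit layout are invented for this example.

```python
def syndrome(word: int, n: int = 7) -> int:
    """XOR of the 1-based positions of the 1 bits in the low n bits of word."""
    s = 0
    for pos in range(1, n + 1):
        if (word >> (pos - 1)) & 1:
            s ^= pos
    return s


def distance(a: int, b: int) -> int:
    """Hamming distance between two words."""
    return bin(a ^ b).count("1")


HAMMING_7 = [w for w in range(1 << 7) if syndrome(w) == 0]
# Add an 8th bit so that every code point has an even number of 1 bits.
SECDED = [w | ((bin(w).count("1") & 1) << 7) for w in HAMMING_7]


def decode_secded(received: int):
    """Return (status, code point or None) by searching for the nearest code point."""
    best = min(SECDED, key=lambda c: distance(c, received))
    d = distance(best, received)
    if d == 0:
        return "ok", best
    if d == 1:
        return "corrected", best
    return "detected", None  # no code point within distance 1: can't choose


sent = SECDED[5]
print(decode_secded(sent ^ 0b00000100))  # 1-bit error: corrected back to sent
print(decode_secded(sent ^ 0b00010100))  # 2-bit error: detected, not corrected
print(decode_secded(sent ^ 0b01010100))  # 3-bit error: "corrected" to a wrong code point
```

The last call is the failure described above: the decoder reports a successful correction, but the code point it returns is not the one that was sent.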

Steve Jessop