views:

73

answers:

1

So I'm stuck with a protocol that adds a CRC8/CRC16 over odd number of bits. (ie. it's not divisible by 8) What's the best method to generate the CRC for it in software?

There are plenty of CRC algorithm that uses table, but they are lookup per byte. Of course, there's the "fail-safe" of doing it one bit at a time. But is there a better approach? Perhaps doing it mostly by table lookup and then finish it doing a bit at a time?

I'm currently using a bitarray in python to handle this. But solution in C would also work. Thanks!

EDIT: Note that I'm interfacing with existing hardware that calc the CRC over the odd number of bits. (It's easy for the HW, since they just use a LFSR--1 bit at a time!) So while padding with known pattern would work for sake of integrity checking, it would break the hw compatibility.

+3  A: 

Padding with zeros at the front should not change the result. Computing the CRC is essentially binary long division. Unfortunately this involves splitting each byte. This is easy to with shift operators and bitwise or.

Zero padding at the end is, much easier, and depending on your reason for computing the CRC, a completely reasonable thing to do. For example, if you are using CRC for an integrity check.

Edit Taking my example from my comment. If you have 11 bits 11101110111 and want to compute the CRC, pad them to get 00000111 01110111 = 0x777, do not pad them to get 0x7770 as this will have a different CRC.

The reason that this works is that CRC is essentially binary long division

                    1 0 1 = 5
            -------------
1 0 0 1 1 / 1 1 0 1 1 0 1
            1 0 0 1 1 | |
            --------- | |
              1 0 0 0 0 |
              0 0 0 0 0 |
              --------- |
              1 0 0 0 0 1
                1 0 0 1 1
                ---------
                  1 1 1 0 = 14 = remainder

Has exactly the same result as

                      1 0 1 = 5
            ---------------
1 0 0 1 1 / 0 1 1 0 1 1 0 1
              1 0 0 1 1 | |
              --------- | |
                1 0 0 0 0 |
                0 0 0 0 0 |
                --------- |
                1 0 0 0 0 1
                  1 0 0 1 1
                  ---------
                    1 1 1 0 = 14 = remainder

and similarly for any number of leading zeros.

Note at this point, unless you are a psychiatrist looking for field work, want to become one, or secretly desire to need to see one, it may be worth your while to skip to the Super double secret probationary edit

Further Edit Due to question change

If you have a nontrivial initial vector, you can do the following. Say we want to compute the CRC-CCITT CRC of the above string with an initializer of FFFF. We pad the string to get 0x0FFF compute the CRC-CCIT with initializer 0 to get 0x0ECE, then compute the CRC-CCIT with initializer 0xFFFF of 0x0000 to get 0x1D0F, and xor them 0x0ECE xor 0x1D0F = 0x13C1.

The CRC of an arbitrary string of 0's and a nonzero initializer can be computed quickly if the polynomial is primitive (I think they all are), but it gets complicated and I do not have nearly enough time.

The essence of the technique is that we can consider the state of the shift register as a polynomial. If we initialize it with n ones this is the same as considering the initial polynomial as p(x) = x^(n - 1) + x^(n - 2) ... + x + 1. Computing the CRC of a string of k zeros is equivalent to finding p(x) x^k mod CRC. x^k mod CRC is easily found by repeated squaring and reduction. Any library for polynomial arithmetic over GF(2) should do this.

Even further Edit It probably makes more sense in the case of nonzero initializers to pad with zeros and change the initializer to a value such that after reading |pad| number of zeros the shift register contains FFFF (or whatever the value you wanted was. These can be precomputed, and you only need to store 16 or 32 of them (or howver many bits are in your crc polynomial.

For example with CRC-CCIT with initializer 0xFFFF and a single bit 0 padding we will want to use an initializer of 0xF7EF. These can be computed by finding x^(-1) mod CRC using the extended euclidean algorithm and then computing initializer * x^(-k) mod CRC for the various padding lengths. Again any GF(2) polynomail package should make this easy. I have used NTL in the past and found it quite flexible, but it is probably overkill here. Even for 32 bit crcs exhjaustive search will probably find the initializers faster than you can write the code.

Super double secret probationary edit

Ok, Things are actually considerably simpler than I thought they were. The general idea above is correct, we want to pad the string with 0's at the front to extend the size to a multiple of 8, 16 or 32 depending on what our software implementation wants, and we want to change our initial vector to set our state to something that after reading the padding zeros will the LFSR will be set to the initial vector that we wanted. We certainly could use galois field arithmetic to do this, but there is an easier way: just run the LFSR backwards.

For example if we want to compute the CRC-CCITT (0xFFFF) of the 11 bits 11 bits 11101110111 we pad them with 5 0's to get 00000111 01110111 and then back the LFSR up five spaces to get an initial vector of 0xF060. (I've done the computation by hand, so beware). So if you start an LSFR (or a software implementation) with IV of 0xF060 and run it on 0x0fff, you should get the same result as running an LFSR with IV 0xFFFF on the original 11 bits.

deinst
I don't think zero/one padding would work for CRC. (checksum yes, but CRC, no). If that was true, the CRC of 0x001122 /0x112200 would equal CRC of 0x1122. This doesn't seems to be so from quick check on online crc calc. http://www.lammertbies.nl/comm/info/crc-calculation.html
fseto
the CRC of 0x001122 should equal the checksum of 0x1122, but not equal the checksum of 0x112200. And this is what the calculator returns.
deinst
This is why you want to pad at the front not the back. If you have 11 bits 11101110111 you want to encode 00000111 01110111, not 1110111 011100000. Since you likely have the first bit byte aligned, adding 0's at the head will involve shifting.
deinst
deinst: I think you're on the right track... because it only holds true for some of the CRC polynomials. ie. if we're using CRC-CCITT w/ FFFF initializer, padding with zero's no longer works.
fseto
Could you define what an initializer is/does. I might note that your question asks for computing CRC-8 and CRC-16. Changing the question in midstream is a good way to make people think that you are a jerk.If initializers are initial settings of the remainder buffer, I believe that the CRC with an initializer of ffff whould be equivalent to the xor of the crc of padded string with initializer 0 and the crc of an empty string of the same length with initializer ffff, but I do not have time to do the algebra.
deinst
Excellent, thank you for your patience and your explanation! To summarize, we'll need to add padding 0's to the beginning and use a pre-computed initializer. By the way, my intention was not to change the question midstream, but rather clarify which one of the CRC-16 we're using. In my mind, since there are no official standard for CRC8/16, it can mean any polynomial/initializer. Sorry if it lead to any confusion.
fseto
Yes. When I get home, I'll run some mathematica to get initializers and show how the arithmetic works. With 8 and 16 bit crc's exhaustive search is certainly feasible, and probably even with 32 bits.
deinst