+1  Q: 

C file checksum

How can I compute a checksum of a file using C? I don't want to use any third-party libraries, just the standard C language, and speed is very important (the files are smaller than 50 MB, but anyway).

Thanks

+1  A: 

I would recommend using a BSD implementation. For example, http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/cksum/

Brandon Horsley
+4  A: 
  1. Determine which algorithm you want to use (CRC32 is one example)
  2. Look up the algorithm on Wikipedia or another source
  3. Write code to implement that algorithm
  4. Post questions here if/when the code doesn't correctly implement the algorithm
  5. Profit?
Paul Tomblin
+1 just for "Profit?"
Chinmay Kanchi
+2  A: 

I would suggest starting with the simplest approach and only worrying about speed if it turns out to be an issue.

Far too much time is wasted on solving problems that do not exist (see YAGNI).

By simple, I mean simply starting a checksum character (all characters here are unsigned) at zero, reading in every character and subtracting it from the checksum character until the end of the file is reached.

Something like in the following program:

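#include <stdio.h>

/* Subtract every byte from a running 8-bit value; unsigned arithmetic wraps modulo 256. */
unsigned char checksum (unsigned char *ptr, size_t sz) {
    unsigned char chk = 0;
    while (sz-- != 0)
        chk -= *ptr++;
    return chk;
}

int main(void)
{
    unsigned char x[] = "Hello_";       /* '_' is a placeholder for the checksum byte */
    unsigned char y = checksum (x, 5);  /* checksum of "Hello" only */
    printf ("Checksum is 0x%02x\n", y);
    x[5] = y;                           /* append the checksum to the data */
    y = checksum (x, 6);                /* data plus checksum should give zero */
    printf ("Checksum test is 0x%02x\n", y);
    return 0;
}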

which outputs:

Checksum is 0x0c
Checksum test is 0x00

That checksum function actually does both jobs. If you pass it a block of data without a checksum on the end, it will give you the checksum. If you pass it a block with the checksum on the end, it will give you zero (or non-zero if the checksum is bad).
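Since the question is about files rather than in-memory buffers, here is a rough sketch of how the same subtractive checksum could be applied to a file in chunks. The function name file_checksum and the 16 KiB buffer size are my own assumptions, not anything from the program above, so treat it as a starting point rather than a finished implementation.

```c
#include <stdio.h>

/* Hypothetical helper: folds a whole file into the same subtractive
   8-bit checksum, reading it in chunks with fread().
   Returns 0 on success, -1 if the file cannot be opened or read. */
int file_checksum(const char *path, unsigned char *out)
{
    unsigned char buf[16384];   /* chunk size is an arbitrary choice */
    unsigned char chk = 0;
    size_t n, i;
    int err;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    /* fread() returns the number of bytes read; 0 means EOF or error. */
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        for (i = 0; i < n; i++)
            chk -= buf[i];

    err = ferror(fp);           /* distinguish a read error from EOF */
    fclose(fp);
    if (err)
        return -1;

    *out = chk;
    return 0;
}
```

Reading in blocks like this avoids one library call per byte, which is usually where the time goes for a checksum this cheap. For a file containing just the five bytes of "Hello", it should produce the same 0x0c as the in-memory version.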

This is the simplest approach and will detect most random errors. It won't detect some cases, such as two swapped characters (the sum is the same regardless of byte order), so if you need better error detection, use something like Fletcher or Adler.

Both of those Wikipedia pages have sample C code you can either use as-is, or analyse and re-code to avoid IP issues if you're concerned.
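As an illustration of the Adler option, here is a minimal, unoptimised sketch of Adler-32 written from the commonly published definition (two running sums modulo 65521). The name adler32_update and the "pass 1 as the initial value" convention are assumptions modelled on how such functions are usually exposed; this is not copied from either Wikipedia page.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u   /* largest prime below 2^16 */

/* Sketch of Adler-32. Start with adler = 1, then feed the data in one
   or more calls, passing the previous return value back in. */
uint32_t adler32_update(uint32_t adler, const unsigned char *data, size_t len)
{
    uint32_t a = adler & 0xffffu;          /* low half: sum of bytes      */
    uint32_t b = (adler >> 16) & 0xffffu;  /* high half: sum of the sums  */
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + data[i]) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Because b accumulates the running value of a, the result depends on byte order, so swapped characters change the checksum. Taking the modulo on every byte is slow; faster versions defer it, but that is an optimisation to add only if profiling says you need it.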

paxdiablo
A: 

Simple and fast

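FILE *fp = fopen("yourfile", "rb");
unsigned char checksum = 0;
int c;   /* fgetc() returns an int so that EOF can be told apart from a byte */

if (fp != NULL) {
    /* Test the return value directly; looping on feof() would XOR EOF itself into the checksum. */
    while ((c = fgetc(fp)) != EOF) {
        checksum ^= (unsigned char)c;
    }
    fclose(fp);
}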
sizzzzlerz
A: 

Generally, CRC32 with a good polynomial is probably your best choice for a non-cryptographic checksum. See here for some reasons: http://guru.multimedia.cx/crc32-vs-adler32/ Click on the error correcting category on the right-hand side of that page for many more CRC-related posts.
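For reference, a table-driven CRC32 is only a few lines of standard C. The sketch below assumes the widely used reflected polynomial 0xEDB88320 (the one used by zip and Ethernet), which is not necessarily the "good polynomial" the linked article argues for, and the names crc32_init and crc32_update are my own.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];
static int crc_table_ready = 0;

/* Build the 256-entry lookup table for the reflected polynomial. */
static void crc32_init(void)
{
    uint32_t c;
    int n, k;

    for (n = 0; n < 256; n++) {
        c = (uint32_t)n;
        for (k = 0; k < 8; k++)
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

/* Start with crc = 0; pass the previous result back in to continue
   over several buffers (for example, successive fread() chunks). */
uint32_t crc32_update(uint32_t crc, const unsigned char *data, size_t len)
{
    size_t i;

    if (!crc_table_ready)
        crc32_init();

    crc ^= 0xFFFFFFFFu;                       /* pre-invert  */
    for (i = 0; i < len; i++)
        crc = crc_table[(crc ^ data[i]) & 0xffu] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;                 /* post-invert */
}
```

The lazy table setup here is not thread-safe; call crc32_init once up front if that matters. One table lookup per byte should be comfortably fast for files under 50 MB.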

R..