views:

259

answers:

7

how to translate this piece of C code into Python >=2.6 ?

unsigned long memSum(unsigned char *p, unsigned long len)
{
   unsigned long i, sum=0;

   for(i=0; i<len; i++) 
      sum = sum + *p++;

   return sum;
}  

of course

f=open("file_to_sum",'rb')
m = f.read()
f.close()
sum( array.array('B', m) )

does not work

+3  A: 

If you need to wrap around on overflow, simply take your sum modulo MAX_LONG at the end.

Anon.
lclevy
What "bad result" is this? Are we supposed to guess what you're thinking?
Anon.
A: 

As I mentioned in my comment, you need to convert the string into a list of ints. This probably what you want:

f=open("infile.txt",'rb')
m=f.read()
f.close()
m=map(ord,list(m))
print sum(m)

The ord function returns the ascii int corresponding to a character.

MAK
A: 

I've tried your code, and seems to be working (it opens the data in binary, convert it to a list of unsigned char and adds all).

What is your problem? Could be an overflow problem? Maybe there is a problem with the length? How are you saving the file? Sorry, but with this information we can only guess!

The code is not equivalent, as the Python code seems to deal with files and the C code seems to deal with a memory array.

Khelben
+2  A: 

A direct, Pythonic translation:

def memSum(data):
    return sum(ord(c) for c in data) & 0xFFFFFFFF
Mark Tolonen
A: 

You really need to give an example input and what you expect as output. Any code that uses a recent version of Python, extracts an integer in range(256) from each byte, sums those integers, and finally does total &= 0xFFFFFFFF should do the job (assuming that your unsigned long is 32 bits wide).

Note that the last step ( the &=) is pointless if your file is less than about 16MB in size ... it won't overflow; 16843009 * 255 <= 0xFFFFFFFF < (16843009 + 1) * 255

That means that if your test file is smaller than 16843010 bytes, you must have a problem in your C code or your Python code or both.

You said that "of course" this code:

f=open("file_to_sum",'rb')
m = f.read()
f.close()
sum( array.array('B', m) )

"doesn't work". Does it work if you replace the last line by

print sum( array.array('B', m) )

?

If none of the above is of any help, and you want sensible answers instead of guesses, provide example input, expected output, C code, C output, Python code, Python output. Both the C code and the Python code should be standalone-runnable, and should include printing the size of the byte array being summed.

John Machin
A: 

first, thank you all for your help

I have written successfully the C implementation and tested over 20 different files the checksum function. And then the 2 complement of the computed checksum is compared to a checksum a the file. It works perfectly in C as mentionned: with unsigned char and unsigned long.

the right value of the sum is 3da4be70 (from C implementation) from your suggestions:

print '%x' % ( sum( array.array('B', m[ o2+4:o2+len2 ]) ) )

and

n = map(ord, list(m[ o2+4: o2+len2 ]))
print '%x' % (sum(n))

give 504a022a

I suppose it is because bytes here in python as interpreted as signed instead of unsigned (like in C)...

update:

even

for i in range(o2+4, o2+len2):
 cchk = ( (cchk + unpack('B', m[i])[0]) & 0xffffffff)
print '%x' % (cchk)

does not work and gives again 504a022a

laurent
Is the file a binary file, or a text file with numbers? If it is a text file, you should show us some of the data. You should also tell us how you read the data in C. Finally, why create a new ID? And please edit your original question.
Alok
You haven't bothered to read the replies carefully -- none of them treated Python bytes as signed.
John Machin
A: 

SOLVED : MY FAULT, SORRY

#include<stdio.h>

unsigned long memSum(unsigned char *p, unsigned long len)
{
   unsigned long i, sum=0;

   for(i=0; i<len; i++) 
      sum = sum + *p++;

   return sum;
}  

#define LEN2SUM (0xa13b10-4)

int main(int argc, char *argv[] )
{

  FILE *f;
  unsigned char *buf;
  unsigned long sum;

  f=fopen("test2.dat", "rb");
  fseek(f, 0x7c+4, SEEK_SET); 

  buf = (unsigned char*)malloc(LEN2SUM);
  fread(buf, sizeof(char), LEN2SUM, f);
  sum = memSum( buf, LEN2SUM);
  printf("0x%08x\n", sum );

  free(buf);  
  fclose(f);


}

and

f = open('test2.dat','rb')
f.seek(0x7c+4)

m = f.read(0xa13b10-4)
print '%x' % ( ( sum(ord(c) for c in m) & 0xFFFFFFFF ) )

give the same answer, the good one

the difference is that in C, i checksum a given memory area which contains decrypted data, where decryption has been done 'in place'

in my python implementation, decryption is done in another buffer, and I still checksum the encrypted area.

since my a beginner in python, I was focused on this point : bad track. i'm kicking my ass twenty times.....

sorry for the stupid question and thanks again your kind help !!!

laurent