Hi,
I need to transfer large files across the network and create checksums for them on an hourly basis, so the speed of generating the checksum is critical for me.
Somehow I can't get zlib.crc32 and zlib.adler32 to work with files larger than 4GB on a Windows XP Pro 64-bit machine. I suspect I've hit a 32-bit limitation here? Using hashlib.md5 I can get a result, but the problem is the speed: it takes roughly 5 minutes to generate an md5 for a 4.8GB file. Task Manager shows that the process is using only one core.
My questions are:
- Is there a way to make crc work on large files? I'd prefer to use crc rather than md5.
- If not, is there a way to speed up md5.hexdigest()/md5.digest(), or in this case any hashlib hexdigest/digest? Maybe by splitting the work across multiple threads or processes? How do I do that?
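To make the first question concrete, here is a rough sketch of the chunked approach it is asking about. It assumes the optional second argument of zlib.crc32 (the running checksum) can be used to feed the file in pieces, so no single call ever sees more than a small buffer. The function name file_crc32 and the 1 MB chunk size are placeholders, and whether this gets around the 4GB problem on this particular setup is not confirmed.

```python
import zlib

def file_crc32(path, chunk_size=1024 * 1024):
    """Return the CRC-32 of a file, reading it in fixed-size chunks.

    file_crc32 and chunk_size are illustrative names, not part of zlib.
    """
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Pass the previous result as the starting value so the
            # checksum carries over from one chunk to the next.
            crc = zlib.crc32(chunk, crc)
    # Mask to an unsigned 32-bit value; older Python versions can
    # return a signed integer here.
    return crc & 0xFFFFFFFF
```

Something like file_crc32("big_asset.bin") would then be called once per file; the same loop shape should also work with zlib.adler32, with a starting value of 1 instead of 0.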
Many thanks, guys,
Pietra Arumaga
PS: I'm working on something similar to an "Asset Management" system, kind of like svn, but the assets consist of large compressed image files. The files have tiny incremental changes. The hashing/checksum is needed for detecting changes and for error detection.