Is there any Linux command-line implementation that performs exceptionally well at generating SHA-1 hashes of large files (< 2GB)?
I have played around with `openssl sha1` and it takes minutes to get the SHA-1 of a 2GB file :/
sha1sum
is what I'd use for computing SHA-1 checksums. It's designed to do exactly one thing, so I would hope it does it as fast as practically possible. I don't have any 2GB files to benchmark it on, though :-(
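For reference, a minimal usage sketch (the file here is a small stand-in; substitute your real large file):

```shell
# Create a small sample file; in practice, point sha1sum at your real file.
printf 'hello world' > sample.bin

# sha1sum prints "<digest>  <filename>".
sha1sum sample.bin

# Save the checksum and verify it later:
sha1sum sample.bin > sample.bin.sha1
sha1sum -c sample.bin.sha1    # prints "sample.bin: OK" on a match
```

The `-c` mode is handy if you hash the same large files repeatedly and only want to know whether they changed.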
EDIT: After some tests on an ISO image it looks like the limiting factor on my system is disk I/O speed - not surprising, although I feel kind of silly for not thinking of that earlier. Once that's corrected for, it seems like openssl is about twice as fast as sha1sum...
Your problem is likely disk I/O. A basic SHA1 implementation on an old 2.0GHz Core Duo processor can process /dev/zero at 100MiB/s - faster than most hard drives typically paired with such a system.
Show us the speeds you're currently seeing (and on what spec hardware).
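One quick way to tell whether you are CPU-bound or I/O-bound is to hash data that never touches the disk. A sketch, assuming GNU `dd` and a shell with the `time` keyword:

```shell
# Hash 1 GiB of zeros read straight from memory: no disk I/O involved,
# so this times the hash computation itself.
time dd if=/dev/zero bs=1M count=1024 2>/dev/null | sha1sum

# The same data through openssl's implementation, for comparison:
time dd if=/dev/zero bs=1M count=1024 2>/dev/null | openssl sha1
```

If these finish much faster per gigabyte than hashing your real file, the disk (or the filesystem) is the bottleneck, not the SHA-1 implementation.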
On my machine, for a 1GB file, with enough memory to keep the entire file cached after the first run:

- `sha1sum`: 3.92s
- `openssl sha1`: 3.48s
- Python `hashlib.sha1`: 3.22s
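A sketch of how numbers like these can be reproduced (the file name and size are placeholders; runs after the first read hit the page cache rather than the disk):

```shell
# Generate a stand-in test file (use your real file instead).
dd if=/dev/zero of=testfile.bin bs=1M count=256 2>/dev/null

# Warm the page cache so the timed runs measure hashing, not disk reads.
cat testfile.bin > /dev/null

time sha1sum testfile.bin
time openssl sha1 testfile.bin
```

Both tools should, of course, report the same digest for the same file; only the wall-clock time differs.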
> it takes minutes to get the sha1 for a 2GB file
There's something wrong there, then, unless you're using incredibly slow old hardware. Even on the first run, where the file was being read directly from disk, `openssl sha1` only took about 20s per gigabyte on my machine. Are you having slow I/O problems in general?
I don't think a SHA implementation can be specially optimized for large inputs: the algorithm operates on fixed-size blocks, and each block's computation depends on the result of the previous one, so the work cannot be parallelized. The fastest implementation on a small file should therefore also be the fastest on a large one.