I'm trying to calculate the SHA-1 value of a file.

I've put together this script:

import hashlib

def hashfile(filepath):
    # Plain SHA-1 over the raw bytes of the file
    sha1 = hashlib.sha1()
    with open(filepath, 'rb') as f:
        sha1.update(f.read())
    return sha1.hexdigest()

For a specific file I get this hash value:
8c3e109ff260f7b11087974ef7bcdbdc69a0a3b9
But when I calculate the value with git hash-object, I get this value: d339346ca154f6ed9e92205c3c5c38112e761eb7

How come they differ? Am I doing something wrong, or can I just ignore the difference?

+7  A: 

git calculates hashes like this:

sha1("blob " + filesize + "\0" + data)

Reference
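
To make the formula above concrete, here is a minimal Python sketch of it. The function name git_blob_hash is my own, not anything from git itself, and it assumes a repository using the default SHA-1 object format and content that git stores byte-for-byte (no line-ending conversion or other filters applied).

```python
import hashlib

def git_blob_hash(data: bytes) -> str:
    # Header is "blob <size in bytes as decimal>\0", followed by the raw content
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()

def git_blob_hash_file(filepath: str) -> str:
    # Hypothetical helper: reads the whole file into memory, like the script in the question
    with open(filepath, "rb") as f:
        return git_blob_hash(f.read())

if __name__ == "__main__":
    # The empty blob has a well-known id in SHA-1 repositories
    print(git_blob_hash(b""))
```

Under those assumptions the result should match what git hash-object prints for the same file, while the plain SHA-1 from the question will still differ because it has no header.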

Brian R. Bondy
I should have looked it up, thanks.
Ikke
No prob, the referenced link is quite different, I just happened to find it by luck.
Brian R. Bondy
It should be mentioned that git does this to avoid length extension attacks.
Omnifarious