You can easily build a wrapper object around the hash object which can transparently persist the data.
The obvious drawback is that it needs to retain the hashed data in full in order to restore the state - so depending on the data size you are dealing with, this may not suit your needs. But it should work fine up to some tens of MB.
Unfortunattely the hashlib does not expose the hash algorithms as proper classes, it rathers gives factory functions that construct the hash objects - so we can't properly subclass those without loading reserved symbols - a situation I'd rather avoid. That only means you have to built your wrapper class from the start, which is not such that an overhead from Python anyway.
here is a sample code that might even fill your needs:
import hashlib
from cStringIO import StringIO
class PersistentSha1(object):
def __init__(self, salt=""):
self.__setstate__(salt)
def update(self, data):
self.__data.write(data)
self.hash.update(data)
def __getattr__(self, attr):
return getattr(self.hash, attr)
def __setstate__(self, salt=""):
self.__data = StringIO()
self.__data.write(salt)
self.hash = hashlib.sha1(salt)
def __getstate__(self):
return self.data
def _get_data(self):
self.__data.seek(0)
return self.__data.read()
data = property(_get_data, __setstate__)
You can access the "data" member itself to get and set the state straight, or you can use python pickling functions:
>>> a = PersistentSha1()
>>> a
<__main__.PersistentSha1 object at 0xb7d10f0c>
>>> a.update("lixo")
>>> a.data
'lixo'
>>> a.hexdigest()
'6d6332a54574aeb35dcde5cf6a8774f938a65bec'
>>> import pickle
>>> b = pickle.dumps(a)
>>>
>>> c = pickle.loads(b)
>>> c.hexdigest()
'6d6332a54574aeb35dcde5cf6a8774f938a65bec'
>>> c.data
'lixo'