Hello, I need to write Python code that compares the text of documents using fingerprinting techniques. I do not know how to generate a fingerprint of a document. If anyone knows a method, or has source code, for generating document fingerprints that are stored in bit form, please guide me or send it to me. I would be very thankful. Thanks.

+3  A: 

If you want message digests (cryptographic hashes), use the hashlib module from the standard library. Here's an example (IPython session):

 In [1]: import hashlib

 In [2]: md = hashlib.sha256(open('/tmp/Calendar.xls', 'rb').read())

 In [3]: md.hexdigest()
 Out[3]: '8517f1eae176f1a20de78d879f81f23de503cfd6b8e4be1d798fb2342934b187'
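
A digest like the one above matches only when two files are byte-for-byte identical, so it answers "same file or not" rather than "how similar". For comparing text, one common approach is to hash overlapping word sequences (shingles) and compare the resulting sets. The following is only a minimal sketch of that idea, not something taken from the papers on the subject: the function names, the shingle size `k=3`, and the choice of truncating SHA-256 to 64 bits are all assumptions made for illustration.

```python
import hashlib
import re


def shingle_fingerprints(text, k=3):
    """Return a set of 64-bit integers, one per k-word shingle of the text."""
    # Normalize: lowercase and keep only runs of word characters.
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    # A text shorter than k words yields a single shingle of all its words.
    count = max(len(words) - k + 1, 1)
    prints = set()
    for i in range(count):
        shingle = " ".join(words[i:i + k])
        digest = hashlib.sha256(shingle.encode("utf-8")).digest()
        # Keep the first 8 bytes of the digest as a 64-bit fingerprint.
        prints.add(int.from_bytes(digest[:8], "big"))
    return prints


def similarity(text_a, text_b, k=3):
    """Jaccard similarity of the two fingerprint sets, from 0.0 to 1.0."""
    a = shingle_fingerprints(text_a, k)
    b = shingle_fingerprints(text_b, k)
    if not a and not b:
        return 1.0  # two empty documents are treated as identical
    return len(a & b) / len(a | b)
```

Calling `similarity(text_a, text_b)` on the contents of two documents gives a score near 1.0 for near-duplicates and near 0.0 for unrelated text. Larger values of `k` make the comparison stricter, and for long documents you would normally keep only a sample of the fingerprints rather than all of them.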
Cristian Ciupitu
+3  A: 

You might try the following papers to get started with the concept of fingerprinting:

Hank Gay