In my application, I save the content of URLs into a specific database table. To minimize duplication, I want to compute a checksum for each piece of content. What is the best SQL Server data type for storing checksums, and what is the fastest way to compute checksums for the contents (HTML) of URLs?
views: 53
answers: 2
+2
A:
SHA1 can be used to calculate the checksum. The result is a byte array (20 bytes for SHA1), which can be stored in SQL either as a hex string or in a binary (blob) field. For practical reasons, I think a string would be more convenient.
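As a rough illustration of the idea above, here is a minimal Python sketch that computes a SHA1 checksum for page content and shows both storage forms (raw bytes and hex string). The function names `content_checksum` and `is_duplicate`, the UTF-8 encoding choice, and the in-memory `seen` set (standing in for a lookup against the database table) are assumptions made for the example, not part of the original answer.

```python
import hashlib

def content_checksum(html: str) -> bytes:
    """Return the 20-byte SHA1 digest of the content (assumes UTF-8 encoding)."""
    return hashlib.sha1(html.encode("utf-8")).digest()

# Hypothetical stand-in for a lookup against a checksum column in the table.
seen = set()

def is_duplicate(html: str) -> bool:
    """Return True if content with the same checksum was already recorded."""
    digest = content_checksum(html)
    if digest in seen:
        return True
    seen.add(digest)
    return False

digest = content_checksum("<html><body>hello</body></html>")
hex_digest = digest.hex().upper()  # string form, suitable for a 40-character column

print(len(digest))      # 20 (raw bytes, suitable for a 20-byte binary column)
print(len(hex_digest))  # 40 (hex characters)
```

The raw form takes half the space of the hex string, while the hex string is easier to read and compare by eye when inspecting rows manually.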
Darin Dimitrov
2010-09-11 15:13:13
+1
A:
You can use SQL Server's built-in HashBytes function to compute any of these: MD2, MD4, MD5, SHA, or SHA1.
Examples:
SELECT HashBytes('MD5', 'http://www.cnn.com')
That returns a varbinary value: 0xC50252F4F24784B5D368926DF781EDE9
SELECT CONVERT(VARCHAR(32),HashBytes('MD5', 'http://www.cnn.com'),2)
That returns a varchar value: C50252F4F24784B5D368926DF781EDE9
Now all you have to do is pick whether you want varchar or varbinary and use that type for your column.
See Generating a MD2, MD4, MD5, SHA, or SHA1 hash by using HashBytes
SQLMenace
2010-09-11 15:27:11
OK, this is a good approach, but there is a limitation: the maximum length of the input is 8000 bytes.
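One possible way around that limit is to compute the checksum in the application before the row is inserted, since application-side hash libraries generally accept input of any length. The sketch below assumes a Python client and feeds the content to SHA1 incrementally. The function name `checksum_from_chunks` and the idea of receiving the page as a sequence of byte chunks (for example, as it is downloaded) are illustrative assumptions.

```python
import hashlib
from typing import Iterable

def checksum_from_chunks(chunks: Iterable[bytes]) -> str:
    """Hash content piece by piece and return an uppercase hex SHA1 string."""
    hasher = hashlib.sha1()
    for chunk in chunks:
        hasher.update(chunk)  # incremental update; no total size limit
    return hasher.hexdigest().upper()

# Simulated page body larger than 8000 bytes, split into 4096-byte pieces.
body = b"<p>some html</p>" * 2000
pieces = [body[i:i + 4096] for i in range(0, len(body), 4096)]

checksum = checksum_from_chunks(pieces)
print(len(body) > 8000)  # True
print(len(checksum))     # 40
```

Hashing chunk by chunk gives the same result as hashing the whole content at once, so the value can be stored in the same varchar or varbinary column discussed above.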
Sadegh
2010-09-11 16:52:23