We are using the following code to compute MD5 hashes of files:

 Private _MD5Hash As String
 Dim _BinaryData As Byte() = New Byte(FileUpload1.PostedFile.InputStream.Length) {}
 FileUpload1.PostedFile.InputStream.Read(_BinaryData, 0, _BinaryData.Length)

 Dim md5 As New System.Security.Cryptography.MD5CryptoServiceProvider
 Dim md5hash() As Byte
 md5hash = md5.ComputeHash(Me._BinaryData)
 Me._MD5Hash = ByteArrayToString(md5hash)

  Private Function ByteArrayToString(ByVal arrInput() As Byte) As String
     Dim sb As New System.Text.StringBuilder(arrInput.Length * 2)
     For i As Integer = 0 To arrInput.Length - 1
       sb.Append(arrInput(i).ToString("X2"))
     Next
     Return sb.ToString().ToLower
  End Function

We are getting different hashes depending on the create and modify dates of the file. We store the hash and the binary file in a SQL database. This works fine when we upload the same instance of a file. But when we save a new instance of the file from the database to the file system (with today's date as the create/modify date) and then compare its new hash against the MD5 stored in the database, they do not match, and the duplicate check therefore fails.

How can we compute a file hash that excludes the file attributes? Or is there a different issue here?

+1  A: 

I suspect Me._BinaryData is getting initialized with more than just the contents of the file...

Ultimately the only way the hash can change is if the byte array changes.

Another possibility is character set / encoding differences when you persist/restore the file from the DB.
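
To illustrate the point that only the bytes matter, here is a minimal sketch (the `HashDemo` module and `Md5Hex` helper are hypothetical names, not part of the original code). It writes the same content to two temp files, gives one a different modify date, and hashes both through a stream. Under those assumptions the two hashes should be identical, since file timestamps are never part of the data being hashed.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module HashDemo
    ' Hypothetical helper: hashes only the bytes read from the stream.
    Function Md5Hex(ByVal input As Stream) As String
        Using hasher As MD5 = MD5.Create()
            Dim hash As Byte() = hasher.ComputeHash(input)
            Return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant()
        End Using
    End Function

    Sub Main()
        Dim pathA As String = Path.GetTempFileName()
        Dim pathB As String = Path.GetTempFileName()
        Dim content As Byte() = New Byte() {1, 2, 3, 4, 5}
        File.WriteAllBytes(pathA, content)
        File.WriteAllBytes(pathB, content)

        ' Give the second file a different modify date.
        File.SetLastWriteTime(pathB, New DateTime(2001, 1, 1))

        Dim hashA As String
        Dim hashB As String
        Using fs As FileStream = File.OpenRead(pathA)
            hashA = Md5Hex(fs)
        End Using
        Using fs As FileStream = File.OpenRead(pathB)
            hashB = Md5Hex(fs)
        End Using

        ' Same bytes, different timestamps: the hashes match.
        Console.WriteLine(hashA = hashB)   ' prints True

        File.Delete(pathA)
        File.Delete(pathB)
    End Sub
End Module
```

If the hashes differ in practice, the byte arrays themselves must differ, so comparing their lengths is a quick first check.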

John Weldon
A: 

The answer is the ol' VB issue of array declaration. The size argument is the upper bound, not the length.

Dim _BinaryData As Byte() = New Byte(FileUpload1.PostedFile.InputStream.Length) {}

should be:

Dim _BinaryData As Byte() = New Byte(FileUpload1.PostedFile.InputStream.Length - 1) {}

Because of that, every file's byte array had an extra zero byte at the end, which changed the hash.
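
A small sketch of the effect, using made-up data instead of an uploaded file (the `UpperBoundDemo` module and `Md5Hex` helper are hypothetical names for illustration). It shows that `New Byte(n) {}` allocates indexes 0 through n, so the array has n + 1 elements, and that the trailing zero byte changes the MD5.

```vbnet
Imports System
Imports System.Security.Cryptography

Module UpperBoundDemo
    ' Hypothetical helper: lowercase hex MD5 of a byte array.
    Function Md5Hex(ByVal data As Byte()) As String
        Using hasher As MD5 = MD5.Create()
            Dim hash As Byte() = hasher.ComputeHash(data)
            Return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant()
        End Using
    End Function

    Sub Main()
        Dim source As Byte() = New Byte() {10, 20, 30}

        ' The argument is the upper bound: indexes 0..n, i.e. n + 1 elements.
        Dim tooLong As Byte() = New Byte(source.Length) {}
        Dim exact As Byte() = New Byte(source.Length - 1) {}
        Array.Copy(source, tooLong, source.Length)
        Array.Copy(source, exact, source.Length)

        Console.WriteLine(tooLong.Length)                    ' 4
        Console.WriteLine(exact.Length)                      ' 3
        Console.WriteLine(Md5Hex(exact) = Md5Hex(source))    ' True
        Console.WriteLine(Md5Hex(tooLong) = Md5Hex(source))  ' False
    End Sub
End Module
```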

Glennular