The following snippets are both supposed to calculate the SHA-1 sum of a file, but for the same file they produce different sums.

//snippet1
byte[] byteArr = new byte[(int) uploadedFile.getLength()];
try {
 stream = new BufferedInputStream(uploadedFile.getInputStream());
 stream.read(byteArr);
 stream.close(); 
} catch (IOException e) {
 e.printStackTrace();
}
md = MessageDigest.getInstance("SHA-1"); 
byte[] sha1hash = new byte[40];
md.update(byteArr, 0, byteArr.length);
sha1hash = md.digest();

//snippet2
md = MessageDigest.getInstance("SHA-1");
InputStream is = uploadedFile.getInputStream();
try {
 is = new DigestInputStream(is, md);
} finally {
 try {
  is.close();
 } catch (IOException e) {
  e.printStackTrace();
 }
}
sha1hash = md.digest();

Can you explain why?

+12  A: 

Both of your snippets are buggy:

  • The first snippet reads some (effectively random) number of bytes from the file and is in no way guaranteed to read the whole file (read the JavaDoc of read() for details).

  • The second snippet doesn't read anything at all from the InputStream and therefore returns the SHA-1 of the empty stream (0 bytes read).
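
To illustrate the second point, here is a minimal sketch of how the DigestInputStream approach could look once the stream is actually read to the end. The class name `StreamSha1` and the helpers `sha1` and `toHex` are made up for this example and are not from the question; in the question's code the stream would come from `uploadedFile.getInputStream()` rather than a ByteArrayInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StreamSha1 {

    // Hypothetical helper: drains the stream and returns its SHA-1 digest (20 bytes).
    public static byte[] sha1(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (DigestInputStream dis = new DigestInputStream(in, md)) {
            byte[] buffer = new byte[8192];
            // Each read() feeds the bytes it returns into md; loop until end of stream.
            while (dis.read(buffer) != -1) {
                // nothing else to do, the digest is updated as a side effect
            }
        }
        return md.digest();
    }

    // Hypothetical helper: lower-case hex encoding of a byte array.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        // prints a9993e364706816aba3e25717850c26c9cd0d89d
        System.out.println(toHex(sha1(new ByteArrayInputStream(data))));
        // Reading nothing, as in snippet 2, gives the digest of the empty input:
        // prints da39a3ee5e6b4b0d3255bfef95601890afd80709
        System.out.println(toHex(MessageDigest.getInstance("SHA-1").digest()));
    }
}
```

If snippet 2 always produces da39a3ee5e6b4b0d3255bfef95601890afd80709, that is the SHA-1 of zero bytes, which confirms that nothing was read.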

Joachim Sauer
+1 since this is correct
Malax
There's no method `read` in `BufferedInputStream` which takes a single argument. Wherever this came from, it doesn't appear to be from code which even compiled.
Vinay Sajip
Vinay: there definitely is such a method. A `BufferedInputStream` is-a `InputStream` which has that method.
Joachim Sauer
+3  A: 

You have a bug here:

 stream = new BufferedInputStream(uploadedFile.getInputStream());
 stream.read(byteArr);
 stream.close();

The read() method does not automatically fill the array that's passed into it - it will read an arbitrary number of bytes and return that number (or -1 once the end of the stream is reached). You have to loop and add up the returned byte counts until the array is filled.
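
A minimal sketch of such a loop, assuming the array was sized to the expected length beforehand. The class `ReadFully` and its method are hypothetical names for this example; the anonymous stream in `main` only exists to simulate short reads of at most 3 bytes each.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class ReadFully {

    // Hypothetical helper: keeps calling read() until target is completely filled.
    public static void readFully(InputStream in, byte[] target) throws IOException {
        int offset = 0;
        while (offset < target.length) {
            // read() may return fewer bytes than requested, so track the running total.
            int n = in.read(target, offset, target.length - offset);
            if (n == -1) {
                throw new EOFException("stream ended after " + offset + " of "
                        + target.length + " bytes");
            }
            offset += n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, world".getBytes(StandardCharsets.US_ASCII);
        // Simulated stream that never hands out more than 3 bytes per read() call.
        InputStream slow = new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
        byte[] copy = new byte[data.length];
        readFully(slow, copy);
        System.out.println(Arrays.equals(data, copy)); // prints true
    }
}
```

The standard library already offers this behaviour through `DataInputStream.readFully(byte[])`, so wrapping the stream in a DataInputStream is an alternative to writing the loop by hand.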

Almost everyone gets that wrong the first time, but it's one reason why the input stream based method is better (the other being that for large files, you definitely don't want to keep them in memory completely).

Michael Borgwardt