As part of a Java-based web app, I'm going to be accepting uploaded .xls & .csv (and possibly other types of) files. Each file will be uniquely renamed with a combination of parameters and a timestamp.
I'd like to be able to identify any duplicate files. By duplicate I mean the exact same file contents, regardless of the name. Ideally, I'd like to detect duplicates as quickly as possible after the upload, so that the server can include this info in the response (assuming the processing time doesn't grow with file size enough to cause a noticeable lag).
I've read about running MD5 on the files and storing the result as a unique key, etc., but I have a suspicion that there's a much better way. (Is there a better way?)
Any advice on how best to approach this is appreciated.
Thanks.
UPDATE: I have nothing at all against using MD5. I've used it a few times in the past with Perl (Digest::MD5). I thought that in the Java world, another (better) solution might have emerged. But it looks like I was mistaken.
Thank you all for the answers and comments. I'm feeling pretty good about using MD5 now.
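For reference, here's a minimal sketch of what I have in mind, using the standard java.security.MessageDigest API to hash an uploaded file's bytes (the file path in main is just a placeholder):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileHasher {

    // Computes the MD5 hash of a file's contents as a lowercase hex string.
    // The hash depends only on the bytes, so two uploads with different names
    // but identical contents produce the same value.
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder path for an uploaded file; the hex string would be stored
        // as a unique key and checked against previously uploaded files.
        System.out.println(md5Hex(Paths.get("uploads/report-20240101.xls")));
    }
}
```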