ansaurus

Question

Answer 1

+1 A:

If you send UTF-8 encoded data that they treat as ISO-8859-1, that could be the source of your problem. I suggest you either send the data in ISO-8859-1 or tell Suomen Verkkomaksut that you're sending UTF-8. In an HTTP-based protocol you do this by adding charset=utf-8 to the Content-Type HTTP header.

A way to rule out some issues would be to try a prehash string that only contains characters that are encoded the same in UTF-8 and ISO-8859-1. From what I can see, you can achieve this by removing all "ä" characters from the string you've used.
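To make that check concrete, here is a minimal sketch that compares the bytes a string produces under both encodings. The class and method names are made up for illustration and are not part of the asker's code. A string that passes the check yields identical bytes either way, so a hash mismatch on such a string cannot be blamed on the charset.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the original question's code.
class EncodingCheck {

    // True if the string yields the same bytes in UTF-8 and ISO-8859-1.
    static boolean sameInBothEncodings(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        // Plain ASCII: identical bytes in both encodings.
        System.out.println(sameInBothEncodings("Testi"));
        // "ä" is one byte (0xE4) in ISO-8859-1 but two bytes (0xC3 0xA4) in UTF-8.
        System.out.println(sameInBothEncodings("T\u00e4sti"));
    }
}
```

Only pure ASCII strings pass this check, which is why dropping the "ä" characters is enough for the test.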

Buhb 2009-12-03 10:33:59

I already have both <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> and <?xml version="1.0" encoding="UTF-8"?> on the page. Unfortunately these don't seem to help. But you're right, maybe I should just contact them.

Ville Salonen 2009-12-03 10:36:58

Answer 2

+8 A:

You seem to misunderstand how string encoding works, and your Crypt class's API is suspect.

Strings don't really "have an encoding" - an encoding is what you use to convert between Strings and bytes.

Java Strings are internally stored as UTF-16, but that does not really matter, as MD5 works on bytes, not Strings. Your Crypt.md5sum() method has to convert the Strings it's passed to bytes first - what encoding does it use to do that? That's probably the source of your problem.
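As an illustration, a hashing helper can take the charset as an explicit parameter so the String-to-bytes step is never left to a default. This is only a sketch: Md5WithCharset and md5Hex are hypothetical names standing in for the asker's Crypt.md5sum(), whose real implementation is not shown.

```java
import java.math.BigInteger;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in for Crypt.md5sum(); the real class is not shown in the question.
class Md5WithCharset {

    // Encodes the text with the given charset, then hashes the resulting bytes.
    static String md5Hex(String text, Charset charset) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(charset));
            // %032x keeps leading zeros, so the result is always 32 hex digits.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "T\u00e4sti"; // contains "ä"
        // Same String, different bytes, therefore different hashes.
        System.out.println(md5Hex(text, StandardCharsets.ISO_8859_1));
        System.out.println(md5Hex(text, StandardCharsets.UTF_8));
    }
}
```

With the charset in the signature, the caller has to decide which bytes get hashed, and that decision has to match whatever the receiving side does.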

Your example code is pretty nonsensical as the only effect this line has:

String prehashIso = new String(prehash.getBytes("ISO-8859-1"), "ISO-8859-1");

is to replace characters that cannot be represented in ISO-8859-1 with question marks.
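A small sketch of that round trip shows the effect. RoundTripDemo is an illustrative name, and the sample strings are invented for the demonstration.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: reproduces the round trip from the quoted line.
class RoundTripDemo {

    // Same conversion as the quoted line, using Charset constants instead of names.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "ä" exists in ISO-8859-1, so this string comes back unchanged.
        System.out.println(roundTrip("T\u00e4sti").equals("T\u00e4sti"));
        // The euro sign does not exist in ISO-8859-1, so it comes back as '?'.
        System.out.println(roundTrip("5 \u20ac"));
    }
}
```

The returned value is still an ordinary Java String, so the round trip does nothing to control which bytes are later fed to MD5.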

Michael Borgwardt 2009-12-03 10:43:55

Thanks for the clarification.

Ville Salonen 2009-12-03 11:37:30

+1 on the suspiciousness of the `Crypt` class. Its name also suggests there may be some confusion between encryption and cryptographic hashing (though there may just as well be none, depending on the rest of the class).

Romain 2009-12-03 12:10:18

Answer 3

+2 A:

Java has a standard java.security.MessageDigest class, for calculating different hashes.

Here is some sample code:

import java.security.MessageDigest;

// Exception handling not shown

String prehash = ...

final byte[] prehashBytes= prehash.getBytes( "iso-8859-1" );

System.out.println( prehash.length( ) );
System.out.println( prehashBytes.length );

final MessageDigest digester = MessageDigest.getInstance( "MD5" );

digester.update( prehashBytes );

final byte[] digest = digester.digest( );

final StringBuffer hexString = new StringBuffer();

for ( final byte b : digest ) {
    final int intByte = 0xFF & b;

    // Values below 0x10 produce a single hex digit, so pad them with a zero.
    if ( intByte < 0x10 )
    {
        hexString.append( "0" );
    }

    hexString.append(
        Integer.toHexString( intByte )
    );
}

System.out.println( hexString.toString( ).toUpperCase( ) );

Unfortunately for you, it produces the same "C83CF67455AF10913D54252737F30E21" hash, so I guess your Crypt class is exonerated. I specifically added the prehash and prehashBytes length printouts to verify that ISO-8859-1 is indeed used. In this case both are 328.

When I used prehash.getBytes( "utf-8" ) instead, it produced "9CC2E0D1D41E67BE9C2AB4AABDB6FD3" (and the length of the byte array became 332). Again, not the result you are looking for.

So, I guess Suomen Verkkomaksut does some massaging of the prehash string that they did not document, or you have overlooked.

Alexander Pogrebnyak 2009-12-03 12:06:08

Your hash function doesn't pad with zero if byte is less than 10.

BalusC 2009-12-03 12:12:59

Ah well, maybe I'll just have to wait for an answer from them. Thanks for the provided code example.

Ville Salonen 2009-12-03 12:15:07

@BalusC. You are quite right. I've corrected my example. It always beats me why Java does not have Byte.toHexString and Byte.toUpperHexString methods that do the correct thing.
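In the absence of such a method, a per-byte format string gives the zero padding for free. The following is a minimal sketch using only the standard library; HexDemo and toUpperHex are illustrative names, not an existing API.

```java
// Illustrative helper: zero-padded, uppercase hex conversion of a byte array.
class HexDemo {

    // Formats each byte as exactly two uppercase hex digits.
    static String toUpperHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // %02X pads to two digits; the mask avoids sign extension for negative bytes.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0x0E must come out as "0E", not "E".
        System.out.println(toUpperHex(new byte[] {0x0E, 0x21, (byte) 0xC8}));
    }
}
```

On Java 17 or later, java.util.HexFormat covers the same ground, for example HexFormat.of().withUpperCase().formatHex(bytes).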

Alexander Pogrebnyak 2009-12-03 12:35:05

Simply use the Hex class of Apache Commons Codec, which does exactly that. I had to rebuild a HUGE number of hashes because I used my own, broken implementation of the byte[] to String conversion.

Malax 2009-12-03 12:51:43

MD5 Hash of ISO-8859-1 string in Java
