Hi,

I am trying to MD5 a string in ActionScript using the MD5 algorithm that was created by Adobe and is part of AS3corelib. (http://as3corelib.googlecode.com/svn/trunk/src/com/adobe/crypto/MD5.as).

I am comparing this to an MD5 created in PHP that I know is correct.

If I create MD5s using AS and PHP for a string like "abcd1234", they are equal, as expected. The problem is that when my string contains hex escapes, e.g. "abcd\x28\xBF\x4E", the MD5s from ActionScript and PHP come out different.

Now the really strange part: as long as the hex escape consists only of digits, it's fine and the hashes still match.

For example:

"abcd\x28\x46" gives matching values from AS's MD5 and PHP's MD5, while "abcd\x28\xBF" yields different hashes.

Anyone have any ideas? I've tested the PHP MD5 thoroughly and I know it is correct and the ActionScript one is incorrect. I appreciate the help, thanks for reading, and I apologize if this was confusing. I'm a noob when it comes to string encoding, representation, etc. Thanks, Drew S.

A: 

Most likely, PHP and ActionScript are using different encodings for strings; one is probably using ISO-8859-1 and the other is using UTF-8.

For abcd\x28\xBF, the values are:

  • fcfebaeb81afe401c4b608dc684ad08f under ISO-8859-1
  • 47ef883a009ddbe01711ece0a0a8764e under UTF-8

And for abcd\x28\xBF\x4E (your other example), the values are:

  • ea382d63efca32d8d7861a314a6112e3 under ISO-8859-1
  • dc11cdbaa05aa41640a821fb8e290eae under UTF-8
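
To see where the difference comes from, here is a minimal sketch that writes the same string into a ByteArray with each encoding and dumps the bytes. It assumes Flash Player/AIR's flash.utils.ByteArray; toHex is a helper name I made up for illustration. Characters below \x80 (such as \x28 and \x46) encode to the same single byte either way, which is why those strings still match, while \xBF becomes two bytes under UTF-8.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: dump a ByteArray as space-separated hex bytes.
function toHex(bytes:ByteArray):String {
    var parts:Array = [];
    bytes.position = 0;
    while (bytes.bytesAvailable) {
        var b:String = bytes.readUnsignedByte().toString(16);
        parts.push(b.length < 2 ? "0" + b : b);
    }
    return parts.join(" ");
}

var s:String = "abcd\x28\xBF";

var utf8:ByteArray = new ByteArray();
utf8.writeUTFBytes(s);                  // U+00BF is written as two bytes: c2 bf

var latin1:ByteArray = new ByteArray();
latin1.writeMultiByte(s, "iso-8859-1"); // U+00BF is written as one byte: bf

trace(toHex(utf8));   // 61 62 63 64 28 c2 bf
trace(toHex(latin1)); // 61 62 63 64 28 bf
```

MD5 works on bytes, not characters, so the two buffers hash differently even though the string is the same.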
Chris Jester-Young
This was exactly it. When the string is passed to the MD5 function it is converted to a ByteArray. It was using writeUTFBytes(stringname); switching to writeMultiByte(stringname, "iso-8859-1"); fixed it. I really appreciate the help Chris.
Outclassed
Crap, now when \x00 appears in the string it is causing the writeMultiByte to stop and just end. Let me see if I can figure this one out.
Outclassed
A: 

Your second problem is due to strings being commonly defined as NUL (or zero) terminated buffers.

There's a workaround, though. ISO-8859-1 defines 256 possible characters (including the NUL char). The first 256 code points in Unicode are the same as in ISO-8859-1 (the encoded bytes may differ if you use UTF-8, UTF-16, etc., but the code points are the same regardless of how you encode them).

So, if you know that all of the codepoints in your string will be in the range 0-255 (since it's latin1) and you know it's ok to have embedded NULs, you can manually iterate over your string, get the codepoint of each char and store it as a byte in your buffer. Something like this:

var s:String = "abc\x00d\x28\xBF";
var buffer:ByteArray = new ByteArray();
var len:int = s.length;
for(var i:int = 0; i < len; i++) {
    buffer.writeByte(s.charCodeAt(i));
}

//  trace it
buffer.position = 0;
while(buffer.bytesAvailable) {
    trace("0x" + buffer.readUnsignedByte().toString(16));
}
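
If you want to feed the result straight into the hash, the loop above can be wrapped up roughly like this. This is only a sketch: latin1Bytes is a name I made up, and it assumes your copy of as3corelib exposes MD5.hashBytes(ByteArray) (older copies may only have hash(String), in which case you'd need to adapt it). It also rejects code points above 255, since writeByte() would otherwise silently keep only the low 8 bits.

```actionscript
import flash.utils.ByteArray;
import com.adobe.crypto.MD5;

// Hypothetical helper: one byte per character, embedded NULs included.
function latin1Bytes(s:String):ByteArray {
    var out:ByteArray = new ByteArray();
    for (var i:int = 0; i < s.length; i++) {
        var code:Number = s.charCodeAt(i);
        // writeByte() keeps only the low 8 bits, so reject anything
        // outside 0-255 rather than truncating it without warning.
        if (code > 0xFF) {
            throw new ArgumentError("Not a latin1 character at index " + i);
        }
        out.writeByte(code);
    }
    return out;
}

var bytes:ByteArray = latin1Bytes("abc\x00d\x28\xBF");
trace(bytes.length);         // 7, the NUL is kept as a byte
trace(MD5.hashBytes(bytes)); // assumes hashBytes(ByteArray) is available
```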
Juan Pablo Califano
Awesome, this works for the second issue. Thank you Juan, everything is good to go with the MD5 function. I've got one last encoding issue that I may need help on, which I will post about tomorrow: hex values 80-9F are being encoded as 3F, while values above and below are fine (e.g. 2E and A0), when using the FileStream.writeBytes() function.
Outclassed
Seems the third problem was another encoding issue. The program was using Windows-1252 as the encoding, which was dropping the 80-9F range (even though Wikipedia shows it supporting characters in this range). Switching to ISO 8859-1 fixes the issue.
Outclassed
Cool. Glad to see you sorted it out.
Juan Pablo Califano