I am using the following code to encode a Unicode string and hash it. It gives me different values of MD5EncryptedString when DataToEncrypt is 'abc' and when it is 'ABC'.

 String DataToEncrypt = "abc";
 String MD5EncryptedString = String.Empty;
 MD5 md5 = new MD5CryptoServiceProvider();
 Byte[] encodedBytes = ASCIIEncoding.Default.GetBytes(DataToEncrypt);
 // Byte[] encodedBytes = UTF8Encoding.Default.GetBytes(DataToEncrypt);
 encodedBytes = md5.ComputeHash(encodedBytes);
 MD5EncryptedString = BitConverter.ToString(encodedBytes);
 return MD5EncryptedString;

Is there a class I can use instead of ASCIIEncoding that is case-insensitive, so that MD5EncryptedString has the same value whether DataToEncrypt is 'abc' or 'ABC'?

+1  A: 

Your current code isn't using either ASCIIEncoding or UTF8Encoding... it's using the default encoding on the system, because it's equivalent to just Encoding.Default. Accessing that static property via the two subclasses makes no difference.

To use ASCII or UTF-8, use Encoding.ASCII or Encoding.UTF8.
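
As a rough illustration of the difference, the sketch below checks that the subclass spelling resolves to the same encoding as Encoding.Default, and then encodes one non-ASCII character with Encoding.ASCII and Encoding.UTF8. The class and method names (EncodingDefaultDemo, AsciiBytes, Utf8Bytes) are made up for this example and do not come from the question.

```csharp
using System;
using System.Text;

public static class EncodingDefaultDemo
{
    // ASCIIEncoding.Default and UTF8Encoding.Default are just the inherited
    // static property Encoding.Default, so all three name the same encoding.
    public static bool SubclassDefaultIsEncodingDefault()
    {
        return ASCIIEncoding.Default.Equals(Encoding.Default)
            && UTF8Encoding.Default.Equals(Encoding.Default);
    }

    // Explicit ASCII: characters outside the ASCII range are replaced by '?'.
    public static byte[] AsciiBytes(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    // Explicit UTF-8: non-ASCII characters become multi-byte sequences.
    public static byte[] Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        Console.WriteLine(SubclassDefaultIsEncodingDefault());          // True
        Console.WriteLine(BitConverter.ToString(AsciiBytes("\u00E9"))); // 3F
        Console.WriteLine(BitConverter.ToString(Utf8Bytes("\u00E9")));  // C3-A9
    }
}
```

Which encoding Encoding.Default actually is depends on the runtime and the machine, which is why naming the encoding explicitly is the safer choice when the bytes feed a hash.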

Now, regarding case-sensitivity... no, that's not the job of encodings. Your question is quite confusing as it claims it is giving you the same result for "abc" and "ABC", which I seriously doubt. I assume you mean you want it to give you the same result, but it currently doesn't.

I suggest you use something like this, if you want case insensitivity:

string lower = DataToEncrypt.ToLowerInvariant();
byte[] data = Encoding.UTF8.GetBytes(lower);
byte[] hash = md5.ComputeHash(data);
return BitConverter.ToString(hash);

Note that this gives case-insensitivity in a culture-insensitive way... which may not be ideal in all cases, but is at least consistent regardless of which culture you're using.
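
For completeness, here is one way the same idea might look as a self-contained program. It is only a sketch: the class and method names (CaseInsensitiveHash, Md5OfLowered) are invented for this example, and MD5.Create() is used in place of the question's MD5CryptoServiceProvider.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class CaseInsensitiveHash
{
    // Normalizes case first, then encodes and hashes, so that inputs
    // differing only in case produce the same MD5 string.
    public static string Md5OfLowered(string input)
    {
        // Culture-insensitive lowering: the result does not depend on
        // the current thread culture.
        string lower = input.ToLowerInvariant();
        byte[] data = Encoding.UTF8.GetBytes(lower);

        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            // Hex pairs separated by dashes, same format as in the question.
            return BitConverter.ToString(hash);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Md5OfLowered("abc"));
        Console.WriteLine(Md5OfLowered("ABC"));
        Console.WriteLine(Md5OfLowered("abc") == Md5OfLowered("ABC")); // True
    }
}
```

Both calls hash the bytes of "abc", so the two printed strings are identical.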

Jon Skeet
A: 

All character encodings encode upper and lower case letters using different bytes, so there is no way to get an encoding that will do that for you.
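
A small sketch makes this visible by printing the encoded bytes of both spellings. The helper name Hex and the class name CaseBytesDemo are chosen only for this example.

```csharp
using System;
using System.Text;

public static class CaseBytesDemo
{
    // Returns the encoded bytes of the text as dash-separated hex pairs.
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        Console.WriteLine(Hex(Encoding.ASCII, "abc"));   // 61-62-63
        Console.WriteLine(Hex(Encoding.ASCII, "ABC"));   // 41-42-43
        Console.WriteLine(Hex(Encoding.UTF8, "ABC"));    // 41-42-43
        Console.WriteLine(Hex(Encoding.Unicode, "a"));   // 61-00 (UTF-16LE)
        Console.WriteLine(Hex(Encoding.Unicode, "A"));   // 41-00 (UTF-16LE)
    }
}
```

Since the byte sequences differ, any hash computed over them differs too.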

You can always upper/lower case the string before hashing.

Oded
A: 

Translating characters to bytes will ALWAYS give you a different result for uppercase and lowercase, because the two forms of a letter are represented by different byte values in the code page. That's true for any character encoding, whether it's ASCII, Unicode, etc.

To get a case-insensitive hash, always call ToUpper() on the string (or ToUpperInvariant(), which does not depend on the current culture) before encoding it into bytes and hashing it.

KeithS