ansaurus

Question

Is there a case-insensitive Unicode character encoding class?

Answer 1

+1 A:

Your current code isn't using either ASCIIEncoding or UTF8Encoding... it's using the default encoding on the system, because it's equivalent to just Encoding.Default. Accessing that static property via the two subclasses makes no difference.

To use ASCII or UTF-8, use Encoding.ASCII or Encoding.UTF8.
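A minimal sketch of the difference, assuming a console program; the class name EncodingChoiceDemo and the helper ToHex are made up for illustration and are not from the question's code. It checks that Default reached through a subclass name is the same encoding as Encoding.Default, then shows Encoding.ASCII and Encoding.UTF8 producing different bytes for a non-ASCII character.

```csharp
using System;
using System.Text;

public static class EncodingChoiceDemo
{
    // Hypothetical helper: encodes text with the given encoding and
    // formats the bytes as hex pairs separated by dashes.
    public static string ToHex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        // Default is a static property declared on Encoding, so reaching it
        // through a subclass name still yields the system default encoding.
        Console.WriteLine(ASCIIEncoding.Default.Equals(Encoding.Default));

        string text = "\u00e9"; // e with acute accent, outside the ASCII range

        Console.WriteLine(ToHex(Encoding.ASCII, text)); // 3F (replaced by '?')
        Console.WriteLine(ToHex(Encoding.UTF8, text));  // C3-A9
    }
}
```

Which encoding Encoding.Default resolves to depends on the machine and runtime, which is why naming the encoding explicitly is the safer choice before hashing.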

Now, regarding case-sensitivity... no, that's not the job of encodings. Your question is quite confusing as it claims it is giving you the same result for "abc" and "ABC", which I seriously doubt. I assume you mean you want it to give you the same result, but it currently doesn't.

I suggest you use something like this, if you want case insensitivity:

// "text" is the input string; "md5" is an existing MD5 instance (e.g. from MD5.Create())
string lower = text.ToLowerInvariant();
byte[] data = Encoding.UTF8.GetBytes(lower);
byte[] hash = md5.ComputeHash(data);
return BitConverter.ToString(hash);

Note that this gives case-insensitivity in a culture-insensitive way... which may not be ideal in all cases, but is at least consistent regardless of which culture you're using.
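For completeness, here is a self-contained version of the same idea as a rough sketch. The class name CaseInsensitiveHash and the method name Md5Hex are invented for this example, and MD5 is assumed only because the snippet above uses a variable called md5.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class CaseInsensitiveHash
{
    // Hypothetical helper: lower-cases the input in a culture-invariant way,
    // encodes it as UTF-8, and returns the MD5 hash as dash-separated hex.
    public static string Md5Hex(string text)
    {
        string lower = text.ToLowerInvariant();
        byte[] data = Encoding.UTF8.GetBytes(lower);
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            return BitConverter.ToString(hash);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Md5Hex("abc") == Md5Hex("ABC")); // True
        Console.WriteLine(Md5Hex("abc") == Md5Hex("abd")); // False
    }
}
```

The case folding happens on the string, before any encoding is involved, so the same approach should work with whichever encoding and hash algorithm you actually need.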

Jon Skeet 2010-09-29 16:58:57

Answer 2

A:

All character encodings encode upper and lower case letters using different bytes, so there is no way to get an encoding that will do that for you.

You can always upper/lower case the string before hashing.
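A quick way to see this for yourself, as a sketch only: the class name CaseBytesDemo and the helper SameBytes are hypothetical. It compares the bytes of "abc" and "ABC" under three built-in encodings, first as-is and then after lower-casing both strings.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CaseBytesDemo
{
    // Hypothetical helper: true when both strings encode to identical bytes.
    public static bool SameBytes(Encoding encoding, string a, string b)
    {
        return encoding.GetBytes(a).SequenceEqual(encoding.GetBytes(b));
    }

    public static void Main()
    {
        Encoding[] encodings = { Encoding.ASCII, Encoding.UTF8, Encoding.Unicode };
        foreach (Encoding encoding in encodings)
        {
            // "raw" compares the strings unchanged; "lowered" compares them
            // after both have been lower-cased with the invariant culture.
            Console.WriteLine("{0}: raw={1}, lowered={2}", encoding.WebName,
                SameBytes(encoding, "abc", "ABC"),
                SameBytes(encoding, "abc".ToLowerInvariant(), "ABC".ToLowerInvariant()));
        }
    }
}
```

For each of the three encodings the raw comparison is False and the lowered comparison is True.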

Oded 2010-09-29 16:59:26

Answer 3

A:

Encoding characters into bytes will ALWAYS give you a different result for uppercase and lowercase text, because the two forms of a letter are distinct characters with distinct byte values in the character set. That's true for any character encoding, whether it's ASCII, one of the Unicode encodings, etc.

To get a case-insensitive hash, always call ToUpper() (or ToUpperInvariant(), if the result must not depend on the current culture) on the string before encoding it into bytes and hashing it.

KeithS 2010-09-29 17:01:15
