views:

49

answers:

1

I've just stumbled over another question in which someone suggested using new ASCIIEncoding().GetBytes(someString) to convert a string to bytes. It was obvious to me that this shouldn't work for non-ASCII characters, but as it turns out, ASCIIEncoding happily replaces invalid characters with '?'. I'm very confused about this because it breaks the principle of least surprise. In Python, the equivalent would be u"some unicode string".encode("ascii"), and the conversion is strict by default, so non-ASCII characters would raise an exception in this example.
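
Here's a minimal example of what I mean (assuming a using System.Text; directive; the string literal is just an illustration):

byte[] bytes = new ASCIIEncoding().GetBytes("h\u00e9llo");
// The 'é' (U+00E9) is silently replaced with '?' (0x3F) instead of causing an error.
Console.WriteLine(BitConverter.ToString(bytes));   // prints "68-3F-6C-6C-6F"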

Two questions:

  1. How can strings be strictly converted to another encoding (like ASCII or Windows-1252), so that an exception is thrown if invalid characters occur? By the way, I don't want a foreach loop that converts each Unicode code point to a byte and then checks the 8th bit. This is supposed to be done by a great framework like .NET (or Python ^^).
  2. Any ideas on the rationale behind this default behavior? To me, it makes more sense to do strict conversions by default, or at least to offer a parameter for this purpose (Python allows "replace", "ignore", "strict").
+7  A: 

.NET offers the option of throwing an exception if the encoding conversion fails. You'll need to use the EncoderExceptionFallback class (which throws an EncoderFallbackException if an input character cannot be converted to an encoded output byte sequence) when creating the encoding. The following code is from the documentation for that class:

Encoding ae = Encoding.GetEncoding(
              "us-ascii",
              new EncoderExceptionFallback(), 
              new DecoderExceptionFallback());

Then use that encoding to perform the conversion:

// The input string consists of the Unicode characters LEFT-POINTING 
// DOUBLE ANGLE QUOTATION MARK (U+00AB), 'X' (U+0058), and RIGHT-POINTING 
// DOUBLE ANGLE QUOTATION MARK (U+00BB). 
// The encoding can only encode characters in the US-ASCII range of U+0000 
// through U+007F. Consequently, the characters bracketing the 'X' character
// cause an exception.

string inputString = "\u00abX\u00bb";
byte[] encodedBytes = new byte[ae.GetMaxByteCount(inputString.Length)];
int numberOfEncodedBytes = 0;
try
{
    numberOfEncodedBytes = ae.GetBytes(inputString, 0, inputString.Length, 
                                       encodedBytes, 0);
}
catch (EncoderFallbackException)
{
    Console.WriteLine("bad conversion");
}
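
If you don't need to manage the output buffer yourself, the string overload of GetBytes also works (a quick sketch, not part of the MSDN sample); it returns an exactly-sized array and throws the same exception:

try
{
    // Still throws because of the EncoderExceptionFallback configured above.
    byte[] encodedInput = ae.GetBytes(inputString);
}
catch (EncoderFallbackException)
{
    Console.WriteLine("bad conversion");
}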

This MSDN page, "Character Encoding in the .NET Framework", discusses to some degree the rationale behind the default conversion behavior. In summary, they didn't want to disturb legacy applications that depend on this behavior. They do recommend overriding the default, though.
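
As a rough sketch of that recommendation: the predefined instances such as Encoding.ASCII are read-only, so clone one before assigning the fallback properties.

// Clone() returns a writable copy of the read-only built-in instance,
// so its fallbacks can be swapped for the exception-throwing ones.
Encoding strictAscii = (Encoding)Encoding.ASCII.Clone();
strictAscii.EncoderFallback = new EncoderExceptionFallback();
strictAscii.DecoderFallback = new DecoderExceptionFallback();
// strictAscii.GetBytes(inputString) now throws an EncoderFallbackException.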

Michael Petrotta
Great explanation. I had seen the sentence "You might want to consider having your application set EncoderFallback or DecoderFallback to EncoderExceptionFallback or DecoderExceptionFallback to prevent sequences with the 8th bit set." in the documentation, but it wasn't obvious to me that it could be used for strict conversions.
AndiDog