Can someone please tell me how to do this in C#?

Convert from Quoted Printable to binary, and then use UTF-8 encoding to decode that binary to text.

Here is an example of the text I need to decode:

"Commentaires: Je suis abonn=C3=A9 =C3=A0 la livraison =C3=A0 domicile depuis longtemps."

+3  A: 

OK, so your question is basically two questions in one. First, you need to be able to decode quoted-printable. I’m assuming that you have the encoded text as a string. You’ll have to run through this string with a while-loop, like below. I’ve deliberately left out the part where you turn the two hex characters into a byte; I’m sure you can figure this out for yourself :)

var i = 0;
var output = new List<byte>();
while (i < input.Length)
{
    if (input[i] == '=' && input[i+1] == '\r' && input[i+2] == '\n')
    {
        // skip this
        i += 3;
    }
    else if (input[i] == '=')
    {
        byte b = (construct the byte from the characters input[i+1] and input[i+2]);
        output.Add(b);
        i += 3;
    }
    else
    {
        output.Add((byte)input[i]);
        i++;
    }
}

At the end of this, output contains the raw bytes. Now all you need to do is decode them as UTF-8:

var outputString = Encoding.UTF8.GetString(output.ToArray());
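
For reference, here is one way the loop above could be completed. Treat it as a minimal sketch rather than the answerer's own code. It assumes soft line breaks are written as "=\r\n" and that everything outside the =XX escapes is plain ASCII, and it throws a FormatException on a truncated or non-hex escape such as "=A". The names QuotedPrintable and DecodeUtf8 are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper class; not part of the .NET framework or the original answer.
public static class QuotedPrintable
{
    public static string DecodeUtf8(string input)
    {
        var output = new List<byte>();
        var i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c != '=')
            {
                // Literal character: assumed to be ASCII, so one char maps to one byte.
                if (c > 127)
                    throw new FormatException("Non-ASCII character at index " + i);
                output.Add((byte)c);
                i++;
            }
            else if (i + 2 < input.Length && input[i + 1] == '\r' && input[i + 2] == '\n')
            {
                // Soft line break: "=" followed by CRLF produces no output.
                i += 3;
            }
            else if (i + 2 < input.Length &&
                     byte.TryParse(input.Substring(i + 1, 2),
                                   NumberStyles.AllowHexSpecifier,
                                   CultureInfo.InvariantCulture,
                                   out byte b))
            {
                // "=C3" becomes the single byte 0xC3.
                output.Add(b);
                i += 3;
            }
            else
            {
                // Fewer than two characters after '=', or they are not hex digits.
                throw new FormatException("Invalid escape at index " + i);
            }
        }

        // Interpret the collected raw bytes as UTF-8 text.
        return Encoding.UTF8.GetString(output.ToArray());
    }
}
```

With this sketch, QuotedPrintable.DecodeUtf8("abonn=C3=A9 =C3=A0 la livraison") should return "abonné à la livraison". The bounds checks (i + 2 < input.Length) are what keep input like "=A" from reading past the end of the string.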

If you have any questions, please ask in a comment. And remember: don’t copy and use code that you don’t understand :)

Timwi
By the way, the code I posted will crash when given invalid input like `"=A"`. Can you figure out why, and how to fix it?
Timwi
Sorry, I just don't get how to convert the equals C3 and equals A9 to the é character (latin small letter e with acute), nor the equals C3 and equals A0 to the à character (latin small letter a with grave).
You don’t have to. That’s what `Encoding.UTF8` does for you. All you have to do is convert the characters 'C' '3' into the byte 0xC3 (195) etc.
Timwi
Oh!! I understand now. Got it. Thank you very much!
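
To make the point from the comments concrete, here is a small sketch showing that the two hex pairs only need to become two bytes, and that Encoding.UTF8 then combines them into one character. The class and method names (Utf8PairDemo, DecodeHexPairs) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical demo class; the names are assumptions, not from the thread.
public static class Utf8PairDemo
{
    public static string DecodeHexPairs(params string[] hexPairs)
    {
        var bytes = new byte[hexPairs.Length];
        for (var k = 0; k < hexPairs.Length; k++)
        {
            // "C3" parsed as base 16 gives the byte 0xC3 (195).
            bytes[k] = Convert.ToByte(hexPairs[k], 16);
        }

        // UTF-8 decoding merges the multi-byte sequence into a single character.
        return Encoding.UTF8.GetString(bytes);
    }
}
```

Utf8PairDemo.DecodeHexPairs("C3", "A9") should give "é", and DecodeHexPairs("C3", "A0") should give "à". No per-character lookup table is needed.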