I'm downloading a vCard to the browser, using Response.Write to output .NET strings that contain accented characters. The MIME type is text/x-vcard, and French characters appear wrong in Outlook. For example, the .NET string Montréal;Québec shows as Montréal Québec in the browser.

Apparently the default vCard format is ASCII, while .NET strings are Unicode (UTF-16).

I'm using this vCard generator code from CodeProject.com

I've played with the System.Text.Encoding sample code at the bottom of this linked MSDN page to convert the Unicode string into bytes and then write the ASCII bytes, but then I get Montr?al Qu?bec (progress, but not a win). I've also tried setting the charset of the response to both us-ascii and utf-8.
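For illustration, the `?` substitution described above is what an ASCII encoder typically does with characters it cannot represent. This is a minimal Python sketch of that behaviour, not the original .NET code; the variable names are made up for the example.

```python
# Sample string from the question.
text = "Montréal;Québec"

# ASCII has no code point for é, so an encoder told to "replace"
# unencodable characters substitutes a question mark for each one.
ascii_bytes = text.encode("ascii", errors="replace")

print(ascii_bytes)
```

The accented characters are lost at this step, so nothing downstream can recover them from the ASCII bytes.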

If I open the downloaded vCard in Windows Notepad and save it as ANSI text (instead of default unicode format) and open in Outlook it's okay. So my assumption is I need to cause download of ANSI charset but am unsure if I'm doing it wrong or have a misunderstanding of where to start.

Update: Looking at the raw HTTP, it appears my French characters are being downloaded in the unexpected format so it looks like I need to do some work on the server side...

A: 

é is what é looks like when it's encoded as UTF-8 and mistakenly decoded as ISO-8859-1 or windows-1252 (or "ANSI", as Microsoft apps like to call it). When you open the file in Notepad, it automatically detects the encoding as UTF-8. Then you change the encoding by saving it as "ANSI", which works because é is supported by that encoding as well.
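The mix-up can be reproduced in a few lines. This is a sketch in Python rather than .NET, assuming the client decodes the bytes as windows-1252; the variable names are illustrative only.

```python
original = "Montréal;Québec"

# The server sends UTF-8: é becomes the two bytes 0xC3 0xA9.
utf8_bytes = original.encode("utf-8")

# A client that assumes windows-1252 reads each byte as its own character.
misread = utf8_bytes.decode("windows-1252")
print(misread)

# The damage is reversible as long as no byte was dropped on the way.
repaired = misread.encode("windows-1252").decode("utf-8")
print(repaired)
```

The bytes on the wire are fine in this scenario; only the decoding step on the receiving side is wrong.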

When you view the page in Outlook, what does it say the encoding is? That HTTP dump looks like well-formed UTF-8 to me, but Outlook seems to be reading it as ISO-8859-1 or windows-1252. I don't use Outlook and I don't know its quirks; are you sure you got the headers right?

Alan Moore
A: 

You don't need to convert anything! Just specify in the HTTP response headers of the text/x-vcard document that the response is UTF-8 encoded (Response.Charset or Response.ContentEncoding or similar; I'm not sure what your specific situation is).

Also, you could try emitting a UTF-8 Byte Order Mark to help the client determine the encoding.
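Roughly, the two suggestions combine like this. It is a framework-neutral Python sketch, not ASP.NET code: `build_vcard_response` is a hypothetical helper, the vCard body is a made-up sample, and whether Outlook honours either the charset parameter or the BOM is an assumption to verify.

```python
import codecs


def build_vcard_response(vcard_text):
    """Return (headers, body) for a UTF-8 vCard download (hypothetical helper)."""
    # Prefix the payload with the UTF-8 byte order mark (EF BB BF).
    body = codecs.BOM_UTF8 + vcard_text.encode("utf-8")
    headers = {
        # Declare the charset explicitly so the client need not guess.
        "Content-Type": "text/x-vcard; charset=utf-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


# Made-up sample card, only to exercise the helper.
sample = "BEGIN:VCARD\r\nADR:;;;Montréal;Québec\r\nEND:VCARD\r\n"
headers, body = build_vcard_response(sample)
print(headers["Content-Type"])
print(body[:3])
```

A reader that understands the BOM, such as Python's `utf-8-sig` codec, strips it and returns the original text unchanged.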

bzlm
Emitting UTF-8 produced two bytes for some Latin characters, and those vCards couldn't be interpreted. I think UTF-8 is compatible with ANSI but not ASCII. An ANSI format worked. I'm still looking into it.
John K
Correction: I should have said I think UTF-8 is compatible with ASCII but not ANSI.
John K
@jdk UTF-8 is compatible with ASCII because ASCII is a subset of UTF-8 - or rather, UTF-8 was designed to be backwards compatible with ASCII. When you say "couldn't be interpreted", what does that mean? By whom? And what was the error?
bzlm
The picture in the question shows two-byte sequences for some French characters (e.g. the é in Montréal) instead of one byte. vCard needs consistent one-byte characters.
John K
@jdk If you use ISO-8859-1 instead of UTF-8, it will probably work on your machine, since ISO-8859-1 contains most French characters, but only because your Windows is set to fall back to that particular 8-bit encoding when the actual encoding is unknown.
bzlm
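To make the byte counts in these comments concrete, here is a small Python sketch comparing the two encodings. The second sample word, `cœur`, is not from the thread; it was chosen because œ is a French character missing from ISO-8859-1.

```python
city = "Montréal"

utf8 = city.encode("utf-8")             # é takes two bytes
latin1 = city.encode("iso-8859-1")      # é takes one byte

print(len(city), len(utf8), len(latin1))

# ISO-8859-1 covers most French text but not all of it.
try:
    "cœur".encode("iso-8859-1")
    latin1_has_oe = True
except UnicodeEncodeError:
    latin1_has_oe = False

print(latin1_has_oe)
```

So a single-byte encoding keeps one byte per character only for the characters it contains, which is why it is a fallback rather than a general fix.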