views:

6477

answers:

4

Hi

See subject; note that this question only applies to the .NET Compact Framework. This happens on the emulators that ship with the Windows Mobile 6 Professional SDK as well as on my English HTC Touch Pro (all .NET CF 3.5). ISO-8859-1 stands for Western European (ISO), which is probably the most important encoding besides US-ASCII (at least going by the number of Usenet posts).

I'm having a hard time understanding why this encoding is not supported, while the following ones are (again, on both the emulators and my HTC):

  • iso-8859-2 (Central European (ISO))
  • iso-8859-3 (Latin 3 (ISO))
  • iso-8859-4 (Baltic (ISO))
  • iso-8859-5 (Cyrillic (ISO))
  • iso-8859-7 (Greek (ISO))

So, is support for say Greek more important than support for German, French and Spanish? Can anyone shed some light on this?

Thanks!

Andreas

+2  A: 

This MSDN article says:

The .NET Compact Framework supports character encoding on all devices: Unicode (BE and LE), UTF8, UTF7, and ASCII.

There is limited support for code page encoding and only if the encoding is recognized by the operating system of the device.

The .NET Compact Framework throws a PlatformNotSupportedException if a required encoding is not available on the device.

I believe all (or at least many) of the ISO encodings are code-page encodings and fall under the "limited support" rule. UTF8 is probably your best bet as a replacement.
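In practice, then, one option is to probe for the encoding at startup and fall back to UTF-8 when the device doesn't supply it. A minimal sketch (the class and method names here are illustrative, not part of the framework):

```csharp
using System;
using System.Text;

static class EncodingProbe
{
    // Try the requested code-page encoding; fall back to UTF-8 when the
    // device OS does not supply it. Per the MSDN article quoted above,
    // the .NET CF raises PlatformNotSupportedException in that case;
    // an unrecognized name raises ArgumentException.
    public static Encoding GetEncodingOrUtf8(string name)
    {
        try
        {
            return Encoding.GetEncoding(name);
        }
        catch (PlatformNotSupportedException)
        {
            return Encoding.UTF8;
        }
        catch (ArgumentException)
        {
            return Encoding.UTF8;
        }
    }
}
```

Of course, this only helps if you control both ends of the data; falling back silently on data that really is Latin-1 would just mis-decode it.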

Otis
Hi. It seems none of the linked pages contain information that could answer the question "Why is ISO-8859-7 supported while ISO-8859-1 isn't?"
Andreas Huber
I thought the "limited support for code page encoding" part wrapped it up rather succinctly, myself. I believe you're stuck with using UTF8.
Otis
+2  A: 

I would try using "windows-1252" as the encoding string. According to Wikipedia, Windows-1252 is a superset of ISO-8859-1.

System.Text.Encoding.GetEncoding(1252)
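For example, a byte buffer produced as ISO-8859-1 could be decoded through the Windows-1252 superset like this (sketch; the two encodings agree on every code point ISO-8859-1 defines, differing only in the 0x80–0x9F range):

```csharp
using System.Text;

// "Hallö" as ISO-8859-1 bytes: ö is the single byte 0xF6.
byte[] raw = { 0x48, 0x61, 0x6C, 0x6C, 0xF6 };

// Decode via code page 1252 (Windows-1252), which maps 0xF6 to ö as well.
string text = Encoding.GetEncoding(1252).GetString(raw, 0, raw.Length);
```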
splattne
Hi. This approach is very cumbersome in some situations, e.g. when you have an XML stream with an ISO-8859-1 encoding. You'd need to replace the encoding "iso-8859-1" with "windows-1252" before feeding it to the XML reader.
Andreas Huber
A: 

Have you tried uppercasing the character set name? The official IANA registration doesn't include the lowercase name that you provided (which doesn't explain why it accepts lowercased versions of the other ISO-8859 variants):

Name: ISO_8859-1:1987                                    [RFC1345,KXS2]
MIBenum: 4
Source: ECMA registry
Alias: iso-ir-100
Alias: ISO_8859-1
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1
Alias: l1
Alias: IBM819
Alias: CP819
Alias: csISOLatin1
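A quick way to find out which of these spellings a given runtime or device accepts is to probe the registered aliases one by one (sketch only; on the Compact Framework the failure may surface as PlatformNotSupportedException rather than ArgumentException):

```csharp
using System;
using System.Text;

// IANA-registered names and aliases for ISO-8859-1, as listed above.
string[] aliases =
{
    "ISO_8859-1:1987", "iso-ir-100", "ISO_8859-1", "ISO-8859-1",
    "latin1", "l1", "IBM819", "CP819", "csISOLatin1"
};

foreach (string alias in aliases)
{
    try
    {
        Encoding e = Encoding.GetEncoding(alias);
        Console.WriteLine(alias + " -> " + e.WebName);
    }
    catch (Exception ex) // ArgumentException or PlatformNotSupportedException
    {
        Console.WriteLine(alias + " -> not supported (" + ex.GetType().Name + ")");
    }
}
```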
kdgregory
A: 

It is odd that 8859-1 isn't supported, but that said, UTF-8 can represent all of the 8859-1 characters (and more), so is there a reason you can't just use UTF-8 instead? That's what we do internally, and I dealt with almost this same issue today. The plus side of using UTF-8 is that you get support for Far East and Cyrillic languages without making modifications and without adding weight to the Western languages.

ctacke
I don't see how UTF-8 "contains" ISO-8859-1. In the former, only pure ASCII characters can be encoded in one byte; in the latter, characters like Ä can also be encoded in one byte. So if you have a file encoded in ISO-8859-1, you cannot correctly read it with the UTF-8 encoding.
Andreas Huber
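Andreas's point is easy to verify where both encodings are available (e.g. on the desktop framework): Ä occupies one byte in ISO-8859-1 but two in UTF-8, so the two byte streams are not interchangeable.

```csharp
using System.Text;

string s = "Ä"; // U+00C4

// ISO-8859-1 encodes Ä as the single byte 0xC4.
int latin1Bytes = Encoding.GetEncoding("iso-8859-1").GetByteCount(s);

// UTF-8 encodes Ä as the two bytes 0xC3 0x84.
int utf8Bytes = Encoding.UTF8.GetByteCount(s);
```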
Ah, I see: you have text that is already encoded. I was thinking that you simply needed a mechanism for encoding all of these characters.
ctacke
Looks like you'll be manually converting the files, then. Good luck.
Robert C. Barth