+2  A: 

I would think that would depend on the character set/encoding you had defined for the XML file.

phoebus
I've used UTF-8, but that's more out of habit than from thinking about which one to use. It's a resource file for ASP.NET.
Michel
And is the file actually encoded as utf-8 or did you just slap that bit into the ?xml tag at the start? Check in Visual Studio with the File->Advanced save options dialog.
Lasse V. Karlsen
Thanks Lasse, you've helped
Michel
+3  A: 

I am pretty sure that this is an encoding problem. You need to check that the encoding of your file is indeed something internationalised, like UTF-8, and that the xml header indicates this.

The xml file should start with <?xml version="1.0" encoding="UTF-8"?>
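To check whether the bytes on disk really match what the header claims, something like the following rough sketch could help. The function names `declared_encoding` and `bytes_match_declaration` are made up for illustration, and the regex only copes with ASCII-compatible encodings (it will not find the declaration in a UTF-16 file). Also note that a single-byte encoding such as ISO-8859-1 accepts any byte sequence, so the check is mainly useful for catching files that claim UTF-8 but were saved as something else.

```python
import re


def declared_encoding(data: bytes) -> str:
    """Return the encoding named in the XML declaration, or UTF-8 if none is given."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', data)
    return match.group(1).decode("ascii") if match else "UTF-8"


def bytes_match_declaration(data: bytes) -> bool:
    """True if the raw bytes can be decoded with the declared encoding."""
    try:
        data.decode(declared_encoding(data))
        return True
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: bytes are not valid in the declared encoding.
        # LookupError: Python does not know the declared encoding name.
        return False


xml_text = '<?xml version="1.0" encoding="UTF-8"?><test>ä</test>'
saved_as_utf8 = xml_text.encode("utf-8")
saved_as_ansi = xml_text.encode("cp1252")  # declared UTF-8, saved as Windows-1252

print(bytes_match_declaration(saved_as_utf8))  # True
print(bytes_match_declaration(saved_as_ansi))  # False
```

For a real file you would pass in the result of `open(path, "rb").read()`.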

UberAlex
it does start that way
Michel
UTF-8 is the default encoding for XML and thus doesn't need to be specified.
Joey
Michel: Then check with your favorite text editor whether the actual encoding of the file matches that.
Joey
@Johannes: UTF8 being the default encoding is not entirely true: http://www.opentag.com/xfaq_enc.htm
soulmerge
Ok, correction: The default encoding for an XML file created by someone without any clue of character sets is *most likely* UTF-8. That being said, the default behavior does make sense and I highly doubt anyone would try using UTF-16 or -32 and mistake it for ISO-8859-*.
Joey
Thanks all for getting me on the right track. The problem was indeed the encoding. The file was emailed to me and saved to disk from MS Outlook, which saved it as ASCII (I don't know whether it had another encoding on the sender's side). When I saved it as UTF-8 (instead of only declaring it as UTF-8) it worked OK.
Michel
@Johannes you're right, but I have had experience in the past with situations where MSFT apps like explorer approach ambiguous encoding situations with an encoding other than utf-8.
UberAlex
Alex: Explorer doesn't care about encodings at all. After all, it is just there to show you a view of your file system. But yes, the default assumption on Windows is either the legacy codepage or UTF-16, both of which are pretty distinct.
Joey
+1  A: 

This is an encoding issue. If the encoding of the file is provided in the xml, it should be recognized correctly. If your file is latin1, for example, the xml must start with this line:

<?xml version="1.0" encoding="ISO-8859-1"?>

You can omit the encoding attribute, but determining the default encoding of the XML can then be a bit tricky.
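A small sketch of the difference the declaration makes, using Python's standard `xml.etree.ElementTree` as one example parser (other parsers may report the error differently). The sample bytes are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

# The same Latin-1 bytes, once with and once without an encoding declaration.
latin1_bytes = "ä ü å".encode("iso-8859-1")
with_declaration = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><test>' + latin1_bytes + b"</test>"
)
without_declaration = b"<test>" + latin1_bytes + b"</test>"

# With the declaration, the parser decodes the bytes as Latin-1.
text = ET.fromstring(with_declaration).text
print(text)

# Without it, the parser assumes UTF-8 and rejects the stray bytes.
try:
    ET.fromstring(without_declaration)
    parsed_without_declaration = True
except ET.ParseError:
    parsed_without_declaration = False
print(parsed_without_declaration)
```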

soulmerge
Seems you actually don't have to declare UTF-8. "parsed entities which are stored in an encoding other than UTF-8 or UTF-16 MUST begin with a text declaration" - http://www.w3.org/TR/xml/#charencoding
Jonas Elfström
@Jonas: Thanks, changed the example to latin1.
soulmerge
"You can omit this line": Note that from XML 1.1 the XML declaration must be present.
0xA3
@divo: Thanks, updated answer
soulmerge
+2  A: 

My guess is that your text is encoded in ISO-8859-1, since that is commonly used in Sweden.

Try adding:

<?xml version='1.0' encoding='ISO-8859-1'?>

I would consider converting the text to UTF-8.
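If you want to do the conversion in a script rather than in an editor, a minimal sketch could look like this. It assumes the source really is ISO-8859-1; the function name `reencode_to_utf8` and the file names are placeholders. It only re-encodes the bytes, so an existing `encoding="ISO-8859-1"` declaration in the file would still have to be changed (or removed) by hand.

```python
from pathlib import Path


def reencode_to_utf8(src: Path, dst: Path, source_encoding: str = "iso-8859-1") -> None:
    """Read src in source_encoding and write the same text to dst as UTF-8."""
    # Work on bytes so that line endings are left exactly as they were.
    text = src.read_bytes().decode(source_encoding)
    dst.write_bytes(text.encode("utf-8"))


# Hypothetical usage:
# reencode_to_utf8(Path("resources.xml"), Path("resources.utf8.xml"))
```".encode("iso-8859-1"))
    reencode_to_utf8(src, dst)
    assert dst.read_bytes() == "<test>\u00e5\r\n</test>".encode("utf-8")
</test>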

Jonas Elfström
+1  A: 

You can always use numeric character references like this:

<test>
&#228;
&#252;
&#229;
</test>

to get:

<test>
ä
ü
å
</test>

Maybe not exactly what you want, but a nice workaround. You can use sites like utf8-chartable.de to look up the needed value.
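Instead of looking up each value by hand, the references can also be generated. As a quick sketch in Python, the built-in `xmlcharrefreplace` error handler replaces every non-ASCII character with its numeric reference. The helper name `to_char_refs` is made up here, and it does not escape `&` or `<`, so it is only meant for text that is otherwise XML-safe.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_char_refs("ä ü å"))  # &#228; &#252; &#229;
```

The result is pure ASCII, so it survives being saved in almost any encoding.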

Tim Büthe
Yep, this will take care of any encoding issues.
carillonator
A: 

Make sure that you actually save the file using the encoding that is specified in the XML.

Notepad, for example, saves files as ANSI by default rather than UTF-8. Use the "Save As..." option so that you can specify the encoding.

I saved your XML as a UTF-8 file, and that shows up just fine in IE.

Guffa