Hi,

Is there a way to convert an XML file that is encoded in ISO-8859-1 to UTF-8?

+2  A: 

The simplest way would be to load it and then save it with one of the XML APIs available. That way any extra transformations (e.g. the XML declaration) will be handled appropriately. For example:

using System;
using System.Text;
using System.Xml.Linq;

class Test
{
    static void Main(string[] args)
    {
        // Load decodes the file using the encoding named in its XML declaration.
        XDocument doc = XDocument.Load("test.xml");
        XDeclaration declaration = doc.Declaration;
        if (declaration != null) {
            // Save picks its output encoding from the declaration.
            declaration.Encoding = "utf-8";
        }
        doc.Save("test-utf8.xml");
    }
}

Note that I think this may change whitespace such as indentation, unless you specify some extra load and save options. Is that likely to be a problem for you?
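For what it's worth, here is a minimal sketch of what "extra options" could look like, assuming the goal is to leave the existing whitespace alone. LoadOptions.PreserveWhitespace and SaveOptions.DisableFormatting are real LINQ to XML options; the class name Utf8Resave and the method Resave are hypothetical names made up for this illustration.

```csharp
using System.Xml.Linq;

static class Utf8Resave
{
    // Hypothetical helper: re-saves an XML file as UTF-8 while trying to
    // keep the original whitespace intact.
    public static void Resave(string inputPath, string outputPath)
    {
        // PreserveWhitespace keeps insignificant whitespace (indentation,
        // line breaks between elements) in the loaded tree.
        XDocument doc = XDocument.Load(inputPath, LoadOptions.PreserveWhitespace);

        if (doc.Declaration != null)
        {
            // The declaration's encoding decides what Save writes.
            doc.Declaration.Encoding = "utf-8";
        }

        // DisableFormatting stops Save from re-indenting the document.
        doc.Save(outputPath, SaveOptions.DisableFormatting);
    }
}
```

Calling Utf8Resave.Resave("test.xml", "test-utf8.xml") would mirror the example above. Other details, such as the quote style used in the declaration, may still be normalized by the writer.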

You could potentially just load the whole file as text (using Encoding.GetEncoding(28591)), modify the declaration part yourself, and save it again as UTF-8. I suspect there may be some corner cases where that would cause a problem, though.
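A rough sketch of that text-based approach, assuming the file really is ISO-8859-1 and has its declaration at the very start. The names Latin1ToUtf8, RewriteDeclaration and Convert are hypothetical, and the regular expression is only an illustration that will not cover every legal form of declaration.

```csharp
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class Latin1ToUtf8
{
    // Matches the encoding pseudo-attribute of an XML declaration at the very
    // start of the text. Group 1 is everything up to the value, group 2 is
    // the quote character.
    private static readonly Regex EncodingInDeclaration = new Regex(
        @"\A(<\?xml\b[^>]*?\bencoding\s*=\s*)([""'])[^""']*\2");

    // Hypothetical helper: swaps the declared encoding for utf-8 and leaves
    // the text unchanged if no such declaration is found.
    public static string RewriteDeclaration(string xml)
    {
        return EncodingInDeclaration.Replace(xml, "${1}${2}utf-8${2}", 1);
    }

    public static void Convert(string inputPath, string outputPath)
    {
        // Code page 28591 is ISO-8859-1.
        string text = File.ReadAllText(inputPath, Encoding.GetEncoding(28591));

        // UTF8Encoding(false) writes UTF-8 without a byte order mark.
        File.WriteAllText(outputPath, RewriteDeclaration(text), new UTF8Encoding(false));
    }
}
```

Everything outside the declaration is copied through untouched, which is the appeal of doing it this way. The cost is that nothing checks whether the input is well-formed XML or whether it was really ISO-8859-1 in the first place.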

Jon Skeet
@skeet: I tried your way, but it made no difference. The original file and the file after conversion show the same text "Sjöxxxxxxx xxxxxxxxx". My expected characters are "Sjöxxxxxxx xxxxxxxxx".
thndrkiss
@thndrkiss: That suggests the document itself is corrupt. Does it *state* that it's in ISO-8859-1 in the XML declaration?
Jon Skeet
@skeet: I have an observation. I tried to open the original XML file with TextPad. It said "the content has ascii - latin chars . .do you wish to open . ". I selected yes, and there it is displayed properly. I assume that the original file is correct. Is that right?
thndrkiss
@thndrkiss: No, that doesn't mean the original file is correct. It could mean all kinds of things... but what does the declaration at the start of the document say?
Jon Skeet
It says that the encoding is "iso-8859-1". You are right, the file is not correct. I then changed the header in Notepad so that the encoding is "utf-8". It worked. Thanks :).
thndrkiss