To load XML files with arbitrary encoding I have the following code:

Encoding encoding;
using (var reader = new XmlTextReader(filepath))
{
    reader.MoveToContent();
    encoding = reader.Encoding;
}

var settings = new XmlReaderSettings { NameTable = new NameTable() };
var xmlns = new XmlNamespaceManager(settings.NameTable);
var context = new XmlParserContext(null, xmlns, "", XmlSpace.Default, 
    encoding);
using (var reader = XmlReader.Create(filepath, settings, context))
{
    return XElement.Load(reader);
}

This works, but it seems inefficient to open the file twice. Is there a better way to detect the encoding, so that I can do the following?

 1. Open file
 2. Detect encoding
 3. Read XML into an XElement
 4. Close file
A: 

This seems to have a solution in this question:

http://stackoverflow.com/questions/581318/c-detect-xml-encoding-from-byte-array

Andrew

REA_ANDREW
Thanks. Hm... ok. I was hoping for a cleaner solution.
Peter Lillevold
+1  A: 

OK, I should have thought of this earlier. Both XmlTextReader (which gives us the Encoding) and XmlReader.Create (which allows us to specify the encoding) accept a Stream. So how about opening a FileStream first and then using it with both XmlTextReader and XmlReader.Create, like this:

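// FileAccess.Read avoids the default read/write access, which fails on read-only files
using (var txtreader = new FileStream(filepath, FileMode.Open, FileAccess.Read))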
{
    using (var xmlreader = new XmlTextReader(txtreader))
    {
        // Read in the encoding info
        xmlreader.MoveToContent();
        var encoding = xmlreader.Encoding;

        // Rewind to the beginning
        txtreader.Seek(0, SeekOrigin.Begin);

        var settings = new XmlReaderSettings { NameTable = new NameTable() };
        var xmlns = new XmlNamespaceManager(settings.NameTable);
        var context = new XmlParserContext(null, xmlns, "", XmlSpace.Default,
                 encoding);

        using (var reader = XmlReader.Create(txtreader, settings, context))
        {
            return XElement.Load(reader);
        }
    }
}

This works like a charm. Reading XML files in an encoding-independent way should have been more elegant, but at least I'm getting away with opening the file only once.
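
For reuse, the same steps could be wrapped in a small helper. This is only a sketch of the approach above: the names XmlLoader and LoadWithDetectedEncoding are made up for illustration, and it assumes the file is seekable and small enough to load into an XElement.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper; the class and method names are not from the original post.
public static class XmlLoader
{
    public static XElement LoadWithDetectedEncoding(string filepath)
    {
        // Open the file once, read-only.
        using (var stream = new FileStream(filepath, FileMode.Open, FileAccess.Read))
        // Disposing XmlTextReader closes the stream, so everything that needs
        // the stream has to happen inside this using block.
        using (var probe = new XmlTextReader(stream))
        {
            // Reading up to the first content node makes the reader
            // determine the encoding (BOM or XML declaration).
            probe.MoveToContent();
            Encoding encoding = probe.Encoding;

            // Rewind so the second reader starts at the beginning.
            stream.Seek(0, SeekOrigin.Begin);

            var settings = new XmlReaderSettings { NameTable = new NameTable() };
            var xmlns = new XmlNamespaceManager(settings.NameTable);
            var context = new XmlParserContext(null, xmlns, "", XmlSpace.Default, encoding);

            using (var reader = XmlReader.Create(stream, settings, context))
            {
                return XElement.Load(reader);
            }
        }
    }
}
```

Usage would then be a single call, for example `var root = XmlLoader.LoadWithDetectedEncoding(filepath);`.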

Peter Lillevold