I often have to deal with XML documents that contain namespace-prefixed elements but don't declare the namespace. For example:

<root>
  <a:element/>
</root>

Because the prefix "a" is never bound to a namespace URI, the document is not namespace-well-formed. When I load such a document using the following code:

using (StreamReader reader = new StreamReader(new FileStream(inputFileName,
       FileMode.Open, FileAccess.Read, FileShare.ReadWrite))) {
    doc = XDocument.Load(reader, LoadOptions.PreserveWhitespace);
}

it throws an exception stating (rightly) that the document contains an undeclared namespace and is not well-formed.

So, can I predefine default namespace prefix -> namespace URI pairs for the parser to fall back on? XmlNamespaceManager looks promising, but I don't know how to apply it to this situation (or whether I can).

+2  A: 

You can create an XmlReader with an XmlParserContext that knows about the namespaces; the following works for XmlDocument and XDocument:

using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Minimal name table that stores each distinct string once.
// (The built-in System.Xml.NameTable would also work here.)
class SimpleNameTable : XmlNameTable {
    List<string> cache = new List<string>();
    public override string Add(string key) {
        string found = cache.Find(s => s == key);
        if (found != null) return found;
        cache.Add(key);
        return key;
    }
    public override string Add(char[] array, int offset, int length) {
        return Add(new string(array, offset, length));
    }
    public override string Get(string key) {
        return cache.Find(s => s == key);
    }
    public override string Get(char[] array, int offset, int length) {
        return Get(new string(array, offset, length));
    }
}
class Program {
    static void Main() {
        // Register the fallback prefix -> URI pair the parser should know about.
        XmlNamespaceManager mgr = new XmlNamespaceManager(new SimpleNameTable());
        mgr.AddNamespace("a", "http://foo/bar");
        // The parser context hands those bindings to the reader.
        XmlParserContext ctx = new XmlParserContext(null, mgr, null,
            XmlSpace.Default);
        using (XmlReader reader = XmlReader.Create(
            new StringReader(@"<root><a:element/></root>"), null, ctx)) {

            XDocument doc = XDocument.Load(reader);

            //XmlDocument doc = new XmlDocument();
            //doc.Load(reader);
        }
    }
}
Marc Gravell
Thanks, Marc. Works most of the way. The problem is that if I reserialize the document, I end up with <root><element xmlns="http://foo/bar" /></root>, which, while technically correct, doesn't preserve the namespace prefix. Can I make it preserve the prefix?
James Sulak
hmmm... nothing leaps to mind... maybe add the xmlns declarations during serialization and then remove them manually? Yeuck.
Marc Gravell
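The workaround floated in the comment above (declare the prefix for serialization, then strip the declaration again) could look roughly like the sketch below. The class and method names (`PrefixRoundTrip`, `Reserialize`) are hypothetical, and the final step is a crude string replace that assumes the namespace URI contains no characters that need escaping and that the exact declaration text does not occur anywhere else in the document. Note that the output is deliberately namespace-undeclared again, just like the input.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class PrefixRoundTrip
{
    // Hypothetical helper: loads XML whose prefix is undeclared, then
    // reserializes it with the prefix intact but still undeclared.
    public static string Reserialize(string xml, string prefix, string namespaceUri)
    {
        var mgr = new XmlNamespaceManager(new NameTable());
        mgr.AddNamespace(prefix, namespaceUri);
        var ctx = new XmlParserContext(null, mgr, null, XmlSpace.Default);

        XDocument doc;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), null, ctx))
        {
            doc = XDocument.Load(reader);
        }

        // Declare the prefix on the root so the writer emits prefix:name
        // instead of a default-namespace declaration on each element.
        doc.Root.SetAttributeValue(XNamespace.Xmlns + prefix, namespaceUri);
        string declared = doc.ToString(SaveOptions.DisableFormatting);

        // Remove the declaration by hand (the "yuck" step from the comment).
        string declaration = " xmlns:" + prefix + "=\"" + namespaceUri + "\"";
        return declared.Replace(declaration, "");
    }
}
```

Calling `PrefixRoundTrip.Reserialize("<root><a:element/></root>", "a", "http://foo/bar")` should give back markup that still uses the `a:` prefix and carries no `xmlns` attribute.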
A: 

Building on the previous answer, you can preserve the namespace prefixes by first loading into an XmlDocument and then parsing the OuterXml of the XmlDocument into an XDocument:

XDocument LoadWithPrefix(Stream stream)
{
    // Tell the parser about the prefix the document forgot to declare.
    XmlNamespaceManager mgr = new XmlNamespaceManager(new NameTable());
    mgr.AddNamespace("a", "http://foo/bar");
    XmlParserContext ctx = new XmlParserContext(null, mgr, null, XmlSpace.Default);
    using (XmlReader reader = XmlReader.Create(stream, null, ctx))
    {
        // XmlDocument keeps the prefix on each node, unlike XDocument.
        XmlDocument doc = new XmlDocument();
        doc.Load(reader);
        // Round-trip through the serialized text to get an XDocument.
        return XDocument.Parse(doc.OuterXml);
    }
}
Gideon Engelberth
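Since the question asks about a set of fallback prefix -> URI pairs rather than a single hard-coded one, the same approach could be generalized along these lines. This is only a sketch: `FallbackNamespaceLoader` and its `Load` method are hypothetical names, and it relies on the XmlDocument round-trip from the answer above behaving as described there. Whitespace preservation is switched on to match the `LoadOptions.PreserveWhitespace` used in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class FallbackNamespaceLoader
{
    // Hypothetical helper: loads XML, binding any undeclared prefixes
    // found in fallbackNamespaces (prefix -> namespace URI).
    public static XDocument Load(TextReader input,
                                 IDictionary<string, string> fallbackNamespaces)
    {
        var mgr = new XmlNamespaceManager(new NameTable());
        foreach (KeyValuePair<string, string> pair in fallbackNamespaces)
        {
            mgr.AddNamespace(pair.Key, pair.Value);
        }
        var ctx = new XmlParserContext(null, mgr, null, XmlSpace.Default);

        using (XmlReader reader = XmlReader.Create(input, null, ctx))
        {
            // Go through XmlDocument first, as in the answer above,
            // so the prefixes survive into the XDocument.
            var dom = new XmlDocument();
            dom.PreserveWhitespace = true;
            dom.Load(reader);
            return XDocument.Parse(dom.OuterXml, LoadOptions.PreserveWhitespace);
        }
    }
}
```

For the file from the question, the `StreamReader` over `inputFileName` can be passed as `input`, together with a dictionary such as `{ "a", "http://foo/bar" }`.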