So I have some XML in the following format:

<somenode>
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
            <title/>
        </head>
        <body>
            <p>P one</p>
            <p>Another p</p>
        </body>
    </html>
</somenode>

Nestled in there is some HTML, which I didn't think would be an issue, since it should just be treated as XML.

I'm trying to select the contents (InnerXml) of the <body> tag. However, using

xmlDoc.SelectSingleNode("somenode/html/body")

returns null, and using

xmlDoc.GetElementsByTagName("body")[0].InnerXml

gives the InnerXml, but each <p> has xmlns="http://www.w3.org/1999/xhtml" added to it, so the result looks like:

<p xmlns="http://www.w3.org/1999/xhtml">P one</p><p xmlns="http://www.w3.org/1999/xhtml">Another p</p>

Can anyone shed some light on this? It seems like really weird behavior, and any help would be appreciated. I'm only using ASP.NET 2.0, so unfortunately LINQ isn't an option.

A: 

The <html> element declares http://www.w3.org/1999/xhtml as the default namespace, so every element inside it that has no namespace prefix belongs to that namespace by default.

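Since the content of the <body> tag is two separate <p> elements, each one is written out as its own top-level fragment and gets the declaration. If you had other elements inside your <p> elements, they would not have the declaration on them, because they inherit it from the enclosing <p>.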
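
A minimal sketch of that behavior, using a trimmed copy of the XML from the question. The <b> element inside the first <p> is an addition for illustration (it is not in the original XML), and the class and method names are hypothetical.

```csharp
using System;
using System.Xml;

static class InnerXmlDemo
{
    const string XhtmlNs = "http://www.w3.org/1999/xhtml";

    // Hypothetical helper: loads a trimmed version of the question's XML
    // and returns the <body> element. The nested <b> is an assumption
    // added only to show what happens to elements inside a <p>.
    public static XmlNode LoadBody()
    {
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(
            "<somenode><html xmlns=\"" + XhtmlNs + "\">" +
            "<body><p>P <b>one</b></p><p>Another p</p></body>" +
            "</html></somenode>");
        return xmlDoc.GetElementsByTagName("body")[0];
    }

    static void Main()
    {
        XmlNode body = LoadBody();

        // <body> and the nested <b> both inherit the default namespace
        // declared on <html>, even though neither declares it itself.
        Console.WriteLine(body.NamespaceURI);
        Console.WriteLine(body.FirstChild.LastChild.NamespaceURI);

        // InnerXml serializes the children of <body> without their parent,
        // so each top-level <p> has to restate the namespace declaration.
        Console.WriteLine(body.InnerXml);
    }
}
```

The declarations in the InnerXml output are not extra data in the document. They only make each serialized fragment carry the namespace it already had.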

Strelok
A: 

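Your XPath expression isn't specifying the default namespace. In XPath, an unprefixed name such as html only matches elements that are in no namespace, so you need to bind a prefix to the XHTML namespace and use it in the path. How about: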

XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);
nsMgr.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");

XmlNode node = xmlDoc.SelectSingleNode("somenode/xhtml:html/xhtml:body", nsMgr);
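
For comparison, here is a self-contained sketch that runs both the unprefixed path from the question and the prefixed one against the same document. The class name, the helper methods, and the trimmed XML string are assumptions for illustration. The "xhtml" prefix is arbitrary: only the namespace URI it is bound to has to match.

```csharp
using System;
using System.Xml;

static class BodySelector
{
    // Hypothetical helper: builds a trimmed copy of the question's XML.
    public static XmlDocument LoadSample()
    {
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(
            "<somenode><html xmlns=\"http://www.w3.org/1999/xhtml\">" +
            "<head><title/></head>" +
            "<body><p>P one</p><p>Another p</p></body>" +
            "</html></somenode>");
        return xmlDoc;
    }

    // Selects <body> by binding a prefix to the XHTML namespace.
    public static XmlNode SelectBody(XmlDocument xmlDoc)
    {
        XmlNamespaceManager nsMgr = new XmlNamespaceManager(xmlDoc.NameTable);
        nsMgr.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml");
        return xmlDoc.SelectSingleNode("somenode/xhtml:html/xhtml:body", nsMgr);
    }

    static void Main()
    {
        XmlDocument xmlDoc = LoadSample();

        // Unprefixed names look for elements in no namespace, so this is null.
        XmlNode unprefixed = xmlDoc.SelectSingleNode("somenode/html/body");
        Console.WriteLine(unprefixed == null);

        // The prefixed path finds <body>; list the text of its children.
        XmlNode body = SelectBody(xmlDoc);
        foreach (XmlNode child in body.ChildNodes)
        {
            Console.WriteLine(child.InnerText);
        }
    }
}
```

Note that <somenode> stays unprefixed in the path because it sits outside the <html> element and is in no namespace.
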
David Norman