ansaurus

Question

How do I un-escape XML entities easily in .NET

Answer 1

+2 A:

Why not insert them as < and >? That way you avoid mixing XML with custom markup.

Joachim Kerschbaumer 2008-10-14 15:33:30

This is a valid response; the example provided in the question is NOT valid XML.

Mitchel Sellers 2008-10-14 15:43:17

I have updated the example to fix the incorrect syntax. This answer is not actually a relevant answer to the question, but I accept my example was bad.

Tim Saunders 2008-10-14 16:41:00

Answer 2

+2 A:

Your question is a bit hard to follow. Here are the things that I did not fully understand:

If you are using XmlNode/XmlElement objects, you are working with XML, not HTML. So all you can have are XML elements. These may have HTML element names, but they are XML.
InnerXml returns a string, at least for the XmlElement object. What are you working with?
What data are you expecting to get out of the operation? Can you give an example of what you need?
What exactly are you intending to do with the data when you have it? Maybe there is a better way to reach your goal than what you have in mind?

EDIT

I think I get the picture, but correct me if I'm still wrong. You want to pluck "<p>A Test</p>" out of xn1, but "A test" out of xn2.

So InnerXml is the way to go for xn1, and InnerText would be right for xn2.

Well, do it that way then: test for the existence of dataitem and decide what to do once you know.

XmlNode xn = document.SelectSingleNode("/content[@id=1]/data");

// No dataitem child: print the raw markup; otherwise print only the text.
if (xn.SelectSingleNode("dataitem") == null)
  Console.WriteLine(xn.InnerXml);
else
  Console.WriteLine(xn.InnerText);
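
For illustration, the same check can be wrapped in a small helper and run against some sample data. This is only a sketch: the sample XML, the root wrapper element, and the names InnerDemo, Sample and Extract are made up here, since the question's actual document isn't shown.

```csharp
using System;
using System.Xml;

public static class InnerDemo
{
    // Hypothetical sample data; the real document from the question may differ.
    public const string Sample =
        "<root>" +
        "<content id=\"1\"><data><p>A Test</p></data></content>" +
        "<content id=\"2\"><data><dataitem>A test</dataitem></data></content>" +
        "</root>";

    // Returns the markup inside <data> when there is no <dataitem> child,
    // and the plain text otherwise.
    public static string Extract(XmlDocument document, int id)
    {
        XmlNode xn = document.SelectSingleNode("/root/content[@id=" + id + "]/data");
        if (xn == null)
            throw new ArgumentException("No data element for content id " + id);

        return xn.SelectSingleNode("dataitem") == null
            ? xn.InnerXml    // keeps child markup as-is
            : xn.InnerText;  // concatenated text of all descendants, tags dropped
    }
}
```

With that sample, Extract(doc, 1) gives back the p element as markup and Extract(doc, 2) gives back just the text.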

To answer your question regarding HttpUtility.HtmlDecode, I just looked at the implementation and it looks like it would "work for everything", but it seems superfluous to me if the string you are looking for is coming out of InnerXml.
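
A rough sketch comparing the two routes, assuming the escaped string looks something like &lt;p&gt;A Test&lt;/p&gt;. UnescapeSketch, ViaParser and ViaHtmlDecode are hypothetical names, and System.Net.WebUtility.HtmlDecode stands in for HttpUtility.HtmlDecode here so the snippet doesn't need a System.Web reference.

```csharp
using System;
using System.Net;
using System.Xml;

public static class UnescapeSketch
{
    // Let the XML parser do the un-escaping: wrap the escaped text in a
    // throwaway element, load it, and read back InnerText.
    public static string ViaParser(string escapedText)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<wrapper>" + escapedText + "</wrapper>");
        return doc.DocumentElement.InnerText;
    }

    // Decode the entities directly, without going through an XML parser.
    public static string ViaHtmlDecode(string escapedText)
    {
        return WebUtility.HtmlDecode(escapedText);
    }
}
```

Both calls return the un-escaped p element for that input. The parser route throws an XmlException on HTML-only entities such as &nbsp;, which HtmlDecode handles, so which one fits depends on where the string came from.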

Tomalak 2008-10-14 16:02:57

Answer 3

A:

I think Tomalak is on the right track, but I'd write the code a little differently:

        XmlNode xn = document.SelectSingleNode("/content[@id=1]/data");
        if (xn.ChildNodes.Count != 1)
        {
            throw new InvalidOperationException("I don't know what to do if there's not exactly one child node.");
        }
        XmlNode child = xn.ChildNodes[0];
        switch (child.NodeType)
        {
            case XmlNodeType.Element:
                Console.WriteLine(xn.InnerXml);
                break;
            case XmlNodeType.Text:
                Console.WriteLine(child.Value);  // xn.Value is null for an element; the text is on the child
                break;
            default:
                throw new InvalidOperationException("I can only handle elements and text nodes.");
        }

This code makes a lot of your implicit assumptions explicit, and when you encounter data that's not in the form you expect, it will tell you why it failed.

Robert Rossney 2008-10-15 18:56:14
