So, I have some data in the form of:

&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;

What .NET classes/functions would I want to use to convert this to something pretty and write it out to a file looking something like this:

<foo>
   <bar>
       test
   </bar>
</foo>

Please be specific about the functions and classes, not just "use System.Xml". There seem to be a lot of different ways to do things with XML in .NET :(

Thanks

+2  A: 

Using the System.Xml.XmlDocument class...

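Imports System.IO
Imports System.Web
Imports System.Xml

' The input arrives with its angle brackets escaped as HTML/XML entities.
Dim Val As String = "&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;"
Dim Xml As String = HttpUtility.HtmlDecode(Val)

Dim Doc As New XmlDocument()
Doc.LoadXml(Xml)

' Save to a TextWriter writes the document back out with indentation.
Dim Writer As New StringWriter()
Doc.Save(Writer)

Console.Write(Writer.ToString())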
Josh Stodola
-1 for VB, +2 for being exactly what I wanted. Thanks :)
Polaris878
Also, is there an alternative to the call to HttpUtility.HtmlDecode(str)?? I don't like having to pull in System.Web just for that function...
Polaris878
XmlDocument isn't actually doing anything at all here, as written. HtmlDecode is doing all of the work. If you skip the HtmlDecode call and use XmlDocument to pull out XmlElement/XmlAttribute values (via .ChildNodes, .SelectSingleNode, .SelectNodes, etc.), the values of those objects will be correctly unescaped.
technophile
@technophile... So I'm guessing XmlDocument will do that anyways?
Polaris878
@Polaris Yes, although if you just dump the XmlDocument to a string like he's doing here, it will re-escape them (because it's XML encoding the values). You need to use the XML APIs to pull the values out correctly.
technophile
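To illustrate the point about values being unescaped when read through the XML API and re-escaped when the document is dumped as a string, here is a minimal C# sketch. It assumes the escaped data sits inside a wrapper element; the `<data>` element and the `EscapeDemo` class are hypothetical names used only for this example.

```csharp
using System;
using System.Xml;

public static class EscapeDemo
{
    // Reads the root element's text through the DOM; entities come back decoded.
    public static string ReadValue(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    // Serializes the root element back to markup; the text is escaped again.
    public static string ReadMarkup(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.OuterXml;
    }

    public static void Main()
    {
        // Hypothetical wrapper element holding the escaped payload as text.
        string xml = "<data>&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;</data>";

        Console.WriteLine(ReadValue(xml));   // the decoded text, with real angle brackets
        Console.WriteLine(ReadMarkup(xml));  // the markup, with the text still escaped
    }
}
```

So whether HtmlDecode is needed at all depends on where the escaped string comes from: if it is the content of an element in a larger XML document, reading it through the DOM already gives the decoded form.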
Ah, wait, I see. You want to unescape and then pretty print the results. Yes, in that case using HtmlDecode to turn the entities back into angle brackets etc. and using XmlDocument to insert whitespace is probably the best you'll get.
technophile
@Polaris Unfortunately, you'll have to have a reference to System.Web in order to use `HttpUtility`. You could roll your own decoding function, but it's a heck of a lot harder to decode HTML than encode, in my opinion. Perhaps you can look in Reflector and get what you need.
Josh Stodola
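For anyone on a newer framework, here is a C# sketch of the same decode-then-indent approach without the System.Web reference. It assumes .NET Framework 4 or later, where `System.Net.WebUtility.HtmlDecode` is available. The `PrettyXml` class, the three-space indent, and the `pretty.xml` file name are illustrative choices, not anything from the answer above.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Xml;

public static class PrettyXml
{
    // Decodes the entity-escaped input and returns it as indented XML.
    public static string DecodeAndIndent(string escaped)
    {
        // WebUtility lives in System.Net, so no System.Web reference is needed.
        string xml = WebUtility.HtmlDecode(escaped);

        var doc = new XmlDocument();
        doc.LoadXml(xml);

        var settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "   ",        // three spaces, as in the question's sample
            OmitXmlDeclaration = true   // keep the output to just the elements
        };

        var sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            doc.Save(writer);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string pretty = DecodeAndIndent("&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;");

        // Hypothetical output path; the question asks for the result in a file.
        File.WriteAllText("pretty.xml", pretty);
        Console.WriteLine(pretty);
    }
}
```

Note that the text inside `<bar>` stays on the same line as its tags. Moving it onto its own line would add whitespace to the element's value, so the writer leaves text-only elements alone.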
+2  A: 

You can use this code.

string p = "&lt;foo&gt;&lt;bar&gt;test&lt;/bar&gt;&lt;/foo&gt;";
Console.WriteLine(System.Web.HttpUtility.HtmlDecode(p));
Adeel
A: 

Here's one that I use. Pass in an XML string and set ToXml to true to convert a string containing "<foo/><bar/>" to its escaped equivalent, "&lt;foo/&gt;&lt;bar/&gt;". If ToXml is false, it converts the other way, from "&lt;foo/&gt;&lt;bar/&gt;" back to "<foo/><bar/>".

string XmlConvert(string sXml, bool ToXml){
    string sConverted = string.Empty;
    if (ToXml){
       // Escape '&' first so the ampersands introduced by &lt; and &gt; are not escaped again.
       sConverted = sXml.Replace("&", "&amp;").Replace("<", "&lt;").Replace(">", "&gt;");
    }else{
       // Unescape &amp; last so that "&amp;lt;" decodes to "&lt;" rather than "<".
       sConverted = sXml.Replace("&lt;", "<").Replace("&gt;", ">").Replace("&amp;", "&");
    }
    return sConverted;
}

Edit: Thanks to technophile for pointing out the obvious, but this is designed to cover only the characters that make up XML tags. That's the gist of the function, which can easily be extended to cover other XML entities, so feel free to add any that I may have missed! Cheers! :)

Hope this helps, Best regards, Tom.

tommieb75
-1: Doesn't correctly handle all escaped values (Unicode values, other XML entity values, etc).
technophile
It won't handle quotes either, which are pretty important when handling attributes. Using a fixed list of Replace calls is inherently worse than using an API that conforms to the XML specification and handles everything correctly, without needing band-aids the next time you want to handle &quot; or whatever.
technophile
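As a rough illustration of the "use the API" argument, the C# sketch below lets XmlDocument do the escaping and unescaping of an attribute value, including quotes, apostrophes, and a numeric character reference. The `EntityDemo` class and the `item`/`title` names are hypothetical and exist only for this example.

```csharp
using System;
using System.Xml;

public static class EntityDemo
{
    // Builds an element with one attribute and returns its markup;
    // the serializer escapes quotes, ampersands and angle brackets.
    public static string WriteAttribute(string value)
    {
        var doc = new XmlDocument();
        XmlElement el = doc.CreateElement("item");
        el.SetAttribute("title", value);
        return el.OuterXml;
    }

    // Parses the markup and returns the attribute value with all
    // entities and character references decoded by the parser.
    public static string ReadAttribute(string markup)
    {
        var doc = new XmlDocument();
        doc.LoadXml(markup);
        return doc.DocumentElement.GetAttribute("title");
    }

    public static void Main()
    {
        string original = "say \"hi\" & <bye>";
        string markup = WriteAttribute(original);

        Console.WriteLine(markup);
        Console.WriteLine(ReadAttribute(markup) == original);  // True: the value round-trips

        // A numeric character reference and &apos; are decoded without any Replace list.
        Console.WriteLine(ReadAttribute("<item title=\"caf&#233; &apos;x&apos;\" />"));
    }
}
```

Nothing here needs to know which entities exist. The parser and serializer follow the XML rules, which is the advantage over a hand-maintained list of Replace calls.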