Is there a simple way to fix elements in an HTML document that are missing the closing tag or the /> ending? I'm using ASP.NET with C# (the HTML is loaded with Html Agility Pack).

An example:

<img src="www.example.com/image.jpg"> 

should transform into

<img src="www.example.com/image.jpg" /> 

or

<img src="www.example.com/image.jpg"></img>
+1  A: 

You can use the Save() method with an XmlWriter to write the HTML document out as XML. When you do this, Html Agility Pack will try to close all the open tags.

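// Requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html); // html holds the markup to repair
System.IO.StringWriter sw = new System.IO.StringWriter();
System.Xml.XmlTextWriter xw = new System.Xml.XmlTextWriter(sw);
doc.Save(xw); // writing through an XmlWriter emits well-formed XML
string result = sw.ToString();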
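
If XML output is more than is needed and only the empty elements have to be self-closed, a possible alternative is the OptionWriteEmptyNodes setting on HtmlDocument. The following is a minimal sketch under that assumption; the class name TagCloser and the method CloseEmptyTags are hypothetical names chosen for illustration, and the exact output may vary between Html Agility Pack versions.

```csharp
using HtmlAgilityPack;

public static class TagCloser
{
    // Hypothetical helper: re-serializes the HTML so that empty
    // elements such as <img> or <br> are written in self-closed form.
    public static string CloseEmptyTags(string html)
    {
        var doc = new HtmlDocument();

        // Ask the serializer to write empty elements with a " />" ending.
        doc.OptionWriteEmptyNodes = true;

        doc.LoadHtml(html);

        // OuterHtml serializes the parsed tree using the document's options.
        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling TagCloser.CloseEmptyTags on the img example from the question should return the tag with a /> ending, while the rest of the markup stays HTML instead of being converted to XML.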
Alejandro Martin
Worked great! Thanks.
Andreas