I have inherited a site with a news section that displays a summary of each news article. For whatever reason, the creators decided that displaying the first X characters of the article's HTML would be fine. Of course, this very quickly led to summaries like:
<p>What a mighty fine <a href="blah">da
<p>What a mighty fine and warm <a href="htt
<p>His name was "Emil&qu
This quite obviously screws with the page, especially when the cut lands in the middle of a tag or an entity, so even the opening tag itself is never finished.
What I'm after is a way to close all open tags within the string being taken. I really don't want to use regex to do it. I'm sure there's a nice parser that can do it easily, but I just can't seem to find it right now.
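To make the goal concrete, here is a minimal sketch of the behaviour I'm after, written in Python with the standard library's `html.parser` purely for illustration (the site's actual language may differ). It assumes the full article HTML is still available, so the cut can be made while parsing rather than repaired afterwards. The names `SummaryBuilder` and `summarize` are made up for this example. It counts only visible characters, treats an entity such as `&quot;` as one character so it is never split, and closes whatever tags are still open at the cut.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SummaryBuilder(HTMLParser):
    """Hypothetical helper: copies markup until `limit` visible characters."""

    def __init__(self, limit):
        # convert_charrefs=False keeps entities like &quot; as separate events
        super().__init__(convert_charrefs=False)
        self.limit = limit
        self.count = 0          # visible characters emitted so far
        self.out = []           # pieces of the summary
        self.open_tags = []     # stack of currently open elements
        self.done = limit <= 0

    def _add_text(self, text, length):
        self.out.append(text)
        self.count += length
        if self.count >= self.limit:
            self.done = True

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        self.out.append(self.get_starttag_text())  # original tag text
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        if not self.done:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.done or tag not in self.open_tags:
            return
        # Close any elements left open inside this one, then the tag itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        if not self.done:
            piece = data[:self.limit - self.count]
            self._add_text(piece, len(piece))

    def handle_entityref(self, name):
        if not self.done:
            self._add_text(f"&{name};", 1)  # whole entity counts as one char

    def handle_charref(self, name):
        if not self.done:
            self._add_text(f"&#{name};", 1)


def summarize(html, limit):
    """Return the first `limit` visible characters of `html`, tags balanced."""
    builder = SummaryBuilder(limit)
    builder.feed(html)
    builder.close()
    # Close whatever is still open, innermost element first.
    closing = "".join(f"</{t}>" for t in reversed(builder.open_tags))
    return "".join(builder.out) + closing


print(summarize('<p>What a mighty fine <a href="blah">day</a> it is.</p>', 21))
# <p>What a mighty fine <a href="blah">da</a></p>
print(summarize('<p>His name was &quot;Emil&quot;</p>', 18))
# <p>His name was &quot;Emil</p>
```

Because the limit applies to text rather than markup, a cut can never land inside `<a href="...">` or halfway through `&quot;`, which are exactly the two failure cases shown above. Comments in the source HTML are dropped by this sketch, which seems acceptable for a summary.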