Hi there!

I've got a variable that contains arbitrary HTML. I output it as {$text} and truncate it.

The value is for example:

<div>Lorem <i>ipsum <b>dolor <span>sit </span>amet</b>, con</i> elit.</div>

If I truncate the text to roughly its first 30 characters, I get this:

<div>Lorem <i>ipsum <b>dolor <span>sit 

The problem is that the open elements are never closed. So I need a script that checks the <*> elements in the code (where * could be anything) and, if one doesn't have a closing tag, closes it.

Please help me with this. Thanks.

Solution, after some hours and 4 upvotes on Stack Overflow:

PHP:

...
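function closetags($content) {
    // Collect the names of all opening tags, skipping the listed void elements
    // and self-closed tags. [a-z0-9]* lets heading tags such as h1 match too.
    preg_match_all('#<(?!meta|img|br|hr|input\b)\b([a-z][a-z0-9]*)(?: .*)?(?<![/|/ ])>#iU', $content, $result);
    $openedtags = $result[1];
    // Collect the names of all closing tags.
    preg_match_all('#</([a-z][a-z0-9]*)>#iU', $content, $result);
    $closedtags = $result[1];
    $len_opened = count($openedtags);
    // Nothing to do when every opening tag already has a closing tag.
    if (count($closedtags) == $len_opened) {
        return $content;
    }
    // Walk the opening tags innermost-first so the appended closing tags nest correctly.
    $openedtags = array_reverse($openedtags);
    for ($i = 0; $i < $len_opened; $i++) {
        if (!in_array($openedtags[$i], $closedtags)) {
            $content .= '</'.$openedtags[$i].'>';
        } else {
            // Consume one matching closing tag so it is not counted twice.
            unset($closedtags[array_search($openedtags[$i], $closedtags)]);
        }
    }
    return $content;
}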
...

the TPL:

{$pages[j].text|truncate:300|@closetags}
A: 

Pull out all the opening tags and push them into an array (array_1) one by one.

Pull out all of the closing tags and push them into an array (array_2) one by one (this includes self-closing tags).

For each tag in the first array (array_1) that is not found in the second array (array_2), append its closing tag to the HTML.

[edit]

Of course, this method fails miserably if you do not write proper html... but whatchagonnado?

Another way would be to look ahead in the string to see which tags are closed and close them as needed.

SeanJA
I'm testing it now. :)
neduddki
A: 

To simplify, if the code is valid XML before truncating and you don't cut any tag in half, the algorithm would be something like this:

  • Push opening tags onto a stack
  • Pop them off when you find the closing tag (which will match if the code is valid)
  • When you get to the end, pop the remaining tags one by one and append a closing tag for each to the original (truncated) text.

Example:

<div>Lorem <i>ipsum <b>dolor <span>sit </span>amet</b><div>
  • Push "div","i","b","span"
  • Found closing tag "span"
  • Pop "span"
  • Found closing tag "b"
  • Pop "b"
  • Push "div"
  • End of truncated text
  • Pop "div" --> add </div> to text
  • Pop "i" --> add </i> to text
  • Pop "div" --> add </div> to text
  • End
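
The steps above could be sketched in PHP roughly as follows. This is only an illustration under the same assumptions as the answer (markup that is well-formed up to the cut, and no tag cut in half); the function name close_tags_with_stack and the list of void elements are made up for this example and are not from the thread. It also assumes attribute values never contain a ">" character.

```php
<?php
// Hypothetical helper (not from the thread): close unclosed tags using a stack.
function close_tags_with_stack(string $html): string
{
    // Void elements never take a closing tag, so they are never pushed.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'];
    $stack = [];

    // Find every opening or closing tag, in document order.
    preg_match_all('#<(/?)([a-z][a-z0-9]*)\b[^>]*>#i', $html, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $isClosing   = ($m[1] === '/');
        $name        = strtolower($m[2]);
        $selfClosing = (substr($m[0], -2) === '/>');

        if ($isClosing) {
            // Pop only when the closing tag matches the top of the stack.
            if ($stack && end($stack) === $name) {
                array_pop($stack);
            }
        } elseif (!$selfClosing && !in_array($name, $void, true)) {
            $stack[] = $name;
        }
    }

    // Whatever is still on the stack is unclosed: close it, innermost first.
    while ($stack) {
        $html .= '</' . array_pop($stack) . '>';
    }
    return $html;
}

echo close_tags_with_stack('<div>Lorem <i>ipsum <b>dolor <span>sit </span>amet</b><div>');
// prints: <div>Lorem <i>ipsum <b>dolor <span>sit </span>amet</b><div></div></i></div>
```

Unlike comparing two flat arrays of tag names, the stack keeps track of nesting order, so the appended closing tags always come out innermost first.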
robertos
I think you mean: **xhtml**
SeanJA
@SeanJA, valid *xhtml* is also valid xml.
tloflin
@SeanJA @tloflin is correct - this method works for any XML, including XHTML. HTML doesn't force you to close tags in many cases (e.g. <img>, <li>), so an algorithm wouldn't have to close those.
robertos
@robertos Fair enough, I personally don't consider unclosed html to be proper though.
SeanJA