I have a div I'd like to remove from some output, which looks like this:

<div id="ithis" class="cthis">Content here which includes other elements etc..) </div>

How can I remove this div and everything within it using PHP and regex?

Thank you.

+6  A: 

The simple answer is that you don't. You use one of PHP's many HTML parsers instead. Regexes are a flaky and error-prone way of manipulating HTML.

That being said you can do this:

$html = preg_replace('!<div\s+id="ithis"\s+class="cthis">.*?</div>!is', '', $html);

But many things can go wrong with this. For example, if the element itself contains a div:

<div id="ithis" class="cthis">Content here which <div>includes</div> other elements etc..) </div>

you'll end up with:

 other elements etc..) </div>

as the regex will stop at the first </div>. And no, there's nothing you can really do to solve this problem consistently with regular expressions.
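To see the failure for yourself, here is a minimal sketch that runs the regex from above against the nested example. The variable names are only for illustration.

```php
<?php
// Nested input taken from the example above.
$html = '<div id="ithis" class="cthis">Content here which <div>includes</div> other elements etc..) </div>';

// The lazy .*? stops at the first </div>, which closes the inner div,
// so the tail of the outer div is left behind.
$result = preg_replace('!<div\s+id="ithis"\s+class="cthis">.*?</div>!is', '', $html);

echo $result, PHP_EOL;
// prints: " other elements etc..) </div>" (without the quotes)
```

Switching to a greedy .* does not help either: it would then run to the last </div> in the whole document and remove unrelated markup.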

Done with a parser it looks more like this:

$doc = new DOMDocument();
$doc->loadHTML($html);
$element = $doc->getElementById('ithis');
// getElementById() returns null when no element has that id.
if ($element !== null) {
    $element->parentNode->removeChild($element);
}
$html = $doc->saveHTML();
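
One thing to be aware of: loadHTML() wraps a fragment in a full document (doctype, html and body), and saveHTML() with no argument returns all of that. If you only want the fragment back, something like the following might work. It is a rough sketch, not a definitive implementation: removeElementById is a made-up helper name, and it assumes $html is a body fragment rather than a complete page.

```php
<?php
/**
 * Hypothetical helper: removes the element with the given id from an
 * HTML body fragment and returns the remaining markup.
 */
function removeElementById(string $html, string $id): string
{
    if ($html === '') {
        return ''; // loadHTML() rejects an empty string on some PHP versions
    }

    $doc = new DOMDocument();

    // Real-world markup is rarely valid; collect parser warnings
    // instead of emitting them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $element = $doc->getElementById($id);
    if ($element !== null) {
        $element->parentNode->removeChild($element);
    }

    // Serialize only the children of <body>, skipping the wrapper
    // that loadHTML() added around the fragment.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: the nested div no longer matters, the whole element goes.
$clean = removeElementById(
    '<p>Keep me</p><div id="ithis" class="cthis">Content <div>nested</div> more</div>',
    'ithis'
);
echo $clean, PHP_EOL;
```

The parser may normalize the markup it keeps (quoting, missing closing tags), so don't expect the output to be byte-for-byte identical to the input outside the removed element.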
cletus
A: 

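I don't know about PHP, but you can replace /<div[^>]*id="ithis"[^>]*>.*?<\/div[^>]*>/s with nothing. Note that this still stops at the first closing div, so it only works when the element contains no nested divs.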

Daniel
A: 

PHP is server side, and the output is coming from the server. Can't you just not output it? Or are you trying to hide it? If so, in a stylesheet, just say #ithis {display:none}.

If the string is returned by a PHP function that you didn't write and you don't want to touch that code, you would have to write a very difficult regex to account for nested divs, varying syntax in the output, etc. I'd recommend using a parser (perhaps this Zend Framework component) to help you out. I've used it a few times for something similar, although if you're not familiar with ZF at all, you may want to try something else.
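
To illustrate the "varying syntax" point without pulling in a framework, here is a small sketch using PHP's bundled DOMXPath instead of the Zend component. The three input strings are invented for the example; they describe the same element with different quoting, attribute order and letter case, which a single regex would struggle to cover.

```php
<?php
// Hypothetical inputs: the same element written three different ways.
$variants = [
    '<div id="ithis" class="cthis">a <div>nested</div> b</div><p>kept</p>',
    "<div class='cthis' id='ithis'>a</div><p>kept</p>",
    '<DIV ID="ithis">a</DIV><p>kept</p>',
];

libxml_use_internal_errors(true); // keep parser warnings quiet

$results = [];
foreach ($variants as $html) {
    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // The HTML parser lowercases tag and attribute names, so a single
    // XPath expression covers every variant above.
    $xpath = new DOMXPath($doc);
    $matches = iterator_to_array($xpath->query('//div[@id="ithis"]'));

    // Work on an array copy so removal cannot disturb the iteration.
    foreach ($matches as $node) {
        $node->parentNode->removeChild($node);
    }

    $results[] = $doc->saveHTML();
}
libxml_clear_errors();

foreach ($results as $result) {
    echo $result, PHP_EOL;
}
```

Each result should still contain the "kept" paragraph and no trace of the ithis div, regardless of how the original tag was written.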

Keith Bentrup