Hello all, first post woo!

I have an HTML page loaded into a PHP variable and am using str_replace to swap certain words for other words. The only problem is that if one of these words appears in an important piece of code, the whole thing falls to bits.

Is there any way to apply the str_replace function only to certain HTML tags? In particular: p, h1, h2, h3, h4, h5

EDIT:

The bit of code that matters:

 $yay = str_ireplace($find, $replace , $html); 

Cheers, and thanks in advance for any answers.

EDIT - FURTHER CLARIFICATION:

$find and $replace are arrays containing the words to be found and their replacements (respectively). $html is the string containing all the HTML code.

A good example of it falling to bits would be finding and replacing a word that occurs in, e.g., the domain name. Say I wanted to replace the word 'hat' with 'cheese'. Any occurrence of an absolute path like

www.worldofhat.com/images/monkey.jpg would be replaced with: www.worldofcheese.com/images/monkey.jpg

So if the replacements could only occur in certain tags, this could be avoided.

+1  A: 

Do not treat the HTML document as a mere string. As you already noticed, tags/elements (and how they are nested) have meaning in an HTML page, so you want a tool that knows what to make of an HTML document. That tool is DOM.

Here is an example. First, some HTML to work with:

$html = <<< HTML
<body>
    <h1>Germany reached the semi finals!!!</h1>
    <h2>Germany reached the semi finals!!!</h2>
    <h3>Germany reached the semi finals!!!</h3>
    <h4>Germany reached the semi finals!!!</h4>
    <h5>Germany reached the semi finals!!!</h5>
    <p>Fans in Germany are totally excited over their team's 4:0 win today</p>
</body>
HTML;

And here is the actual code you would need to make Argentina happy:

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//*[self::h1 or self::h2 or self::p]');
foreach( $nodes as $node ) {
    $node->nodeValue = str_replace('Germany', 'Argentina', $node->nodeValue);
}
echo $dom->saveHTML();

Just add the tags whose content you want to replace to the XPath query. An alternative to using XPath would be to use DOMDocument::getElementsByTagName, which you might know from JavaScript:

 $nodes = $dom->getElementsByTagName('h1');
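Since getElementsByTagName takes a single tag name, covering several tags means one call per name. A minimal sketch of that, assuming a made-up sample string (the markup and the tag list here are only for illustration, not the asker's real page):

```php
<?php
// Hypothetical sample input, not taken from the question.
$html = '<body><h1>Germany wins</h1><p>Fans in Germany cheer</p><div>Germany (untouched)</div></body>';

$dom = new DOMDocument;
$dom->loadHTML($html);

// One getElementsByTagName() call per tag name we care about.
foreach (['p', 'h1', 'h2', 'h3', 'h4', 'h5'] as $tag) {
    foreach ($dom->getElementsByTagName($tag) as $node) {
        $node->nodeValue = str_replace('Germany', 'Argentina', $node->nodeValue);
    }
}

$result = $dom->saveHTML();
echo $result;
```

The div is left alone because it is not in the tag list.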

In fact, if you know it from JavaScript, you might know a lot more of it, because DOM is actually a language-agnostic API defined by the W3C and implemented in many languages. The advantage of XPath over getElementsByTagName is that you can query for several different tags in one go. The drawback is that you have to know XPath :)
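One caveat with the loop above: assigning to an element's nodeValue replaces all of its children with a single text node, so a link or other markup nested inside a p would be flattened to plain text. A possible variation, sketched here under the assumption that $find and $replace are arrays as described in the question (the sample values and markup are invented), is to select the text nodes below the chosen tags and replace only inside those:

```php
<?php
// Hypothetical sample data; $find and $replace mirror the arrays from the question.
$find    = ['hat'];
$replace = ['cheese'];
$html    = '<body><h1>My hat</h1>'
         . '<p>A <a href="http://www.worldofhat.com/">HAT shop</a> for every hat.</p>'
         . '<div>hat outside the chosen tags</div></body>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Text nodes anywhere below the chosen elements; attributes are never selected.
$query = '//*[self::p or self::h1 or self::h2 or self::h3 or self::h4 or self::h5]//text()';

foreach ($xpath->query($query) as $textNode) {
    // Only the character data changes; nested elements keep their tags and attributes.
    $textNode->nodeValue = str_ireplace($find, $replace, $textNode->nodeValue);
}

$result = $dom->saveHTML();
echo $result;
```

Here the href keeps worldofhat.com, the link text and the paragraph text get the replacement, and the div is skipped because it is not in the query.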

Gordon
Epic answer, thanks for your help. If you're ever in London I'll buy you a Guinness!
DrShamoon
@DrShamoon You're welcome. I'll buy you the second then for compensating Germany having thrown out the English team as well ;)
Gordon
We don't like to talk about that :@( haha thanks again mate.
DrShamoon