Use a DOM parser to find all text nodes that contain the needle and do not have a parent element named "a":
$html = <<< HTML
<p>
. text2 text2 word text2...
<a href="something">text text word <span> word </span> text</a>
. text2 text2 word text2...
<p>
HTML;
Code:
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
$nodes = $xp->query('//*[name() != "a"]/text()[contains(.,"word")]');
foreach ($nodes as $node) {
    // can use a regex in here too if you are after word boundaries
    $node->nodeValue = str_replace('word', 'something', $node->nodeValue);
}
echo $dom->saveXML($dom->documentElement);
Outputs:
<html><body><p>
. text2 text2 something text2...
<a href="something">text text word <span> something </span> text</a>
. text2 text2 something text2...
</p><p/></body></html>
Note how this will also replace word inside the span inside the a, because the XPath only checks the direct parent of each text node. If you want to exclude those too, you have to adjust the XPath to:
'//text()[not(ancestor::a) and contains(., "word")]'
to find all text nodes containing the needle that are not nested anywhere inside an a element.
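As a rough sketch of how the adjusted XPath might be used end to end: the function name replaceOutsideLinks and its parameters are made up for illustration, and it assumes the needle contains no double quotes, since the needle is interpolated into the XPath string literal as-is.

```php
<?php
// Hypothetical helper, not part of the DOM extension.
// Assumption: $needle contains no double quotes (it is pasted into an XPath string literal).
function replaceOutsideLinks(string $html, string $needle, string $replacement): string
{
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    $xp = new DOMXPath($dom);

    // Text nodes that contain the needle and have no <a> anywhere among their ancestors
    $query = sprintf('//text()[not(ancestor::a) and contains(., "%s")]', $needle);

    foreach ($xp->query($query) as $node) {
        $node->nodeValue = str_replace($needle, $replacement, $node->nodeValue);
    }

    return $dom->saveXML($dom->documentElement);
}

$html = '<p>word <a href="something">text word <span> word </span></a> word</p>';
echo replaceOutsideLinks($html, 'word', 'something');
```

With this input, only the two occurrences directly inside the p should change; the text inside the a and inside its nested span is left alone.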
There are a number of third-party parsers worth mentioning that aim to enhance DOM: phpQuery, Zend_Dom, QueryPath and FluentDOM.
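Coming back to the comment in the loop about word boundaries, here is a minimal sketch of what the regex variant could look like. The sample markup and the variable names $needle, $pattern and $result are assumptions chosen for illustration. preg_quote() escapes any regex metacharacters in the needle, and \b restricts matches to whole words.

```php
<?php
$html = '<p>word words sword <a href="something">word</a></p>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Build a pattern that matches the needle only as a whole word
$needle = 'word';
$pattern = '/\b' . preg_quote($needle, '/') . '\b/';

// contains() is a cheap pre-filter; the regex then decides what actually gets replaced
$nodes = $xp->query('//text()[not(ancestor::a) and contains(., "word")]');
foreach ($nodes as $node) {
    $node->nodeValue = preg_replace($pattern, 'something', $node->nodeValue);
}

$result = $dom->saveXML($dom->documentElement);
echo $result;
```

Here "words" and "sword" should stay as they are, because the needle is not a whole word in either of them, while the standalone word outside the link is replaced.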