but sometimes I have the problem that it is replaced twice. So I want to avoid this problem.
Specify 1 as the fourth argument (the limit) to your preg_replace call and it will perform at most one replacement.
For example...
preg_replace
( '/test/i'
, '<a class="dfn" href="glossary.php?term=\0">\0</a>'
, $input
, 1
);
More details on the preg_replace man page.
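To make the effect of the limit visible, here is a small sketch comparing the call with and without it. The sample string and the square-bracket replacement are made up for illustration; the optional fifth argument receives the number of replacements performed.

```php
<?php
// Hypothetical sample input, just for demonstration.
$input = 'A test, another test, and a final Test.';

// No limit: every match is replaced.
$all = preg_replace('/test/i', '[$0]', $input);

// Limit of 1: only the first match is replaced.
// $count is filled in with the number of replacements made.
$first = preg_replace('/test/i', '[$0]', $input, 1, $count);

echo $all, PHP_EOL;   // A [test], another [test], and a final [Test].
echo $first, PHP_EOL; // A [test], another test, and a final Test.
echo $count, PHP_EOL; // 1
```

Note that the limit applies per pattern and per subject string, so if you pass an array of patterns each one gets its own single replacement.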
In response to:
Consider this one:
This is my text. In this part also the text should be replaced. However in this part this <a href="#text">text</a> should not be replaced.
In this text, the first two matches of "text" should be replaced; the third should not, because it is inside an anchor.
This is not a problem for regex, but one that requires parsing the HTML and traversing the DOM in order to run your replace on text nodes not inside a link.
For example, here is some pseudo-code:
MyHtml = parseHtmlDom( Input ); // turn HTML text into a HTML DOM.
SpecialWords = 'comma,delimited,text';
linkify( MyHtml.ChildNodes , SpecialWords ); // kick off the recursive function.

function linkify(HtmlDomNodes,SpecialWords)
{
    for (CurNode in HtmlDomNodes) // Loop through each node.
    {
        if (CurNode.is('a')) continue; // if hyperlink, skip to next node in loop.
        if (CurNode.isTextNode()) // only text nodes hold text we can safely replace.
            CurNode.Text = addLinks( CurNode.Text , SpecialWords ); // perform the regex replacements to add the hyperlinks.
        else
            linkify( CurNode.ChildNodes , SpecialWords ); // recurse into child nodes until all text is replaced.
    }
}

function addLinks(Text,SpecialWords)
{
    for (CurWord in SpecialWords.split(','))
    {
        Text = preg_replace
            ( '/'&CurWord&'/i'
            , '<a class="dfn" href="glossary.php?term=\0">\0</a>'
            , Text
            );
    }
    return Text;
}
It's not PHP because I can't be bothered reading the documentation for a PHP DOM parser to work out the correct commands - you can do that bit. :)
As for which parser to use, PHP Simple HTML DOM Parser is one option, but there are others if you don't get on with that.
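If you want a starting point in actual PHP, here is a rough sketch using the DOM extension bundled with PHP (DOMDocument and DOMXPath) instead of Simple HTML DOM Parser. It differs from the pseudo-code in a few ways, all of which are my own choices rather than anything required: it selects the text nodes with one XPath query instead of recursing, it builds real `<a>` nodes rather than pasting HTML strings into text, and it adds `\b` word boundaries so that "text" does not match inside "context". The function name `linkifyHtml` and the wrapper id `linkify-root` are made up for this example, and the input is assumed to be a UTF-8 HTML fragment.

```php
<?php
// Sketch only: linkifyHtml() is a hypothetical helper, not a library function.
function linkifyHtml(string $html, array $words): string
{
    // One case-insensitive pattern for all words. The \b boundaries are an
    // addition of this sketch; the single capture group is needed for preg_split.
    $quoted  = array_map(fn($w) => preg_quote($w, '/'), $words);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy real-world markup
    // The meta tag tells the parser the input is UTF-8; the wrapper div gives
    // the fragment a single known root we can find again afterwards.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="linkify-root">' . $html . '</div>'
    );
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $root  = $xpath->query('//div[@id="linkify-root"]')->item(0);

    // All text nodes under the wrapper that are not inside an <a> element.
    $textNodes = iterator_to_array($xpath->query('.//text()[not(ancestor::a)]', $root));

    foreach ($textNodes as $node) {
        // Result alternates: plain text, match, plain text, match, ...
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no glossary word in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) { // odd indexes hold the captured matches
                $a = $doc->createElement('a');
                $a->setAttribute('class', 'dfn');
                $a->setAttribute('href', 'glossary.php?term=' . rawurlencode($part));
                $a->appendChild($doc->createTextNode($part));
                $fragment->appendChild($a);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialise only the children of the wrapper div.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: the first "text" is linked, the one already inside an anchor is left alone.
echo linkifyHtml('This is my text. However this <a href="#text">text</a> stays.', ['text']);
```

Because each text node is visited only once and newly created links are never re-scanned, this also sidesteps the "replaced twice" problem without needing the limit argument.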