A friend is writing an advertisement script that puts links around select phrases in HTML code.
Naturally, if the phrase is already inside an <a> element (or anywhere else a link isn't allowed, such as inside an element's attribute), he doesn't want the script to write out a link, as that would break validation.
He asked me what I thought. After some bumbling around, I'm asking you all what you think.
Just to clarify, the input is a whole blog post in HTML. Example:
<p>This is a short blog post about ponies!</p>
<p>I have <a href="/ponies">written about ponies before</a>.</p>
<p><img src="/media/ponies.jpg" /></p>
For this example, say I want to replace ponies (any case) with <a href="http://www.ponies.com">ponies</a> (but with the original case).
The output from above should read:
<p>This is a short blog post about <a href="http://www.ponies.com">ponies</a>!</p>
<p>I have <a href="/ponies">written about ponies before</a>.</p>
<p><img src="/media/ponies.jpg" /></p>
We don't need full code, but good ideas/regexes are immensely welcome. He's writing this in PHP, but language-neutral is fine.
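To make the desired behaviour concrete, here is a rough sketch of one possible approach. It is in Python rather than PHP, and it is only an illustration under stated assumptions, not a finished solution. Instead of running a regex over the raw HTML, it walks the markup with the standard library's `html.parser`, echoes every tag back unchanged, and only rewrites text nodes that are not inside an `<a>`. The names `PhraseLinker`, `link_phrase` and `SKIP_TAGS` are made up for this example. The list of elements to skip and the whole-word matching are also my assumptions.

```python
import re
from html import escape
from html.parser import HTMLParser

# Assumption: elements whose text should never receive a link. Extend as needed.
SKIP_TAGS = {"a", "script", "style", "textarea", "button"}


class PhraseLinker(HTMLParser):
    """Re-emits the input HTML, linking a phrase only in eligible text nodes."""

    def __init__(self, phrase, url):
        # convert_charrefs=False so entities reach the handlers untouched.
        super().__init__(convert_charrefs=False)
        # Assumption: match whole words only, ignoring case.
        self.pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        self.url = escape(url, quote=True)
        self.skip_depth = 0  # > 0 while inside any element listed in SKIP_TAGS
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        # Emit the tag exactly as written, so attribute values are never touched.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are copied verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # m.group(0) is the matched text, so the original case is preserved.
            data = self.pattern.sub(
                lambda m: '<a href="%s">%s</a>' % (self.url, m.group(0)), data
            )
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&" + name + ";")

    def handle_charref(self, name):
        self.out.append("&#" + name + ";")

    def handle_comment(self, data):
        self.out.append("<!--" + data + "-->")

    def handle_decl(self, decl):
        self.out.append("<!" + decl + ">")


def link_phrase(html, phrase, url):
    """Return html with phrase wrapped in a link wherever that is allowed."""
    parser = PhraseLinker(phrase, url)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling `link_phrase(post_html, "ponies", "http://www.ponies.com")` on the example post links only the first occurrence and leaves the existing link and the `img` attribute alone. The same idea should carry over to PHP with a DOM parser: visit text nodes, skip those with an `<a>` ancestor, and never look at attributes. A phrase that is split by an entity or by inline markup would be missed by this sketch.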