I have a block of code that will take a block of text like the following:

Sample text sample text http://www.google.com sample text

Using the preg_replace_callback method and the following regular expression:

preg_replace_callback('/http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/',
    create_function(
        '$matches',
        '$url = $matches[1]; 
        $anchorText = ( strlen($url) > 35 ? substr($url, 0, 35).\'...\' : $url); 
        return \'<a href="http://\'. $url .\'">\'. $anchorText .\'</a>\';'),
    $str);

Will convert the sample text to look like:

Sample text sample text <a href="http://www.google.com">http://www.google.com</a> sample text

My problem now is that we have introduced a rich text editor that can create links before the text is sent to the script. I need to update this piece of code so that it ignores any URLs that are already inside an <a> tag.

A: 

Add code to the beginning of the pattern to capture an opening anchor tag, and then do not perform the callback code when it has captured something:

/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/

You will then need to add an if to your lambda function to check whether there is anything in $matches[1]. (Don't forget to increment your capture indexes too: the URL is now in $matches[2].)
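As a rough sketch of what that could look like (not tested against your real input): the function name linkify is made up for illustration, and a closure stands in for create_function, which is deprecated as of PHP 7.2 and removed in PHP 8. The pattern and the 35-character truncation are taken from the question.

```php
<?php
// Hypothetical helper name; the pattern is the first expression above.
function linkify($str)
{
    $pattern = '/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)/';

    return preg_replace_callback($pattern, function ($matches) {
        // Group 1 captured an opening <a ...> tag, so this URL is already
        // linked: hand the whole match back unchanged.
        if ($matches[1] !== '') {
            return $matches[0];
        }

        // The URL moved from group 1 to group 2 when the new group was added.
        $url = $matches[2];
        $anchorText = strlen($url) > 35 ? substr($url, 0, 35) . '...' : $url;

        return '<a href="http://' . $url . '">' . $anchorText . '</a>';
    }, $str);
}

// A bare URL gets wrapped; an existing link whose text is the URL is left alone.
echo linkify('Sample text http://www.google.com sample text'), "\n";
echo linkify('<a href="http://www.google.com">http://www.google.com</a>'), "\n";
```

Note that this only covers links whose text starts with the URL right after the opening tag. A link such as <a href="http://www.google.com">Google</a> would still have the URL inside its href matched, so that case would need separate handling.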

You cannot use a negative look behind assertion here as the capture is not a fixed length, but you could use a negative look ahead assertion for the closing tag so it drops the entire match:

/(<a[^>]*>)?http:\/\/([,\%\w.\-_\/\?\=\+\&\~\#\$]+)(?!</a>)/
Will Earp
Your first expression correctly matches, and by just returning $matches[0] when $matches[1] is not blank, I can work around the issue easily. Your second expression however returns: Unknown modifier 'a'
tombazza
Sorry, I forgot to escape the / in </a>, so that will need to be <\/a>; otherwise it thinks the / is ending the pattern and that a is a modifier.
Will Earp