I am new to regex in PHP and understand the basic patterns. However, the ones below are a bit complex, and I don't understand what they match:

$ret = preg_replace("#(^|[\n ])([\w]+?://[\w\#$%&~/.\-;:=,?@\[\]+]*)#... "<a href='' rel='nofollow'></a>", $ret);

$ret = preg_replace("#(^|[\n ])((www|ftp)\.[\w\#$%&~/.\-;:=,?@\[\]+]*... "<a href='http://' rel='nofollow'></a>", $ret);

Could someone please explain them?

Thanks.

+2  A: 

Get RegexBuddy; it explains what any regular expression means (see screenshots). There is another answer here on SO that demonstrates that.

Anyway, judging by the second arguments of the preg_replace calls, they should match URLs and turn them into links.

Török Gábor
+3  A: 

In short: Replace URLs by links.

In detail:

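  1. The first regex describes sequences that begin with one or more word characters ([\w]+?), followed by ://, followed by zero or more characters of the set [\w\#$%&~/.\-;:=,?@\[\]+] (the trailing * also allows none). The sequence must sit at the start of the string or be preceded by a newline or a space ((^|[\n ])).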

    That should probably match a URL beginning with the URL protocol/scheme like http://, https:// or ftp://.

    But it would also match javascript://. And that’s not good: javascript://%0Aalert%28%22booo%21%22%29 equals the JavaScript code:

    //
    alert("booo!")
    
  2. The second regex describes sequences that begin with either www. or ftp., again followed by zero or more characters of the set [\w\#$%&~/.\-;:=,?@\[\]+].

    That should probably match URLs that have no scheme and just begin with www. or ftp. (note the trailing dot). The scheme http:// is then added in front of the URL.
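To make the javascript:// concern from point 1 concrete, here is a minimal sketch that links only an allowlist of schemes. The patterns in the question are truncated, so this is not a reconstruction of the original code: the closing delimiter, the replacement markup, the function name linkify, and the constants URL_CHARS and ALLOWED_SCHEMES are all assumptions made for illustration. Only the character class is copied from the visible part of the question.

```php
<?php
// Character class copied from the visible part of the question's patterns.
// Everything else in this sketch is an assumption.
const URL_CHARS = '[\w\#$%&~/.\-;:=,?@\[\]+]';

// Assumed allowlist of schemes that may be turned into links.
const ALLOWED_SCHEMES = ['http', 'https', 'ftp'];

function linkify(string $text): string
{
    // Same shape as the first regex: start of string, newline, or space,
    // then scheme://, then zero or more characters from the class.
    $pattern = '#(^|[\n ])(\w+?://' . URL_CHARS . '*)#';

    $result = preg_replace_callback($pattern, function (array $m): string {
        // Everything before the first "://" is the scheme.
        $scheme = strtolower(strstr($m[2], '://', true));
        if (!in_array($scheme, ALLOWED_SCHEMES, true)) {
            return $m[0]; // leave schemes such as javascript:// untouched
        }
        // Escape the URL before placing it into the attribute and the text.
        $url = htmlspecialchars($m[2], ENT_QUOTES);
        return $m[1] . "<a href='" . $url . "' rel='nofollow'>" . $url . "</a>";
    }, $text);

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}

// Decoding the payload shows why javascript:// is dangerous:
// %0A is a line break, so the // comment ends and alert(...) would run.
echo rawurldecode("javascript://%0Aalert%28%22booo%21%22%29"), "\n";

echo linkify("See http://example.com/a now"), "\n";
echo linkify("x javascript://%0Aalert%28%22booo%21%22%29"), "\n";
```

With this sketch the http:// URL is wrapped in an anchor tag, while the javascript:// text passes through unchanged. Whether an allowlist of exactly these three schemes fits your application is a judgment call.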

Gumbo