I want a function that takes a text as input and returns it with every URL turned into an HTML link.

My draft is as follows:

function autoLink($text) {
    return preg_replace('/https?:\/\/[\S]+/i', '<a href="\0">\0</a>', $text);
}

But this doesn't work properly.

For the input text which contains ...

http://www.google.de/<br />

... I get the following output:

<a href="http://www.google.de/<br">http://www.google.de/<br</a> />

Why does the match include the line break tag? How can I limit it to the actual URL?

Thanks in advance!

+3  A: 

Well, < is not a whitespace character, so it is matched by [\S]. You can exclude it from your set of accepted characters:

preg_replace('/https?:\/\/[^\s<]+/i', '<a href="\0">\0</a>', $text);
Heinzi
Thank you, this works fine for the given problem. But it would be perfect if other characters like " and > were also excluded. Do I have to add them inside the [] ?
@marco92w: Exactly. Everything inside `[^...]` will be excluded.
Heinzi
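
As a minimal sketch of what the comments above describe, the character class can also exclude " and >. The function name autoLinkStrict and the exact set of excluded characters are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical variant of the question's autoLink(); the name and the
// exact list of excluded characters are assumptions for illustration.
function autoLinkStrict(string $text): string
{
    // [^\s<>"]+ stops the match at whitespace, angle brackets and double quotes.
    // Using ~ as the delimiter avoids having to escape the slashes in "://".
    return preg_replace('~https?://[^\s<>"]+~i', '<a href="\0">\0</a>', $text);
}

echo autoLinkStrict('http://www.google.de/<br />');
// prints: <a href="http://www.google.de/">http://www.google.de/</a><br />
```

The match now ends just before the <, so the <br /> tag is left untouched after the link.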
+1  A: 

How about using Gruber's URL Regex?

\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
Alix Axel
Thanks, interesting page. But for my purpose, the regex from above is enough. What are the advantages of your regex?
@marco92w: The regex is not mine. It's from the same guy who "invented" Markdown. Your regex, for instance, won't autolink `ftp[s]://` or `www.*` (no protocol). Read this and test it: http://daringfireball.net/2009/11/liberal_regex_for_matching_urls
Alix Axel
Thank you very much, now I understand all the advantages.
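
For reference, here is a rough sketch of how the regex quoted above could be plugged into PHP. The function name autoLinkLiberal and the choice to prepend http:// to matches that start with www. are assumptions for illustration; neither comes from the answer or the linked article.

```php
<?php
// Hypothetical helper; the name and the "http://" default for www. links
// are assumptions made for this sketch.
function autoLinkLiberal(string $text): string
{
    // The pattern quoted in the answer, wrapped in ~ delimiters so the
    // slashes inside it need no escaping.
    $pattern = '~\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Matches starting with "www." carry no scheme, so one is added for the href only.
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;
        return '<a href="' . $href . '">' . $url . '</a>';
    }, $text);
}

echo autoLinkLiberal('Visit www.example.com/path or ftp://files.example.org/pub.');
```

With this input, both the www. address and the ftp:// address become links, and the sentence's final period stays outside the second link because the pattern does not allow a match to end in punctuation other than a slash.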