I use this function to turn URLs into clickable links. The problem is that when a URL contains a Unicode (non-ASCII) character, only the part before that character becomes a link.

Function:

function clickable($text) {
$text = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
'<a class="und" href="\\1">\\1</a>', $text);
$text = eregi_replace('([[:space:]()[{}])(www.[-a-zA-Z0-9@:%_\+.~#?&//=]+)',
'\\1<a href="http://\\2">\\2</a>', $text);
$text = eregi_replace('([_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3})',
'<a href="mailto:\\1">\\1</a>', $text);

return $text;

}

How can I fix this?

A: 

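Try using \p{L} instead of a-zA-Z and \p{Ll} instead of a-z. Note that \p{...} is PCRE syntax, so it only works with the preg functions, and the pattern needs the u modifier for UTF-8 input.

You can find details of Unicode handling in regular expressions here

And get in the habit of using the preg functions rather than the deprecated ereg functions.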
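
To illustrate the suggestion above, here is a minimal sketch comparing an ASCII-only character class with one that uses \p{L}. The sample URL and the variable names are made up for the example, and the script is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sample input; any URL with a non-ASCII letter will do.
$text = 'See http://example.com/käse for details';

// ASCII-only class: the match stops at the first non-ASCII letter.
preg_match('!https?://[-a-zA-Z0-9@:%_+.~#?&/=]+!', $text, $ascii);

// \p{L} matches a letter from any script. The u modifier tells PCRE
// to treat both the pattern and the subject as UTF-8.
preg_match('!https?://[-\p{L}0-9@:%_+.~#?&/=]+!u', $text, $unicode);

echo $ascii[0], "\n";   // http://example.com/k
echo $unicode[0], "\n"; // http://example.com/käse
```

The `!` delimiter is used only so that the slashes in the pattern do not need escaping.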

Mark Baker
A: 

First of all, don't use eregi_replace. I don't think it can handle Unicode, and it is deprecated as of PHP 5.3. Use preg_replace instead.

Then you can try something like this:

preg_replace("/(https?|ftps?|mailto):\/\/([-\w\p{L}\.]+)+(:\d+)?(\/([\w\p{L}\/_\.#]*(\?\S+)?)?)?/u", '<a href="$0">$0</a>', $text);

EDIT - updated expression to include # character
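
For comparison, the question's clickable() function could be ported to preg_replace along the following lines. This is only a sketch under a few assumptions: it keeps the original character classes but swaps a-z ranges for \p{L}, it also accepts https:// (which the original pattern did not), and the shared `$url` fragment is a helper introduced here for readability.

```php
<?php
// A sketch, not the original poster's final code: the three eregi_replace
// calls ported to preg_replace with Unicode-aware character classes.
function clickable($text) {
    // Characters allowed in a URL body; \p{L} covers letters of any script.
    $url = '[-\p{L}0-9@:%_+.~#?&/=]+';

    // http://, https:// and ftp:// links (https is an addition here).
    $text = preg_replace(
        '!((?:https?|ftp)://' . $url . ')!iu',
        '<a class="und" href="$1">$1</a>',
        $text
    );
    // Bare www. links preceded by whitespace or a bracket character.
    $text = preg_replace(
        '!([\s()\[{}])(www\.' . $url . ')!iu',
        '$1<a href="http://$2">$2</a>',
        $text
    );
    // E-mail addresses.
    $text = preg_replace(
        '!([_.0-9\p{L}-]+@(?:[0-9\p{L}][0-9\p{L}-]+\.)+\p{L}{2,})!iu',
        '<a href="mailto:$1">$1</a>',
        $text
    );
    return $text;
}

echo clickable('Visit http://example.com/käse#top now'), "\n";
```

The i modifier replaces the case-insensitivity of eregi_replace, and the u modifier requires the input to be valid UTF-8; with invalid input preg_replace returns null, so real code may want to check for that.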

Tomasz Struczyński
Works fine, thanks!!!
Levani
One problem: it breaks when there is a # symbol in the URL. How can I fix that?
Levani
Updated answer.
Tomasz Struczyński