Hi. I'm currently developing a little browser-based Twitter widget.

I'm stuck on getting the URLs to work. I'm fairly new to regex (I know how to extract parts of a string, but this one is tough).

So, I need a regex that would search/replace

www.domain.tld -> <a href="http://www.domain.tld">http://www.domain.tld</a>

Preferably it should match with or without the leading http://.

Any advice is welcome. Thanks.

A: 

This is how far I've got:

www\.(?:\S*)\.(?:\S{2,3})

It checks for www. at the beginning, then any non-whitespace characters, then a top-level domain (two or three characters).
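For the actual search/replace step, here is a minimal sketch of how a pattern along these lines could be plugged into preg_replace_callback. The function name linkify is hypothetical, and the pattern is an assumption that differs from the one above: it allows an optional http:// or https:// prefix, restricts the host to letters, digits and hyphens, and accepts any top-level domain of two or more letters. It is not a full URL validator. A path that ends in punctuation (for example a trailing full stop) would be swallowed into the link.

```php
<?php
// Hypothetical helper, not part of any library: wraps www-style URLs in anchors.
function linkify(string $text): string
{
    // Assumed pattern: optional scheme, "www.", one or more host labels,
    // a TLD of 2+ letters, then an optional path without whitespace or "<".
    $pattern = '~\b(?:https?://)?www\.[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}(?:/[^\s<]*)?~i';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Prepend a scheme when the matched text has none.
        if (!preg_match('~^https?://~i', $url)) {
            $url = 'http://' . $url;
        }
        // Escape once and reuse for both the href and the link text.
        $safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
        return '<a href="' . $safe . '">' . $safe . '</a>';
    }, $text);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

echo linkify('See www.domain.tld or http://www.example.museum/page'), "\n";
```

The callback form is used only because the scheme has to be added conditionally; a plain preg_replace with a backreference would be enough if every URL were known to lack http://.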

Kristaps
.info? .mobi? .museum? You should probably check for that.
Tim Cooper
Kristaps
A: 

I'm in a never-ending war against regexes; I don't like them. So I'd do it like this instead:

function get_domain_from_anchor($anchor, $delimiter = '"') {
    return substr(strstr(strstr($anchor, $delimiter), $delimiter.'>', true), 8);
}

echo get_domain_from_anchor('<a href="http://www.domain.net">http://www.domain.net</a>');

// OUTPUTS: www.domain.net

Much better :D

Sune Rasmussen
I'm sorry, but I need the opposite thing. I have to convert from plaintext URLs to html anchors.
Kristaps
@Kristaps: Your question is a little unclear in that regard ;)
Sune Rasmussen
A: 

I believe this is exactly what you're looking for: http://stackoverflow.com/questions/206059/php-validation-regex-for-url

Some more information regarding extraction of URLs: http://stackoverflow.com/questions/910912/extract-urls-from-text-in-php

Coding District
Kristaps
A: 

Try twitter-text-php. It is ported to PHP from the official Twitter code.

From the README file:

$autolinker = new Twitter_Autolink();
$html = $autolinker->autolink("Tweet mentioning @mikenz and referring to his list @mikeNZ/sports and website http://mikenz.geek.nz");
echo $html;
mcrumley