views:

2855

answers:

4

I am consuming the Twitter API and want to convert all URLs to hyperlinks.

What is the most effective way you've come up with to do this?

from

string myString = "This is my tweet check it out http://tinyurl.com/blah";

to

This is my tweet check it out <a href="http://tinyurl.com/blah"&gt;http://tinyurl.com/&gt;blah&lt;/a&gt;
+7  A: 

Regular expressions are probably your friend for this kind of task:

Regex r = new Regex("(http://[^ ]+)");
myString = r.Replace(myString, "<a href=\"$1\">$1</a>");

The regular expression for matching URLs might need a bit of work.

samjudson
I think that's fine, regular expressions are powerful, yet capturing while non-whitespace is a lot better than trying to implement an URL parser in regex. I would maybe change it to `(https?://[^ ]+)` becuase https is not that uncommon.
John Leidegren
+3  A: 

This is actually an ugly problem. URLs can contain (and end with) punctuation, so it can be difficult to determine where a URL actually ends, when it's embedded in normal text. For example:

http://example.com/.

is a valid URL, but it could just as easily be the end of a sentence:

I buy all my witty T-shirts from http://example.com/.

You can't simply parse until a space is found, because then you'll keep the period as part of the URL. You also can't simply parse until a period or a space is found, because periods are extremely common in URLs.

Yes, regex is your friend here, but constructing the appropriate regex is the hard part.

Check out this as well: Expanding URLs with Regex in .NET.

Derek Park
That was perfect.. :)
TimLeung
I've been buying mine from http://tempuri.org/.
eed3si9n
+4  A: 

I did this exact same thing with jquery consuming the JSON API here is the linkify function:

String.prototype.linkify = function() {
    return this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:%&\?\/.=]+/, function(m) {
     return m.link(m);
    });
 };
RedWolves
+1  A: 

/cheer for RedWolves

from: this.replace(/[A-Za-z]+:\/\/[A-Za-z0-9-]+.[A-Za-z0-9-:%&\?\/.=]+/, function(m){...

see: /[A-Za-z]+:\/\/[A-Za-z0-9-]+.[A-Za-z0-9-:%&\?\/.=]+/

There's the code for the addresses "anyprotocal"://"anysubdomain/domain"."anydomainextension and address",

and it's a perfect example for other uses of string manipulation. you can slice and dice at will with .replace and insert proper "a href"s where needed.

I used jQuery to change the attributes of these links to "target=_blank" easily in my content-loading logic even though the .link method doesn't let you customize them.

I personally love tacking on a custom method to the string object for on the fly string-filtering (the String.prototype.linkify declaration), but I'm not sure how that would play out in a large-scale environment where you'd have to organize 10+ custom linkify-like functions. I think you'd definitely have to do something else with your code structure at that point.

Maybe a vet will stumble along here and enlighten us.

winfred