tags:

views:

1223

answers:

2

How can I parse text and find all instances of hyperlinks with a string? The hyperlink will not be in the html format of test but just http://test.com

Secondly, I would like to then convert the original string and replace all instances of hyperlinks into clickable html hyperlinks.

I found an example in this thread:

http://stackoverflow.com/questions/32637/easiest-way-to-convert-a-url-to-a-hyperlink-in-a-c-string

but was unable to reproduce it in python :(

+8  A: 

Here's a Python port of http://stackoverflow.com/questions/32637/easiest-way-to-convert-a-url-to-a-hyperlink-in-a-c-string/32648#32648:

import re

myString = "This is my tweet check it out http://tinyurl.com/blah"

r = re.compile(r"(http://[^ ]+)")
print r.sub(r'<a href="\1">\1</a>', myString)

Output:

This is my tweet check it out <a href="http://tinyurl.com/blah"&gt;http://tinyurl.com/blah&lt;/a&gt;
maxyfc
Thanks just the tip to get me started! Let me give it a try and understand it...
TimLeung
You are welcome.
maxyfc
It can be improved by adding support for https or ftp URLs... Also, I believe the scheme (http) is case-INsensitive.
bortzmeyer
A: 

Here is a much more sophisticated regexp from 2002.

dfrankow