I'm trying to substitute something in a string in Python and am having some trouble. Here's what I'd like to do.

For a given comment in my posting:

"here are some great sites that i will do cool things with! http://stackoverflow.com/it's a pig & http://google.com"

I'd like to use Python to turn the string into this:

"here are some great sites that i will do cool things with! <a href="http://stackoverflow.com">http%3A//stackoverflow.com</a> & <a href="http://google.com">http%3A//google.com</a>"

Here's what I have so far...

import re
import urllib

def getExpandedURL(url):
    # Percent-encode the URL for use as the link text
    encoded_url = urllib.quote(url)
    return "<a href=\"" + url + "\">" + encoded_url + "</a>"

text = '<text from above>'
url_pattern = re.compile('(http.+?[^ ]+)', re.I | re.S | re.M)
url_iterator = url_pattern.finditer(text)
for matched_url in url_iterator:
    # The result is built but never written back into text
    getExpandedURL(matched_url.group(1))

But this is where I'm stuck. I've previously seen questions on here like "Regular Expressions but for Writing in the Match", but surely there's got to be a better way than iterating through each match and doing a positional replace on each one. The difficulty is that it's not a straight replace: I need to do something specific with each match before replacing it.

+3  A: 

I think you want url_pattern.sub(getExpandedURL, text).

re.sub(pattern, repl, string, count=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a callable, it's passed the match object and must return a replacement string to be used.
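Since the callable receives a match object rather than the URL string, the function from the question would need a small change before being handed to sub. Here is a minimal sketch of that idea, not a definitive implementation. It assumes Python 3, where quote lives in urllib.parse (the question's Python 2 code uses urllib.quote). The names URL_PATTERN and expand_url, the simplified pattern, and the shortened sample text are illustrative assumptions, not taken from the question.

```python
import re
from urllib.parse import quote  # Python 3 home of the question's urllib.quote

# Assumed, simplified pattern: "http://" or "https://" followed by
# everything up to the next whitespace character.
URL_PATTERN = re.compile(r"https?://\S+", re.I)

def expand_url(match):
    """Build the replacement text from the match object that sub passes in."""
    url = match.group(0)  # the full matched URL
    # quote() leaves "/" alone by default but percent-encodes ":".
    return '<a href="{0}">{1}</a>'.format(url, quote(url))

# Shortened sample text (an assumption, not the comment from the question).
text = "sites: http://stackoverflow.com & http://google.com"

# sub calls expand_url once per match and splices in each return value.
linked = URL_PATTERN.sub(expand_url, text)
print(linked)
```

Because sub does the splicing itself, there is no need to track match positions by hand. Text that contains no matches comes back unchanged.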

Darius Bacon
I think this is the winner! I'll try it out right now.
aronchick