I have a regexp for Twitter profile URLs and someone's Twitter profile URL, so I can easily extract the username from the URL.

>>> import re
>>> twitter_re = re.compile(r'twitter\.com/(?P<username>\w+)/')
>>> twitter_url = 'twitter.com/dir01/'
>>> username = twitter_re.search(twitter_url).group('username')
>>> username
'dir01'

But if I have the regexp and a username, how do I get the URL?

A: 

If you are not looking for a general solution to convert any regex into a formatting string, but something that you can hardcode:

twitter_url = 'twitter.com/%(username)s/' % {'username': 'dir01'}

...should give you what you need.

If you want a more general (but not incredibly robust) solution:

import re

def format_to_re(format):
    # Replace Python string formatting syntax with named group re syntax.
    return re.compile(re.sub(r'%\((\w+)\)s', r'(?P<\1>\w+)', format))

twitter_format = 'twitter.com/%(username)s/'
twitter_re = format_to_re(twitter_format)

m = twitter_re.search('twitter.com/dir01/')
print(m.groupdict())
print(twitter_format % m.groupdict())

Gives me:

{'username': 'dir01'}
twitter.com/dir01/

And finally, the slightly larger and more complete solution that I have been using myself can be found in the Pattern class here.
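
For the direction the question actually asks about (from a regexp to a URL), here is a minimal sketch. It assumes the pattern only uses simple named groups with no nested parentheses, and the helper name `re_to_format` is made up for illustration. It is not a general regex inverter.

```python
import re

# Matches a simple named group such as (?P<username>\w+).
# Assumption: the group body contains no parentheses of its own.
NAMED_GROUP = re.compile(r'\(\?P<(\w+)>[^()]*\)')

def re_to_format(pattern):
    """Turn a simple named-group regex into a %-style format string."""
    # Escape literal percent signs so they survive % formatting.
    fmt = pattern.replace('%', '%%')
    # Swap every named group for a %(name)s placeholder.
    fmt = NAMED_GROUP.sub(r'%(\1)s', fmt)
    # Drop backslash escapes on literal characters, e.g. "\." becomes ".".
    return re.sub(r'\\(.)', r'\1', fmt)

twitter_pattern = r'twitter\.com/(?P<username>\w+)/'
twitter_format = re_to_format(twitter_pattern)
print(twitter_format)                          # twitter.com/%(username)s/
print(twitter_format % {'username': 'dir01'})  # twitter.com/dir01/
```

Anything beyond literal text and named groups (alternation, quantifiers on literals, character classes outside a group) would need a real regex parser, so keeping the format string as the source of truth is usually the safer design.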

Mike Boers
But I am looking for a general solution.
dir01
@dir01: I have added a couple of more general solutions. The final one may be complete overkill but it may do what you want.
Mike Boers
Generating regexp from string formatting! So cute! )))
dir01
@dir01: You could also easily modify it to match the different formatting options properly...
Mike Boers
A: 

Why do you need a regex for that? Just concatenate the strings.

base_url = "twitter.com/"
twt_handle = "dir01"
twit_url = base_url + twt_handle
Amarghosh
A: 

Regexes are not a two-way street. You can use them to parse strings, but not to generate strings back from the result. You should probably look into another way of getting the URLs back, such as basic string interpolation or URI templates (see http://code.google.com/p/uri-templates/).

wvanbergen
That’s not true – regular expressions are strictly a shorthand form of regular generative grammars which, as the name says, are used to *generate* strings belonging to a language. And while most regex engines only support parsing, there are other libraries that support generation.
Konrad Rudolph
So what I want is just impossible, all right )
dir01
@Konrad Rudolph: Though with a generation rule as narrow as "twitter URLs from usernames" such a library would certainly be the wrong approach. ;-)
Tomalak
@dir01: You *really* should explain what it is that you want. I am sure that what you want and what you ask here are two very different things.
Tomalak
@Tomalak: What I am trying to build is a clone of django-elsewhere, a Django app that provides info about the other web services a user is registered on. Users should be able to provide either a username or a URL, and some sites have more than one URL form (example.com/id(\d+) or coolname.example.com).
dir01
@dir01: To me that looks like you need string interpolation, in the form of `"<placeholder>.example.com"`, just like @Mike Boers suggested.
Tomalak
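
A minimal sketch of that interpolation approach for the use case described above, where a user may enter either a username or a URL and a site may have several URL forms. The format list, the `example.com` URLs, and the function names are hypothetical, and usernames are assumed to match `\w+`.

```python
import re

def format_to_re(fmt):
    """Build an anchored regex from a %(name)s format string."""
    # Splitting on the placeholder leaves names at the odd indexes.
    parts = re.split(r'%\((\w+)\)s', fmt)
    pieces = []
    for i, part in enumerate(parts):
        if i % 2:
            pieces.append(r'(?P<%s>\w+)' % part)  # placeholder name
        else:
            pieces.append(re.escape(part))        # literal text
    return re.compile('^' + ''.join(pieces) + '$')

# Hypothetical URL forms for one service.
EXAMPLE_FORMATS = ['example.com/user/%(username)s', '%(username)s.example.com']

def parse_username(value, formats):
    """Accept a profile URL in any known form, or a bare username."""
    for fmt in formats:
        m = format_to_re(fmt).match(value)
        if m:
            return m.group('username')
    # No URL form matched, so treat the input as a username.
    return value

def build_urls(username, formats):
    """Interpolate the username into every known URL form."""
    return [fmt % {'username': username} for fmt in formats]

name = parse_username('coolname.example.com', EXAMPLE_FORMATS)
print(name)
print(build_urls(name, EXAMPLE_FORMATS))
```

Storing the format strings and deriving the regexes from them means each URL form is written once and works in both directions.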