A: 
import urlparse
p = urlparse.urlsplit("john.doe")


=> ('','','john.doe','','')

The first element of the tuple should be "http://", the second element of the tuple should be "www.facebook.com/", and you can leave the fourth and fifth elements of the tuple alone. You can then reassemble your URL after processing it.

Just an FYI, to ensure a safe url segment for 'john.doe' (this may not apply to facebook, but its a good rule to know) use urllib.quote(string) to properly escape whitespace, etc.

Chris G.
A: 

Hello Jackie, I am not very sure if I understood what you asked, but you can try this code, I tested and works fine but If you have trouble with this let me know.

I hope it helps

! /usr/bin/env python

import urlparse

def clean_url(url):

url_list = [] 
# split values into tuple
url_tuple = urlparse.urlsplit(url)

# as tuples are immutable so take this to a list
# so we can change the values that we need
counter = 0
for element in url_tuple:
    url_list.append(element)

# validate each element individually
url_list[0] = 'http'
url_list[1] = 'www.facebook.com'

# get user name from the original url
# ** I understood the user is the only value
# for sure in the url, right??
user = url.split('/')
if len(user) == 1:
    # the user was the only value sent
    url_list[2] = user[0]
else:
    # get the last element of the list
    url_list[2] = user[len(user)-1]

# convert the list into a tuple and
# get all the elements together in the url again
new_url = urlparse.urlunsplit(tuple(url_list))       

return new_url    

if name == 'main': print clean_url("http://www.facebook.com/john.doe") print clean_url("http://facebook.com/john.doe") print clean_url("facebook.com/john.doe") print clean_url("www.facebook.com/john.doe") print clean_url("john.doe")

Carlos Balderas
+1  A: 

I know this answer is a little late to the party, but if this is exactly what you're trying to do, I recommend a slightly different approach. Rather than reinventing the wheel for canonicalizing facebook urls, consider using the work that Google has already done for use with their Social Graph API.

They've already implemented patterns for a number of similar sites, including facebook. More information on that is here:

http://code.google.com/p/google-sgnodemapper/

Paul McMillan