ansaurus

Question

Python - Combining a url with urlunparse

Answer 1

+1 A:

The problem is that when parsing the very incomplete URL www.python.org, the string you give is taken as the path component of the URL, with both the netloc (network location) and the scheme left empty. To default the scheme you can pass a second parameter, scheme, to urlparse (simplifying your logic), but that doesn't help with the "empty netloc" problem. So you need some logic for that case, e.g.

if not netloc:
    netloc, path = path, ''
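
Put together, the whole thing might look something like the sketch below. It assumes Python 3, where urlparse and urlunparse live in urllib.parse (in Python 2 they were in the urlparse module). The helper name normalize_url and its default_scheme parameter are made up for illustration, and it only handles the bare-hostname case discussed here.

```python
from urllib.parse import urlparse, urlunparse


def normalize_url(url, default_scheme="http"):
    """Hypothetical helper: give a bare host like 'www.python.org' a scheme and netloc."""
    # The second argument to urlparse supplies the scheme when the URL has none.
    scheme, netloc, path, params, query, fragment = urlparse(url, default_scheme)
    if not netloc:
        # A bare hostname was parsed as the path, so move it into the netloc slot.
        netloc, path = path, ""
    return urlunparse((scheme, netloc, path, params, query, fragment))


print(normalize_url("www.python.org"))                # http://www.python.org
print(normalize_url("http://www.python.org/about"))   # http://www.python.org/about
```

A URL that already has a scheme and netloc passes through unchanged, because the swap only happens when the netloc is empty.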

Alex Martelli 2010-09-26 14:55:10

That makes perfect sense: it's assuming that the netloc exists even though it's an empty string, and concatenating the extra / that should be there. Your solution works! Thanks for the quick response.

Ben 2010-09-26 15:00:46

@Ben, you're welcome!

Alex Martelli 2010-09-26 15:09:48

@Ben, you should click the checkmark to the left of this answer to mark it as accepted =)

katrielalex 2010-09-26 15:44:47

Answer 2

A:

It's because urlparse is interpreting "www.python.org" not as the hostname (netloc), but as the path, just as a browser would if it encountered that string in an href attribute. Then urlunparse seems to interpret scheme "http" specially. If you put in "x" as the scheme, you'll get "x:www.python.org".
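
A quick way to see this for yourself is sketched below, assuming Python 3's urllib.parse. The "x" scheme is just the made-up one from the paragraph above. The "http" result is printed rather than stated, since what urlunparse does with an empty netloc for that scheme may depend on your Python version.

```python
from urllib.parse import urlparse, urlunparse

# With no scheme and no "//", the whole string lands in the path component.
parts = urlparse("www.python.org")
print(repr(parts.scheme), repr(parts.netloc), repr(parts.path))
# '' '' 'www.python.org'

# An unknown scheme is simply joined to the path with a colon.
print(urlunparse(("x", "", "www.python.org", "", "", "")))
# x:www.python.org

# "http" is a scheme urllib knows takes a netloc; inspect the result on your version.
print(urlunparse(("http", "", "www.python.org", "", "", "")))
```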

I don't know what range of inputs you're dealing with, but it looks like you might not want urlparse and urlunparse.

Ned Batchelder 2010-09-26 14:56:17
