Python urlparse, correct or incorrect?

Python's urlparse function parses a URL into six components (scheme, netloc, path, params, query and fragment).

Now I've found that parsing "example.com/path/file.ext" returns no netloc, only a path of "example.com/path/file.ext".
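For reference, the behaviour can be reproduced with a few lines. This is a minimal sketch assuming Python 3, where the function lives in urllib.parse (in Python 2 it is in the urlparse module); the variable names are only illustrative.

```python
from urllib.parse import urlparse

# No scheme and no leading "//": the whole string ends up in .path
bare = urlparse("example.com/path/file.ext")
print(repr(bare.netloc), repr(bare.path))
# '' 'example.com/path/file.ext'

# With a scheme, the host is recognised as the netloc
full = urlparse("http://example.com/path/file.ext")
print(repr(full.netloc), repr(full.path))
# 'example.com' '/path/file.ext'
```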

Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?

Do we really need a "://" to determine whether or not a netloc exists?

Python's ticket: http://bugs.python.org/issue8284

+1 A:

example.com/path/file.ext is not a URL. It's just a string. For example, if you put <a href="example.com/path/file.ext"> into an HTML page, it will not link to http://example.com/path/file.ext. Letting you omit the http:// prefix is just a shortcut provided by web browsers. You cannot even pass such a string as the URL argument to urllib2.urlopen() and similar functions.
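The href case can be illustrated with urljoin, which resolves a relative reference against a base URL in much the same way a browser resolves a link. This is a sketch assuming Python 3's urllib.parse; the page URL is a made-up example, not something from the question.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the <a href="..."> tag
page_url = "http://www.example.org/docs/index.html"

# The scheme-less string is treated as a relative path, not as a host
resolved = urljoin(page_url, "example.com/path/file.ext")
print(resolved)
# http://www.example.org/docs/example.com/path/file.ext
```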

Messa 2010-04-01 22:05:58

But then you could have something like <base href="http://">, and then something like <a href="example.com/path/file.ext">example</a> would be correct.

Ben 2010-04-01 22:23:35

+2 A:

Without the scheme://, there's no guarantee that example.com is a domain. You could have a directory called example.com. Similarly, given a URL like 'omfgroflmao/path/file.ext', how would you know whether 'omfgroflmao' is a machine on the local network (i.e. a netloc) or a path component?

I can't see that the Python code is actually wrong, but perhaps the documentation needs to spell out explicitly the behaviour in such ambiguous circumstances (I haven't checked).
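As a rough illustration, urlparse only needs a leading "//" (not a full scheme) to recognise a netloc, so a caller who knows the input is meant to be a host can say so explicitly. The sketch below assumes Python 3's urllib.parse; parse_assuming_host is a hypothetical helper, not part of the standard library, and its "http" default is just an assumption.

```python
from urllib.parse import urlparse

def parse_assuming_host(url, default_scheme="http"):
    """Hypothetical helper: treat a scheme-less string as host + path."""
    # Only add "//" when the caller gave neither a scheme nor a "//" prefix
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    # The scheme argument is used only when the URL itself has no scheme
    return urlparse(url, scheme=default_scheme)

# A leading "//" is enough for urlparse to pick out the netloc
network_path = urlparse("//example.com/path/file.ext")
print(repr(network_path.netloc), repr(network_path.path))
# 'example.com' '/path/file.ext'

# The helper makes the guess explicit: the first segment is taken as a host
guessed = parse_assuming_host("omfgroflmao/path/file.ext")
print(guessed.scheme, guessed.netloc, guessed.path)
# http omfgroflmao /path/file.ext
```

The guess is the caller's decision here: the same string could equally be a relative path, which is why urlparse does not make that choice on its own.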

Vinay Sajip 2010-04-01 22:06:01
