Python's urlparse function parses a URL into six components (scheme, netloc, path, params, query and fragment).

Now I've found that parsing "example.com/path/file.ext" returns no netloc, only the path "example.com/path/file.ext".
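The behaviour described here can be reproduced with a short sketch. It assumes Python 3, where the old urlparse module lives in urllib.parse; the variable names bare and full are only illustrative.

```python
from urllib.parse import urlparse

# The string from the question, with no scheme and no leading "//".
bare = urlparse("example.com/path/file.ext")

# The same address with an explicit scheme, for comparison.
full = urlparse("http://example.com/path/file.ext")

# Without "//", the whole string ends up in .path and .netloc stays empty.
print(repr(bare.netloc), repr(bare.path))

# With "http://", the host goes to .netloc and the rest to .path.
print(repr(full.netloc), repr(full.path))
```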

Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?

Do we really need a "://" to determine whether or not a netloc exists?

Python's ticket: http://bugs.python.org/issue8284

+1  A: 

example.com/path/file.ext is not a URL. It's just a string. For example, if you put <a href="example.com/path/file.ext"> into an HTML page, it will not link to http://example.com/path/file.ext. Not having to prepend http:// is just a shortcut provided by web browsers. You cannot even pass such a string as a parameter to urllib2.urlopen() and similar functions.
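To illustrate the href point, here is a rough sketch using urljoin from Python 3's urllib.parse to resolve the string against the address of a page. The page address http://somesite.test/dir/page.html is a made-up placeholder, not something from the question.

```python
from urllib.parse import urljoin

# Hypothetical address of the HTML page that contains the link.
page_url = "http://somesite.test/dir/page.html"

# Resolve the href value the same way a relative link is resolved.
resolved = urljoin(page_url, "example.com/path/file.ext")

# "example.com" is treated as a directory name under /dir/, not as a host.
print(resolved)
```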

Messa
But then you could have something like <base href="http://">, and then something like <a href="example.com/path/file.ext">example</a> would be correct.
Ben
+2  A: 

Without the scheme://, there's no guarantee that example.com is a domain. You could have a directory called example.com. Similarly, given a URL like 'omfgroflmao/path/file.ext', how would you know whether 'omfgroflmao' is a machine on the local network (i.e. a netloc) or a path component?

I can't see that the Python code is actually wrong, but perhaps the documentation needs to spell out explicitly the behaviour in such ambiguous circumstances (I haven't checked).
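If the caller already knows the string starts with a host name, one possible workaround is to prepend "//" before parsing, since urlparse only recognizes a netloc when it is introduced by "//". The sketch below assumes Python 3's urllib.parse; parse_assuming_host and its default_scheme parameter are hypothetical names made up for this example, not part of the standard library.

```python
from urllib.parse import urlparse


def parse_assuming_host(text, default_scheme="http"):
    """Parse text, treating a scheme-less string as starting with a host.

    Hypothetical helper: it encodes the caller's assumption that the
    first path segment is a host name, which urlparse cannot know.
    """
    parsed = urlparse(text)
    if not parsed.scheme and not parsed.netloc:
        # "//" marks what follows as a network location.
        text = "//" + text
    # The scheme argument is only used when the string has no scheme.
    return urlparse(text, scheme=default_scheme)


result = parse_assuming_host("example.com/path/file.ext")
print(result.scheme, result.netloc, result.path)
```

This only moves the guess into the caller's code: a relative path such as a directory named example.com would be misread as a host, which is exactly the ambiguity described above.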

Vinay Sajip