My application creates custom URIs (or URLs?) to identify objects and resolve them. The problem is that Python's urlparse module refuses to parse unknown URL schemes like it parses http.
If I do not adjust urlparse's uses_* lists I get this:
>>> urlparse.urlparse("qqqq://base/id#hint")
('qqqq', '', '//base/id#hint', '', '', '')
>>> url...
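For comparison, here is a minimal sketch, assuming Python 3, where the urlparse module became urllib.parse. There, urlparse itself splits out the netloc and fragment for any scheme, so the custom scheme above should parse without touching the uses_* lists:

```python
from urllib.parse import urlparse

# Assumption: Python 3 (urllib.parse replaces the Python 2 urlparse module).
parts = urlparse("qqqq://base/id#hint")

print(parts.scheme)    # qqqq
print(parts.netloc)    # base
print(parts.path)      # /id
print(parts.fragment)  # hint
```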
Python's urlparse function parses a URL into six components (scheme, netloc, path, and other stuff).
Now I've found that parsing "example.com/path/file.ext" returns no netloc but a path of "example.com/path/file.ext".
Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?
Do we really need a "://" to determine whether or not a ...
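A small sketch of the behaviour described here, assuming Python 3's urllib.parse: without a leading "//" the whole string is treated as a path, while a leading "//" (even with no scheme) is enough for the host to land in netloc.

```python
from urllib.parse import urlparse

# No "//" marker: everything ends up in the path component.
no_slashes = urlparse("example.com/path/file.ext")

# Leading "//" marks the start of the network location.
with_slashes = urlparse("//example.com/path/file.ext")

print(repr(no_slashes.netloc), repr(no_slashes.path))
print(repr(with_slashes.netloc), repr(with_slashes.path))
```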
Hello,
I am writing a small crawler that extracts links from some 5 to 10 sites. While collecting the links I am getting some URLs like this:
../tets/index.html
If it were /test/index.html, we could join it with the base URL to get http://www.example.com/test/index.html.
What can I do with this kind of URL?
...
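One possible approach, sketched here under the assumption of Python 3's urllib.parse, is to resolve each link against the URL of the page it was found on with urljoin. The page URL below is hypothetical and only there for illustration:

```python
from urllib.parse import urljoin

# Hypothetical URL of the page the links were extracted from.
page_url = "http://www.example.com/section/page.html"

# "../" climbs one directory up from the page's own directory.
relative = urljoin(page_url, "../tets/index.html")

# A leading "/" is resolved against the site root.
rooted = urljoin(page_url, "/test/index.html")

print(relative)  # http://www.example.com/tets/index.html
print(rooted)    # http://www.example.com/test/index.html
```

Links that are already absolute pass through urljoin unchanged, so the same call can be applied to every extracted link.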
Hi, I'm new to Python, so forgive me if this seems a little obvious, but I can't see that it's been asked before.
I'm writing something to 'clean' a URL. In this case all I'm trying to do is return a faked scheme as urlopen won't work without one. However, if I test this with 'www.python.org' it'll return http:///www.python.org. Does anyon...
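A rough sketch of one way to add a default scheme, assuming Python 3's urllib.parse. The helper name ensure_scheme is made up for this example. It prepends the scheme to the raw string instead of rebuilding the URL from parsed parts, because a bare 'www.python.org' parses as a path with an empty netloc:

```python
from urllib.parse import urlparse


def ensure_scheme(url, default_scheme="http"):
    """Return url with default_scheme prepended if it has no scheme.

    Hypothetical helper for illustration. It does not handle
    scheme-less "host:port" strings, which are ambiguous to parse.
    """
    if urlparse(url).scheme:
        return url  # already has a scheme, leave it alone
    if url.startswith("//"):
        return default_scheme + ":" + url  # scheme-relative form
    return default_scheme + "://" + url


print(ensure_scheme("www.python.org"))         # http://www.python.org
print(ensure_scheme("http://www.python.org"))  # http://www.python.org
```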