urllib

How would one log into a phpBB3 forum through a Python script using urllib, urllib2 and ClientCookie?

(ClientCookie is a module for (automatic) cookie-handling: http://wwwsearch.sourceforge.net/ClientCookie)
    # I encode the data I'll be sending:
    data = urllib.urlencode({'username': 'mandark', 'password': 'deedee'})
    # And I send it and read the page:
    page = ClientCookie.urlopen('http://www.forum.com/ucp.php?mode=login', data)
    output = pa...

urllib.urlopen works but urllib2.urlopen doesn't

I have a simple website I'm testing. It's running on localhost and I can access it in my web browser. The index page is simply the word "running". urllib.urlopen will successfully read the page but urllib2.urlopen will not. Here's a script which demonstrates the problem (this is the actual script and not a simplification of a differe...

How to unquote a urlencoded unicode string in python?

I have a unicode string like "Tanım" which is encoded as "Tan%u0131m" somehow. How can I convert this encoded string back to the original unicode? Apparently urllib.unquote does not support unicode. ...
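One possible approach, sketched here for Python 3 rather than the Python 2 of the question: the "%uXXXX" form is a non-standard escape that the standard unquote function does not decode, so it can be handled separately with a regular expression. The helper name unquote_u is hypothetical, and the sketch does not combine surrogate pairs.

```python
import re
from urllib.parse import unquote

# Matches either one non-standard %uXXXX escape or a run of standard %XX escapes.
_ESCAPE = re.compile(r"%u([0-9a-fA-F]{4})|((?:%[0-9a-fA-F]{2})+)")


def unquote_u(text):
    """Decode %uXXXX escapes and ordinary %XX escapes in a single pass.

    Hypothetical helper: %uXXXX becomes the code point XXXX, while runs of
    %XX escapes are passed to urllib.parse.unquote (decoded as UTF-8).
    """
    def replace(match):
        if match.group(1) is not None:
            return chr(int(match.group(1), 16))
        return unquote(match.group(2))

    return _ESCAPE.sub(replace, text)


decoded = unquote_u("Tan%u0131m")
```

Doing both kinds of escape in one pass avoids decoding the same text twice, which could otherwise turn an escaped percent sign into a new escape.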

How do I get data from stdin using os.system()

The only reliable method that I have found for using a script to download text from Wikipedia is with cURL. So far the only way I have for doing that is to call os.system(). Even though the output appears properly in the Python shell, I can't seem to get the function to return anything other than the exit code (0). Alternately somebody co...
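os.system() returns only the exit status, which matches the behaviour described above. A minimal sketch of capturing the command's output instead uses the subprocess module. The helper name run_and_capture is hypothetical, and a Python one-liner stands in for the curl call so the example runs without network access.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return what it wrote to standard output, as text."""
    completed = subprocess.run(args, stdout=subprocess.PIPE, check=True)
    return completed.stdout.decode("utf-8")


# Stand-in command (an assumption for illustration). With curl installed,
# the argument list would look like ["curl", "-s", url] instead.
output = run_and_capture([sys.executable, "-c", "print('hello')"])
```

Passing the command as a list avoids shell quoting problems, and check=True raises an exception if the command exits with a non-zero status.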

Download from EXPLOSM.net Comics Script [Python]

So I wrote this short script (correct word?) to download the comic images from explosm.net comics because I somewhat-recently found out about it and I want to...put it on my iPhone...3G. It works fine and all. urllib2 for getting webpage html and urllib for image.retrieve() Why I posted this on SO: how do I optimize this code? Would RE...

How to download a file over http with authorization in python 3.0, working around bugs?

I have a script that I'd like to continue using, but it looks like I either have to find some workaround for a bug in Python 3, or downgrade back to 2.6, and thus have to downgrade other scripts as well... Hopefully someone here has already managed to find a workaround. The problem is that due to the new changes in Python 3.0 regard...

Python 3.0 urllib.parse error "Type str doesn't support the buffer API"

  File "/usr/local/lib/python3.0/cgi.py", line 477, in __init__
    self.read_urlencoded()
  File "/usr/local/lib/python3.0/cgi.py", line 577, in read_urlencoded
    self.strict_parsing):
  File "/usr/local/lib/python3.0/urllib/parse.py", line 377, in parse_qsl
    pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
TypeError: T...

Python error when using urllib.open

When I run this:
    import urllib
    feed = urllib.urlopen("http://www.yahoo.com")
    print feed
I get this output in the interactive window (PythonWin):
    <addinfourl at 48213968 whose fp = <socket._fileobject object at 0x02E14070>>
I'm expecting to get the source of the above URL. I know this has worked on other computers (like the ones ...

Using Urllib with TOR

How can I route urllib requests through the TOR network? I have not been able to find any decent examples on the internet, can anyone help me? ...

What is the best way to decompress a gzip'ed server response in Python 3?

I had expected this to work:
    >>> import urllib.request as r
    >>> import zlib
    >>> r.urlopen( r.Request("http://google.com/search?q=foo", headers={"User-Agent": "Mozilla/5.0 (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11", "Accept-Encoding": "gzip"}) ).read()
    b'af0\r\n\x1f\x8b\x08...(long binary string)'
    >>> zlib.decompress(_)
    Traceb...
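A minimal sketch of one likely part of the problem: zlib.decompress() expects a zlib header by default, while a gzip response carries a gzip header, so the wbits argument has to say so. The helper name decompress_gzip is hypothetical, and the body is generated locally instead of fetched. The leading 'af0\r\n' in the excerpt appears to be chunked-transfer framing, which this sketch assumes has already been removed.

```python
import gzip
import zlib


def decompress_gzip(body):
    """Decompress a gzip-framed byte string.

    Adding 16 to MAX_WBITS tells zlib to expect a gzip header and trailer
    instead of the default zlib framing.
    """
    return zlib.decompress(body, 16 + zlib.MAX_WBITS)


# Locally generated stand-in for a gzip-encoded response body.
sample_body = gzip.compress(b"<html>hello</html>")
page = decompress_gzip(sample_body)
```

On Python 3.2 and later, gzip.decompress(body) does the same job for a complete gzip body.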

Python: Downloading a large file to a local path and setting custom http headers

I am looking to download a file from an HTTP URL to a local file. The file is large enough that I want to download it and save it in chunks rather than read() and write() the whole file as a single giant string. The interface of urllib.urlretrieve is essentially what I want. However, I cannot see a way to set request headers when downloadin...

Python/urllib suddenly stops working properly

I'm writing a little tool to monitor class openings at my school. I wrote a Python script that will fetch the current availability of classes from each department every few minutes. The script was functioning properly until the uni's site started returning this: SIS Server is not available at this time Uni must have blocked my server...

Python interface to PayPal - urllib.urlencode non-ASCII characters failing

I am trying to implement PayPal IPN functionality. The basic protocol is as such: The client is redirected from my site to PayPal's site to complete payment. He logs into his account, authorizes payment. PayPal calls a page on my server passing in details as POST. Details include a person's name, address, and payment info etc. I need t...
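The excerpt is cut off before the actual failure, so the following is only a sketch under the assumption that the trouble comes from non-ASCII characters in the posted values. In Python 3, urlencode accepts an encoding argument that controls how text is turned into bytes before percent-escaping. The helper name encode_params and the sample field are hypothetical.

```python
from urllib.parse import urlencode


def encode_params(params, charset="utf-8"):
    """Percent-encode form parameters using an explicit character set."""
    return urlencode(params, encoding=charset)


# Hypothetical field and value, for illustration only.
utf8_query = encode_params({"first_name": "Jos\u00e9"})
cp1252_query = encode_params({"first_name": "Jos\u00e9"}, charset="windows-1252")
```

The charset should match whatever the receiving side expects. The same name produces different escapes under UTF-8 and Windows-1252.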

Turning on debug output for python 3 urllib

In python 2, it was possible to get debug output from urllib by doing
    import httplib
    import urllib
    httplib.HTTPConnection.debuglevel = 1
    response = urllib.urlopen('http://example.com').read()
However, in python 3 it looks like this has been moved to http.client.HTTPConnection.set_debuglevel(level). However, I'm using urllib not http....
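One way this can be approached in Python 3, sketched here without any claim that it is the accepted answer: the handler classes in urllib.request take a debuglevel argument, and an opener built from them passes that level on to the underlying http.client connection. The helper name build_debug_opener is hypothetical.

```python
import urllib.request


def build_debug_opener(level=1):
    """Return an opener whose HTTP(S) connections print debug output."""
    handlers = [urllib.request.HTTPHandler(debuglevel=level)]
    # HTTPSHandler is only defined when Python was built with ssl support.
    if hasattr(urllib.request, "HTTPSHandler"):
        handlers.append(urllib.request.HTTPSHandler(debuglevel=level))
    return urllib.request.build_opener(*handlers)


opener = build_debug_opener()
# opener.open("http://example.com").read() would print the request and
# response headers to standard output; it is left commented out so the
# sketch runs without network access.
```

Calling urllib.request.install_opener(opener) would make plain urllib.request.urlopen calls use the same handlers.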

Form Submission in Python Without Name Attribute

Background: Using urllib and urllib2 in Python, you can do a form submission. You first create a dictionary.
    formdictionary = { 'search' : 'stackoverflow' }
Then you use urlencode method of urllib to transform this dictionary.
    params = urllib.urlencode(formdictionary)
You can now make a url request with urllib2 and pass the var...

Python: get http headers from urllib call?

Does urllib fetch the whole page when a urlopen call is made? I'd like to just read the HTTP response header without getting the page. It looks like urllib opens the HTTP connection and then subsequently gets the actual HTML page... or does it just start buffering the page with the urlopen call?
    import urllib2
    myurl = 'http://bit.ly...

Most memory efficient way to save binary file from the web with Python 2.6?

I'm trying to download (and save) a binary file from the web using Python 2.6 and urllib. As I understand it, read(), readline() and readlines() are the 3 ways to read a file-like object. Since the binary files aren't really broken into newlines, read() and readlines() read the whole file into memory. Is choosing a random read() buffer...
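A minimal sketch of fixed-size chunked copying, written for Python 3 rather than 2.6: shutil.copyfileobj reads from one file-like object and writes to another in blocks, so only one block is held in memory at a time. The helper name save_stream and the 16 KiB chunk size are arbitrary assumptions, and in-memory buffers stand in for the HTTP response and the output file.

```python
import io
import shutil

CHUNK_SIZE = 16 * 1024  # arbitrary block size chosen for illustration


def save_stream(response, out_file, chunk_size=CHUNK_SIZE):
    """Copy a file-like response to out_file one chunk at a time."""
    shutil.copyfileobj(response, out_file, chunk_size)


# In-memory stand-ins for urlopen(...) and open(path, "wb").
source = io.BytesIO(b"\x00\x01\x02" * 20000)
target = io.BytesIO()
save_stream(source, target)
```

The object returned by urlopen has a read() method, so it can be passed as the response argument in the same way.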

How to know if urllib.urlretrieve succeeds?

urllib.urlretrieve returns silently even if the file doesn't exist on the remote http server, it just saves an HTML page to the named file. For example: urllib.urlretrieve('http://google.com/abc.jpg', 'abc.jpg') just returns silently, even if abc.jpg doesn't exist on google.com server, the generated abc.jpg is not a valid jpg file, it'...

Unicode problem Django-Python-URLLIB-MySQL

I am fetching a webpage (http://autoweek.com) and trying to process it, but I am getting an encoding error. Autoweek declares "iso-8859-1" encoding and has the word "Nürburgring" (u with umlaut). I do:
    # -*- encoding: utf-8 -*-
    import urllib
    webpage = urllib.urlopen(feed.crawl_url).read()
    webpage.decode("utf-8")
It gives me the following err...
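The error message is cut off, so this is a sketch of the most likely cause rather than a confirmed diagnosis: bytes produced as ISO-8859-1 are being decoded as UTF-8. The byte for "ü" in ISO-8859-1 is not valid UTF-8 on its own, so decoding with the declared charset is the usual remedy. The helper name decode_page is hypothetical, and the bytes are built locally instead of fetched.

```python
def decode_page(raw_bytes, declared_charset="iso-8859-1"):
    """Decode raw page bytes using the charset the page declares."""
    return raw_bytes.decode(declared_charset)


# Local stand-in for the bytes a server would send for this word.
raw = "N\u00fcrburgring".encode("iso-8859-1")

text = decode_page(raw)

# Decoding the same bytes as UTF-8 fails, which matches the symptom above.
try:
    raw.decode("utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True
```

The "# -*- encoding: utf-8 -*-" line only describes the encoding of the script file itself; it does not affect how downloaded bytes are decoded.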

python urllib, how to watch messages?

How can I watch the messages being sent back and forth on urllib https requests? If it were simple http I would just watch the socket traffic, but of course that won't work for https. Is there a debug flag I can set that will do this?
    import urllib
    params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
    f = urllib.urlopen("https://...