I am using a proxy; the following is the code.
req = urllib2.Request(url)
# run the request for each proxy
# now set the proxy
req.set_proxy(proxy, "http")
req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
req.add_hea...
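A per-request proxy can be set exactly as the snippet starts to do; a minimal sketch follows, using the Python 3 names (urllib2 became urllib.request), with a placeholder URL and proxy address:

```python
# Sketch of routing one request through a proxy; `url` and `proxy_host`
# are placeholder values for illustration.
import urllib.request

url = "http://example.com/"
proxy_host = "127.0.0.1:8080"  # hypothetical proxy address

req = urllib.request.Request(url)
req.set_proxy(proxy_host, "http")  # route only this request via the proxy
req.add_header("User-Agent", "Mozilla/5.0")
# urllib.request.urlopen(req) would now connect to the proxy, not the
# origin server; set_proxy rewrites the request's host accordingly.
```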
Hi,
I'm getting this error:
socket.error: [Errno 54] Connection reset by peer
All I'm trying to do is the following in Python:
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
id = response.read()
Some previous related questions suggested using time.sleep to fiddle with the threads. ...
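"Connection reset by peer" is often transient, so rather than a bare time.sleep, one common pattern is to wrap the urlopen call in a small retry loop with a growing delay. A sketch (the attempt count and delay are arbitrary choices), demonstrated offline with a stand-in for the real request:

```python
# Retry helper: call open_fn(), retrying on socket.error with a short,
# growing back-off between attempts.
import socket
import time

def fetch_with_retry(open_fn, attempts=3, delay=0.01):
    for i in range(attempts):
        try:
            return open_fn()
        except socket.error:
            if i == attempts - 1:
                raise                      # out of retries: re-raise
            time.sleep(delay * (i + 1))    # back off a little more each time

# Offline demonstration: a callable that fails twice, then succeeds,
# standing in for urllib2.urlopen(req).read().
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise socket.error("Connection reset by peer")
    return "id-123"

result = fetch_with_retry(flaky)
```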
My python application makes many http requests to many urls using urllib2. I would like to build a unit test suite to test my data parsing and error handling code.
I have a directory full of test data, with a number of files, each file containing a single http response, with headers and response data. (using curl -i) In some cases, t...
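One way to drive such a suite without the network: parse each `curl -i` capture into (status line, headers, body) and feed the body straight to the parsing code under test. A minimal sketch of the splitting step:

```python
# Split a raw `curl -i` capture into its status line, header dict, and body.
def parse_curl_capture(raw):
    head, _, body = raw.partition("\r\n\r\n")   # blank line ends the headers
    lines = head.split("\r\n")
    status = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return status, headers, body

sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status, headers, body = parse_curl_capture(sample)
```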
I want to make a python script that tests the bandwidth of a connection. I am thinking of downloading/uploading a file of a known size using urllib2, and measuring the time it takes to perform this task. I would also like to measure the delay to a given IP address, such as is given by pinging the IP. Is this possible using urllib2?
...
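The download half is possible: time the transfer of a known-size payload and divide. urllib2 itself cannot send ICMP pings (that needs raw sockets or the system `ping` tool). A sketch of the timing and arithmetic, with the conversion checked offline:

```python
# Measure download throughput by timing a urlopen of a known-size file.
import time
import urllib.request   # urllib2's Python 3 successor

def throughput_mbps(num_bytes, seconds):
    """Convert a transfer of num_bytes over `seconds` into megabits/s."""
    return (num_bytes * 8) / (seconds * 1_000_000)

def timed_download(url):
    start = time.time()
    data = urllib.request.urlopen(url).read()
    return throughput_mbps(len(data), time.time() - start)

# Offline check of the arithmetic: 1,000,000 bytes in 2 s = 4 Mbit/s.
rate = throughput_mbps(1_000_000, 2.0)
```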
Hi, can anyone point out a tutorial that shows me how to do a POST request using urllib2 with the data being in JSON format?
...
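The short version: serialise the payload with json.dumps, pass it as the request body, and set Content-Type yourself, since urllib2 won't. A sketch with the Python 3 names (urllib2 became urllib.request) and a placeholder URL:

```python
# Build a JSON POST request; giving the request a body makes it a POST.
import json
import urllib.request

payload = {"name": "test", "value": 1}
req = urllib.request.Request(
    "http://example.com/api",                       # placeholder URL
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it; building it is enough here.
```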
Hi,
I am trying to use urllib2 to open a URL and to send specific cookie text to the server. E.g. I want to open the site Solve chess problems with a specific cookie, e.g. search=1. How do I do it?
I am trying to do the following:
import urllib2
(need to add cookie to the request somehow)
urllib2.urlopen("http://chess-problems.prg")
Tha...
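Sending a fixed cookie is just a matter of adding a Cookie header to the request before opening it. A sketch with the Python 3 names (urllib2 became urllib.request), using the URL and cookie value from the question:

```python
# Attach a literal Cookie header to one request.
import urllib.request

req = urllib.request.Request("http://chess-problems.prg")
req.add_header("Cookie", "search=1")
# urllib.request.urlopen(req) would now send the cookie along.
```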
Here's my problem:
import urllib2
response=urllib2.urlopen('http://proxy-heaven.blogspot.com/')
html=response.read()
print html
It's just this site, and I don't know why the result is all garbled characters. Can anyone help?
...
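One plausible cause of garbled output like this is a gzip-compressed body: some servers compress the response even when the client did not ask for it, and urllib2 does not decompress for you. Checking the Content-Encoding response header and gunzipping would be the fix; the decompression step, demonstrated offline:

```python
# Gunzip a response body when the server marks it gzip-encoded.
import gzip
import io

def maybe_decompress(body, content_encoding):
    if content_encoding == "gzip":
        return gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    return body

# Offline demonstration: round-trip a small payload.
compressed = gzip.compress(b"<html>hello</html>")
html = maybe_decompress(compressed, "gzip")
```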
I have a strange bug when trying to urlopen a certain page from Wikipedia. This is the page:
http://en.wikipedia.org/wiki/OpenCola_(drink)
This is the shell session:
>>> f = urllib2.urlopen('http://en.wikipedia.org/wiki/OpenCola_(drink)')
Traceback (most recent call last):
File "C:\Program Files\Wing IDE 4.0\src\debug\tserver\_sandb...
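A frequent cause of urlopen failures on Wikipedia, independent of the parentheses in the URL, is that Wikipedia rejects the default "Python-urllib" User-Agent. Supplying a descriptive one usually helps; a sketch with the Python 3 names and a hypothetical agent string:

```python
# Fetch a Wikipedia page with an explicit User-Agent header.
import urllib.request

req = urllib.request.Request(
    "http://en.wikipedia.org/wiki/OpenCola_(drink)",
    headers={"User-Agent": "my-test-script/0.1"},  # hypothetical UA string
)
# urllib.request.urlopen(req).read() would fetch the page.
```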
I'm aware that urllib2 is available on Google App Engine as a wrapper of Urlfetch and, as you know, Universal Feedparser uses urllib2.
Do you know any method to set a timeout on urllib2?
Has the timeout parameter of urllib2 been ported to the Google App Engine version?
I'm not interested in method like:
rssurldata = urlfetch(rssurl, deadline=...
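Where urlopen's own timeout argument is unavailable, the classic workaround is a process-wide default via the socket module, which every subsequently created socket (including urllib2's) inherits. Whether App Engine's sandbox honours it is a separate question; the mechanism itself:

```python
# Set a process-wide default timeout for all new sockets.
import socket

socket.setdefaulttimeout(10.0)   # seconds; applies to sockets created after this
current = socket.getdefaulttimeout()
```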
Hello,
I want to access a web page with urllib2 and I keep getting an HTTP Error 401: Unauthorized.
Now, my problem is that this page doesn't need any authentication when using browsers like Firefox. Only when I use Google Chrome an authentication dialog pops up. Though this happens only after the page is fully loaded. So I can just ca...
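If the server is issuing an HTTP Basic challenge (which would explain the 401), the urllib2 idiom is a password manager plus an auth handler. A sketch with the Python 3 names and placeholder URL/credentials:

```python
# Build an opener that answers 401 Basic-auth challenges automatically.
import urllib.request

password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "http://example.com/", "user", "secret")
auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(auth_handler)
# opener.open("http://example.com/page") would retry with the stored
# credentials after receiving the 401 challenge.
```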
I am using urllib2 in Python to post login data to a web site.
After successful login, the site redirects my request to another page. Can someone provide a simple code sample on how to do this in Python with urllib2? I guess I will need cookies also to be logged in when I get redirected to another page. Right?
Thanks a lot in advance.
...
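Yes, cookies are needed: the standard recipe is an opener with an HTTPCookieProcessor, so the session cookie set at login is replayed automatically on the redirect. A sketch with the Python 3 names (urllib2/cookielib became urllib.request/http.cookiejar); the URL and form fields are placeholders:

```python
# Opener that stores cookies from the login response and resends them
# on the follow-up redirect.
import http.cookiejar
import urllib.parse
import urllib.request

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

login_data = urllib.parse.urlencode(
    {"username": "me", "password": "secret"}).encode("utf-8")
# opener.open("http://example.com/login", login_data) would log in,
# follow the redirect, and keep sending the cookies stored in `jar`.
```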
I'm using a web service backend to provide authentication to Django, and the get_user method must retain a cookie provided by the web service in order to associate with a session. Right now, I make my remote calls just by calling urllib2.urlopen(myTargetService) but this doesn't pass the cookie for the current session along.
I have crea...
Just started with python not long ago, and I'm learning to use "post" method to communicate directly with a server. A fun script I'm working on right now is to post comments on wordpress. The script does post comments on my local site, but I don't know why it raises HTTP Error 404 which means page not found. Here's my code, please help m...
Hello,
With this code, urllib2 make a GET request:
#!/usr/bin/python
import urllib2
req = urllib2.Request('http://www.google.fr')
req.add_header('User-Agent', '')
response = urllib2.urlopen(req)
With this one (which is almost the same), a POST request:
#!/usr/bin/python
import urllib2
headers = { 'User-Agent' : '' }
req = urllib2.Re...
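The difference the two snippets illustrate: the method is chosen by whether the request carries data. No data means GET; any data at all means POST. Shown with the Python 3 names:

```python
# The presence of a request body is what flips GET to POST.
import urllib.request

get_req = urllib.request.Request("http://www.google.fr",
                                 headers={"User-Agent": ""})
post_req = urllib.request.Request("http://www.google.fr",
                                  data=b"q=python",
                                  headers={"User-Agent": ""})
```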
Which library/module is best for downloading large 500 MB+ files, in terms of speed, memory, and CPU? I was also contemplating using pycurl.
...
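Whichever library is used, the key point for multi-hundred-megabyte files is streaming: read the response in fixed-size chunks instead of one giant .read(), so memory stays flat regardless of file size. The copy loop itself, demonstrated offline with in-memory streams standing in for the response and the output file:

```python
# Stream a file-like source to a file-like destination in chunks.
import io
import shutil

def download_stream(src, dst, chunk_size=64 * 1024):
    shutil.copyfileobj(src, dst, chunk_size)   # never holds more than one chunk

source = io.BytesIO(b"x" * 300_000)   # stands in for urlopen(url)
target = io.BytesIO()                 # stands in for open(path, "wb")
download_stream(source, target)
```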
Hi all.
I have a regexp and I want to add the output of the regexp to my URL.
For example:
url = 'blabla.com'
r = re.findall(r'<p>(.*?)</a>', html)
# output of r: /any_string/on/any/server/
But I don't know how to make a GET request with the regexp output:
blabla.com/any_string/on/any/server/
...
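Joining the captured path onto the base URL is what urlparse.urljoin (urllib.parse.urljoin in Python 3) is for. A sketch, with a made-up html string standing in for the page the question's findall runs over:

```python
# Capture a path with a regex, then join it onto the base URL.
import re
import urllib.parse

html = '<p>/any_string/on/any/server/</a>'        # stand-in page fragment
r = re.findall(r'<p>(.*?)</a>', html)
full_url = urllib.parse.urljoin("http://blabla.com", r[0])
# urllib.request.urlopen(full_url) would then make the GET request.
```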
Hi all,
I've been looking for a way to download an image from a URL, preform some image manipulations (resize) actions on it, and then save it to a django ImageField. Using the two great posts (linked below), I have been able to download and save an image to an ImageField. However, I've been having some trouble manipulating the file ...
Hello,
I want to grab the http status code once it raises a URLError exception:
I tried this, but it didn't help:
except URLError, e:
logger.warning( 'It seems like the server is down. Code:' + str(e.code) )
...
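The catch here: only HTTPError (a subclass of URLError) has .code; a bare URLError (DNS failure, refused connection) has only .reason, which is why str(e.code) can itself blow up. One way to handle both, shown with Python 3's urllib.error and errors constructed offline:

```python
# Distinguish an HTTP status error from a plain connection error.
import urllib.error

def describe(exc):
    if isinstance(exc, urllib.error.HTTPError):   # check the subclass first
        return "HTTP status %d" % exc.code
    return "server unreachable: %s" % exc.reason

# Offline demonstration: construct both error types directly.
http_err = urllib.error.HTTPError("http://example.com", 503,
                                  "Service Unavailable", None, None)
plain_err = urllib.error.URLError("connection refused")
```

In an except chain the same ordering applies: catch HTTPError before URLError, since every HTTPError is also a URLError.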
I am scraping a website that has a JavaScript next link that looks like this: <a href="javascript:__doPostBack('DataGrid1$ctl14$ctl02','')">2</a>. The page is written in ASPX.
Is it possible to call that, to get the information on the next page?
Here is the page, http://www.deantechnology.com/hvca/pg_search/fsn.aspx?catalog_sspid=212&am...
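It can be replayed without JavaScript: __doPostBack copies its two arguments into the hidden __EVENTTARGET and __EVENTARGUMENT fields and submits the page's form, including the hidden __VIEWSTATE field scraped from the HTML. Building that POST body looks roughly like this sketch, with a placeholder viewstate value:

```python
# Reproduce an ASP.NET __doPostBack submission as a plain form POST.
import urllib.parse

form = {
    "__EVENTTARGET": "DataGrid1$ctl14$ctl02",
    "__EVENTARGUMENT": "",
    "__VIEWSTATE": "scraped-from-the-page",  # must come from the real HTML
}
body = urllib.parse.urlencode(form)
# urllib.request.urlopen(page_url, body.encode("utf-8")) would fetch page 2.
```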
We have two applications that are both running on Google App Engine. App1 makes requests to app2 as an authenticated user. The authentication works by requesting an authentication token from Google ClientLogin that is exchanged for a cookie. The cookie is then used for subsequent requests (as described here). App1 runs the following code...