I wrote a scraper using Python a while back, and it worked fine on the command line. I have now made a GUI for the application, but I am having trouble with one issue. When I attempt to update text inside the GUI (e.g. 'fetching URL 12/50'), I am unable to, since the function within the scraper is busy grabbing 100+ links. Also when going ...
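One common pattern is to run the scraper in a background thread and have the GUI poll a queue for progress messages, since long work on the GUI thread blocks redraws. A minimal Python 2 sketch, assuming Tkinter; the scrape function and URL list are placeholders:

import threading, Queue, Tkinter

progress = Queue.Queue()

def scrape(urls):
    # stand-in for the real scraping loop
    for i, url in enumerate(urls, 1):
        # ... fetch and parse url here ...
        progress.put('fetching URL %d/%d' % (i, len(urls)))

root = Tkinter.Tk()
label = Tkinter.Label(root, text='idle')
label.pack()

def poll():
    while not progress.empty():        # drain pending progress messages
        label.config(text=progress.get())
    root.after(100, poll)              # check again in 100 ms

threading.Thread(target=scrape, args=(['http://www.google.com'] * 5,)).start()
poll()
root.mainloop()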
Hi Guys,
I've written a Python application that makes web requests using the urllib2 library and then scrapes the data. I could deploy this as a web application, which would mean all urllib2 requests go through my web server. This raises the danger of the server's IP being banned due to the high number of web requests made on behalf of many users. ...
Here's my code:
from urllib.request import urlopen

# urlopen was imported directly, so call it without a module prefix
response = urlopen("http://www.google.com")
html = response.read()
print(html)
Any help?
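One note on the snippet above: in Python 3, read() returns bytes, so the page usually wants decoding before any text processing; a one-line sketch, assuming the page is UTF-8:

text = html.decode('utf-8')  # assumption: the server sends UTF-8
print(text)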
...
How do I download a file with a progress report using Python, but without supplying a filename?
I have tried urllib.urlretrieve, but I seem to have to supply a filename for the downloaded file to save as.
So for example:
I don't want to supply this:
urllib.urlretrieve("http://www.mozilla.com/products/download.html?product=firefox-3.6.3&a...
I'm building an "API API": it's basically a wrapper for an in-house REST web service that the web app will be making a lot of requests to.
Some of the web service calls need to be GET rather than POST, but still pass parameters.
Is there a "best practice" way to encode a dictionary into a query string, e.g. ?foo=bar&bla=blah?
I'm looking ...
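The standard library covers this directly with urlencode; a minimal sketch (Python 2 module path shown):

import urllib

params = {'foo': 'bar', 'bla': 'blah'}
query = urllib.urlencode(params)   # e.g. 'foo=bar&bla=blah' (dict order is not guaranteed)
url = 'http://example.com/endpoint?' + query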
I tried running this:
>>> urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
But it gives an error like this; can anyone suggest a solution?
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl')
File "C:\Python26\lib\urllib2.py"...
Hi,
I'm trying to deploy a WAR to an Apache Tomcat server (build 6.0.24) using Python (2.4.2) as part of a build process.
I'm using the following code:
import urllib2
import base64
war_file_contents = open('war_file.war','rb').read()
username='some_user'
password='some_pwd'
base64string = base64.encodestring('%s:%s' % (username, pas...
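For reference, the Tomcat 6 manager accepts a WAR via an authenticated HTTP PUT to its deploy URL; a minimal urllib2 sketch, where the host, port, and context path are assumptions:

import urllib2, base64

war_data = open('war_file.war', 'rb').read()
auth = base64.encodestring('%s:%s' % ('some_user', 'some_pwd')).strip()

request = urllib2.Request('http://localhost:8080/manager/deploy?path=/myapp',
                          data=war_data)
request.add_header('Authorization', 'Basic %s' % auth)
request.add_header('Content-Type', 'application/octet-stream')
request.get_method = lambda: 'PUT'  # urllib2 defaults to POST when data is set

print urllib2.urlopen(request).read()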
I have a very simple application running on App Engine that requests a web page every five minutes and parses it for a specific piece of data.
Everything works fine except that the response I get back from the external request (using urllib2) doesn't reflect the latest changes to the page. Sometimes it takes a few minutes to get the latest,...
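Stale responses like this are often served by an intermediate cache rather than the origin server; one common workaround is to send cache-bypassing headers with the request. A sketch (the URL is a placeholder, and whether the cache honours these headers is not guaranteed):

import urllib2

request = urllib2.Request('http://example.com/page',
                          headers={'Cache-Control': 'no-cache',
                                   'Pragma': 'no-cache'})
content = urllib2.urlopen(request).read()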
I want a URL like example.com/page.html to become something like
example.com/a$xDzf9D84qGBOeXkXNstw%3D%3D106
...
Thanks in advance for the help. I am puzzled that the same code works in Python 2.6 but not in 2.5. Here is the code:
import cgi, urllib, urlparse, urllib2
url='https://graph.facebook.com'
req=urllib2.Request(url=url)
p=urllib2.urlopen(req)
response = cgi.parse_qs(p.read())
And here is the exception I got
Traceback (most recent call l...
Hello all,
I'm writing a simple Python POST script, but it is not working well.
There are two parts to the login.
The first login uses 'http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp',
and the second login uses 'http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp'.
I can log in on the first login page, but I couldn't log in...
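A frequent cause of second-step login failures is that the session cookie from the first POST never makes it to the second; sharing one cookie-aware opener across both requests avoids that. A minimal sketch, where the form field names are guesses:

import cookielib, urllib, urllib2

cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

# first login: any session cookie set here stays in the jar
first = urllib.urlencode({'id': 'my_id', 'pw': 'my_pw'})   # field names are guesses
opener.open('http://mybuddy.buddybuddy.co.kr/userinfo/UserInfo.asp', first)

# second login: the same opener sends the stored cookie along
second = urllib.urlencode({'pw': 'my_pw'})                 # field names are guesses
response = opener.open('http://user.buddybuddy.co.kr/usercheck/UserCheckPWExec.asp', second)
print response.read()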
Here is my code. I cannot get any HTTP proxy to work, though a SOCKS proxy (socks4/5) works fine. Any ideas why? urllib2 works fine with proxies, though. I am confused. Thanks.
Code:
import socks
import httplib2
import BeautifulSoup

httplib2.debuglevel = 4

http = httplib2.Http(proxy_info = httplib2.ProxyInfo(...
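For comparison, an HTTP proxy is selected the same way as a SOCKS one, just with a different proxy type constant; a minimal sketch where the host and port are placeholders:

import socks
import httplib2

proxy = httplib2.ProxyInfo(socks.PROXY_TYPE_HTTP, 'proxy.example.com', 8080)
http = httplib2.Http(proxy_info=proxy)
response, content = http.request('http://www.google.com', 'GET')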
Hello,
I wrote a script that works with a proxy (Python 2.6.x):
proxy_support = urllib2.ProxyHandler({'http' : 'http://127.0.0.1:80'})
But in Python 3.1.x there is no urllib2, just urllib... and that doesn't seem to support the ProxyHandler.
How can I use a proxy with urllib? Isn't Python 3 newer than Python 2? Why did they remove urllib2 in a newer vers...
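urllib2 was not removed so much as merged: in Python 3 its contents live in urllib.request and urllib.error, and ProxyHandler is still there. A minimal sketch:

from urllib.request import ProxyHandler, build_opener

proxy_support = ProxyHandler({'http': 'http://127.0.0.1:80'})
opener = build_opener(proxy_support)
print(opener.open('http://www.google.com').read())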
So, simply, I want to be able to run a for loop across a list of URLs; if one fails, I want to continue on and try the next.
I've tried the following, but sadly it throws an exception if the first URL doesn't work.
servers = ('http://www.google.com', 'http://www.stackoverflow.com')
for server in servers:
    try:
        u = urllib2...
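For reference, catching URLError inside the loop lets the remaining URLs still be tried; a minimal sketch for Python 2.6+:

import urllib2

servers = ('http://www.google.com', 'http://www.stackoverflow.com')
for server in servers:
    try:
        u = urllib2.urlopen(server)
        print server, u.getcode()
    except urllib2.URLError as e:
        print server, 'failed:', e
        continue   # move on to the next URL instead of aborting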
This works fine:
import urllib2
opener = urllib2.build_opener(
    urllib2.HTTPHandler(),
    urllib2.HTTPSHandler(),
    urllib2.ProxyHandler({'http': 'http://user:pass@proxy:3128'}))
urllib2.install_opener(opener)
print urllib2.urlopen('http://www.google.com').read()
But if I change http to https:
...
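ProxyHandler keys its dictionary by URL scheme, so https requests only go through the proxy if an 'https' entry is registered as well; a sketch, assuming the same proxy serves both schemes and a Python 2 build whose HTTPSHandler supports proxy tunnelling:

import urllib2

opener = urllib2.build_opener(
    urllib2.HTTPHandler(),
    urllib2.HTTPSHandler(),
    urllib2.ProxyHandler({'http': 'http://user:pass@proxy:3128',
                          'https': 'http://user:pass@proxy:3128'}))
urllib2.install_opener(opener)
print urllib2.urlopen('https://www.google.com').read()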
I am trying to write a function to post form data and save the returned cookie info in a file, so that the next time the page is visited the cookie information is sent to the server (i.e. normal browser behavior).
I wrote this relatively easily in C++ using curlib, but have spent almost an entire day trying to write it in Python, using ur...
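One way to get browser-like persistence is a file-backed cookie jar that is loaded before the request and saved afterwards; a minimal sketch, where the URL and form fields are placeholders:

import cookielib, urllib, urllib2

cj = cookielib.MozillaCookieJar('cookies.txt')
try:
    cj.load(ignore_discard=True)   # reuse cookies from an earlier run
except IOError:
    pass                           # first run: no cookie file yet

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
data = urllib.urlencode({'user': 'me', 'password': 'secret'})  # placeholder fields
opener.open('http://example.com/login', data)                  # placeholder URL
cj.save(ignore_discard=True)       # keep session cookies for the next visit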
What am I trying to do?
Visit a site, retrieve a cookie, then visit the next page sending the cookie info along. It all works, but httplib2 is giving me one too many problems with a SOCKS proxy on one site.
http = httplib2.Http()
main_url = 'http://mywebsite.com/get.aspx?id='+ id +'&rows=25'
response, content = http.request(main_url, 'GET', hea...
Hi all, I was wondering: when I use urllib2.urlopen(), does it just read the headers or does it actually bring back the entire webpage?
I.e., does the HTML page actually get fetched on the urlopen call or on the read() call?
handle = urllib2.urlopen(url)
html = handle.read()
The reason I ask is for this workflow...
I have a list of urls (some...
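As far as I know, urlopen() returns once the status line and response headers have arrived; the body itself is pulled off the socket when read() is called. A small demonstration:

import urllib2

handle = urllib2.urlopen('http://www.google.com')
print handle.info()    # headers are already available here
html = handle.read()   # the body is transferred here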
I'm using urllib2's urlopen function to try to get a JSON result from the StackOverflow API.
The code I'm using:
>>> import urllib2
>>> conn = urllib2.urlopen("http://api.stackoverflow.com/0.8/users/")
>>> conn.readline()
The result I'm getting:
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x04\x00\xed\xbd\x07`\x1cI\x96%&/m\xca{\x7fJ\...
I'm...
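The '\x1f\x8b' prefix is the gzip magic number, so the API is returning a gzip-compressed body; a minimal sketch of decompressing it:

import gzip, StringIO, urllib2

conn = urllib2.urlopen('http://api.stackoverflow.com/0.8/users/')
compressed = conn.read()
data = gzip.GzipFile(fileobj=StringIO.StringIO(compressed)).read()
print data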
Hi,
I'm using the cookielib module to handle HTTP cookies when using the urllib2 module in Python 2.6 in a way similar to this snippet:
import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")
I'd like to store the cookies in a database....
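Since a CookieJar is iterable and yields Cookie objects, their fields can be written straight to a table; a minimal sketch continuing from the snippet above (the schema is an assumption):

import sqlite3

db = sqlite3.connect('cookies.db')
db.execute('CREATE TABLE IF NOT EXISTS cookies '
           '(name TEXT, value TEXT, domain TEXT, path TEXT, expires INTEGER)')
for c in cj:   # cj is the CookieJar from the snippet above
    db.execute('INSERT INTO cookies VALUES (?, ?, ?, ?, ?)',
               (c.name, c.value, c.domain, c.path, c.expires))
db.commit()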