questions about urllib | ansaurus

urllib

Python standard library to POST multipart/form-data encoded data

Hello, I would like to POST multipart/form-data encoded data. I have found an external module that does it: http://atlee.ca/software/poster/index.html. However, I would rather avoid this dependency. Is there a way to do this using the standard libraries? Thanks ...
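
One possible stdlib-only approach is to build the multipart body by hand. The sketch below assumes Python 3 (`urllib.request` rather than the Python 2 `urllib`/`urllib2` modules); `encode_multipart` and `build_upload_request` are hypothetical helper names, and field or file names containing quotes are not escaped.

```python
import uuid
import urllib.request


def encode_multipart(fields, files):
    """Return (body, content_type) for a multipart/form-data request.

    fields: dict of name -> str value
    files:  dict of name -> (filename, bytes content, MIME type)
    Hypothetical helper; names are assumed not to contain quotes.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields.items():
        parts += [
            delimiter,
            'Content-Disposition: form-data; name="{}"'.format(name).encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content, mime_type) in files.items():
        parts += [
            delimiter,
            'Content-Disposition: form-data; name="{}"; filename="{}"'.format(
                name, filename
            ).encode("utf-8"),
            "Content-Type: {}".format(mime_type).encode("ascii"),
            b"",
            content,
        ]
    # Closing delimiter, followed by a final CRLF.
    parts += [delimiter + b"--", b""]
    return b"\r\n".join(parts), "multipart/form-data; boundary=" + boundary


def build_upload_request(url, fields, files):
    """Build (but do not send) a POST request carrying the multipart body."""
    body, content_type = encode_multipart(fields, files)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be tried without a server.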

Relevant query: how to fetch a public key from a public key server

import urllib response = urllib.urlopen('http://pool.sks-keyservers.net/') print 'RESPONSE:', response print 'URL :', response.geturl() headers = response.info() print 'DATE :', headers['date'] print 'HEADERS :' print '---------' print headers data = response.read() print 'LENGTH :', len(data) print 'DATA :' print '--------...

How to catch 404 error in urllib.urlretrieve

Background: I am using urllib.urlretrieve, as opposed to any other function in the urllib* modules, because of the hook function support (see reporthook below) .. which is used to display a textual progress bar. This is Python >=2.6. >>> urllib.urlretrieve(url[, filename[, reporthook[, data]]]) However, urlretrieve is so dumb that it ...
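
One way to keep a progress hook and still see HTTP errors is to download through `urlopen`, which raises on a 404. This is only a sketch and assumes Python 3, where the call is `urllib.request.urlopen` and the exception is `urllib.error.HTTPError`; `download_with_progress` is a hypothetical name, and the hook receives the same three arguments as a `urlretrieve` reporthook (block count, block size, total size).

```python
import urllib.request


def download_with_progress(url, filename, reporthook=None, block_size=8192):
    """Save url to filename, calling reporthook(blocks, block_size, total).

    urlopen raises urllib.error.HTTPError for statuses such as 404,
    so the caller can catch it instead of saving an error page.
    """
    with urllib.request.urlopen(url) as response, open(filename, "wb") as out:
        # -1 signals an unknown size, as urlretrieve does.
        total = int(response.headers.get("Content-Length", -1))
        blocks = 0
        if reporthook:
            reporthook(blocks, block_size, total)
        while True:
            chunk = response.read(block_size)
            if not chunk:
                break
            out.write(chunk)
            blocks += 1
            if reporthook:
                reporthook(blocks, block_size, total)
    return filename
```

A caller would wrap the call in `try: ... except urllib.error.HTTPError as err:` and inspect `err.code`.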

Python urllib.urlopen() call doesn't work with a URL that a browser accepts

If I point Firefox at http://bitbucket.org/tortoisehg/stable/wiki/Home/ReleaseNotes, I get a page of HTML. But if I try this in Python: import urllib site = 'http://bitbucket.org/tortoisehg/stable/wiki/Home/ReleaseNotes' req = urllib.urlopen(site) text = req.read() I get the following: 500 Internal Server Error The server encounter...

Django: add image in an ImageField from image url

Hi, please excuse my poor English ;-) Imagine this very simple model: class Photo(models.Model): image = models.ImageField('Label', upload_to='path/') I would like to create a Photo from an image URL (i.e., not by hand in the Django admin site). I think that I need to do something like this: from myapp.models import Ph...

django-imagefield

Python urllib, minidom and parsing international characters

When I try to retrieve information from the Google Weather API with the following URL, http://www.google.com/ig/api?weather=Munich,Germany&hl=de, and then try to parse it with minidom, I get an error that the document is not well formed. I use the following code: sock = urllib.urlopen(url) # above mentioned url doc = minidom.parse(sock) I t...

internationalization
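
For the minidom question above, a common cause of a "not well formed" error is a response whose bytes are not UTF-8 and which carries no XML encoding declaration. The excerpt does not confirm that this is the cause here, so the sketch below is an assumption: it decodes the bytes with an explicitly supplied charset before parsing. It uses Python 3, and `parse_xml_with_charset` is a hypothetical name.

```python
from xml.dom import minidom


def parse_xml_with_charset(raw_bytes, charset):
    """Decode raw_bytes with the given charset, then parse the text.

    The charset would normally come from the HTTP Content-Type header,
    e.g. response.headers.get_content_charset() in Python 3.
    """
    text = raw_bytes.decode(charset)
    return minidom.parseString(text)


# Offline illustration with ISO-8859-1 bytes (an assumed encoding).
sample = '<weather city="München"/>'.encode("iso-8859-1")
document = parse_xml_with_charset(sample, "iso-8859-1")
city = document.documentElement.getAttribute("city")
```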

Why do I get urllib2.HTTPError with urllib2 and no errors with urllib?

Hi, I have the following simple code: import urllib2 import sys sys.path.append('../BeautifulSoup/BeautifulSoup-3.1.0.1') from BeautifulSoup import * page='http://en.wikipedia.org/wiki/Main_Page' c=urllib2.urlopen(page) This code generates the following error messages: c=urllib2.urlopen(page) File "/usr/lib64/python2.4/urllib2....

How to download any(!) webpage with correct charset in python?

Problem: When screen-scraping a webpage using Python one has to know the character encoding of the page. If you get the character encoding wrong then your output will be messed up. People usually use some rudimentary technique to detect the encoding. They either use the charset from the header or the charset defined in the meta tag or t...

character-encoding

screen-scraping
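
The two rudimentary techniques the charset question mentions (the header charset, then the meta tag) could be sketched roughly as follows. This is an illustration of that baseline, not a robust detector; `detect_charset` and its UTF-8 fallback are assumptions.

```python
import re

# Matches both <meta charset="..."> and the http-equiv Content-Type form.
META_CHARSET = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.IGNORECASE)
HEADER_CHARSET = re.compile(r'charset=["\']?([\w-]+)', re.IGNORECASE)


def detect_charset(content_type, body, default="utf-8"):
    """Guess a page's charset from its Content-Type header, then its meta tag."""
    # 1. charset parameter of the HTTP Content-Type header, if any.
    if content_type:
        match = HEADER_CHARSET.search(content_type)
        if match:
            return match.group(1).lower()
    # 2. A meta tag near the top of the document.
    match = META_CHARSET.search(body[:4096])
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. Fall back to an assumed default.
    return default
```

The page bytes would then be decoded with `body.decode(detect_charset(header, body), errors="replace")`.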

should I call close() after urllib.urlopen()?

New to Python and reading someone else's code: should urllib.urlopen() be followed by urllib.close()? Otherwise, one would leak connections, correct? ...
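
Note that `close()` is a method of the object returned by `urlopen()`, not a function of the `urllib` module. A minimal sketch of closing it reliably, assuming Python 3, where the response works as a context manager (in Python 2, `contextlib.closing(urllib.urlopen(url))` gives the same effect); `read_url` is a hypothetical name.

```python
import urllib.request


def read_url(url):
    """Return the body of url, closing the response even if read() fails."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```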

How can I bind secondary IP address to urllib2

My server comes with one primary and five additional IP addresses. By default, urllib2 requests all come from the primary IP address. How can I bind those secondary IP addresses to urllib2 for every request it makes? ...
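
One possible approach, sketched for Python 3 rather than urllib2, is a handler that passes `source_address` through to `http.client.HTTPConnection`. `SourceAddressHTTPHandler` and `opener_for` are hypothetical names, the sketch covers plain HTTP only (HTTPS would need an analogous handler), and the address in the usage note is a documentation placeholder.

```python
import http.client
import urllib.request


class SourceAddressHTTPHandler(urllib.request.HTTPHandler):
    """HTTP handler that binds outgoing connections to a local IP address."""

    def __init__(self, source_ip):
        super().__init__()
        self.source_ip = source_ip

    def http_open(self, req):
        # Extra keyword arguments to do_open are forwarded to HTTPConnection;
        # port 0 lets the operating system pick the local port.
        return self.do_open(
            http.client.HTTPConnection, req, source_address=(self.source_ip, 0)
        )


def opener_for(source_ip):
    """Return an opener whose HTTP requests originate from source_ip."""
    return urllib.request.build_opener(SourceAddressHTTPHandler(source_ip))
```

Usage would look like `opener_for("192.0.2.10").open(url)`, with one opener per secondary address.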

Trace/BPT trap when calling urllib.urlopen

For some reason I'm getting a Trace/BPT trap error when calling urllib.urlopen. I've tried both urllib and urllib2 with identical results. Here is the code which throws the error: def get_url(url): from urllib2 import urlopen if not url or not url.startswith('http://'): return None return urlopen(url).read() # FIXME! I sho...

urlretrieve returns an empty file

I'm trying to use urlretrieve to download files from URLs that take the form http://example.com/download.php?id=6456&name=foo, yet for some reason I just get an empty response. I've tried the method suggested in this question, but it didn't seem to help because remotefile.info() doesn't contain the key 'content-disposition', only ['...

Python MultiThreading With Urllib2 Issue

I can download multiple files quite fast with many threads at once but the problem is that after a few minutes it tends to slow down gradually to almost a full stop, I have no idea why. There's nothing wrong with my code that I can see and my RAM/CPU is fine.. The only thing I can think of is that urllib2 isn't handling the massive amoun...

how to open a URL with non utf-8 arguments

Hello, Using Python I need to transfer non utf-8 encoded data (specifically shift-jis) to a URL via the query string. How should I transfer the data? Quote it? Encode in utf-8? Thanks ...
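
A sketch of one option: percent-encode the Shift-JIS bytes, which keeps the query string pure ASCII. It assumes Python 3's `urllib.parse.urlencode`, which accepts an `encoding` argument, and that the receiving server expects Shift-JIS; `build_query` is a hypothetical name.

```python
from urllib.parse import urlencode, unquote_to_bytes


def build_query(params, encoding="shift_jis"):
    """Percent-encode params after encoding each value with the given charset."""
    return urlencode(params, encoding=encoding)


query = build_query({"q": "テスト"})
# Round trip: the percent-escapes decode back to the original Shift-JIS bytes.
round_trip = unquote_to_bytes(query.split("=", 1)[1]).decode("shift_jis")
```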

TypeError: cannot concatenate 'str' and 'instance' objects (python urllib)

Writing a python program, and I came up with this error while using the urllib.urlopen function. Traceback (most recent call last): File "ChurchScraper.py", line 58, in <module> html = GetAllChurchPages() File "ChurchScraper.py", line 48, in GetAllChurchPages CPs = CPs + urllib.urlopen(url) TypeError: cannot concatenate 'str' and 'insta...
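
The traceback shows the response object itself being added to a string; `urlopen()` returns a file-like object, so its contents have to be read first. A minimal sketch, assuming Python 3 (where `read()` returns bytes that need decoding); `fetch_text`, `fetch_all`, and the UTF-8 default are assumptions.

```python
import urllib.request


def fetch_text(url, encoding="utf-8"):
    """Return the body of url as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode(encoding)


def fetch_all(urls):
    """Concatenate the text of every page, reading each response first."""
    pages = ""
    for url in urls:
        pages += fetch_text(url)
    return pages
```

In the Python 2 code from the traceback, the equivalent change would be `CPs = CPs + urllib.urlopen(url).read()`.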

parse.unquote_plus TypeError

I'm trying to format a file so that it can be inserted into a database. The file is originally compressed and around 1.3MB big. Each line looks something like this: 398,%7EAnoniem+001%7E,543,480,7525010,1775,0 This is what the code that parses this file looks like: Village = gzip.open(Root+'\\data'+'\\' +str(Newest_Date[0])+'\\...
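
The excerpt cuts off before the error, so the cause is a guess: `gzip.open` in its default binary mode yields bytes, while `urllib.parse.unquote_plus` expects a str. Under that assumption, decoding each line first avoids the TypeError. `parse_line` and the UTF-8 choice are hypothetical.

```python
from urllib.parse import unquote_plus


def parse_line(raw_line, encoding="utf-8"):
    """Split one comma-separated line and percent-decode every field."""
    if isinstance(raw_line, bytes):
        # Lines read from gzip.open(path) in binary mode arrive as bytes.
        raw_line = raw_line.decode(encoding)
    return [unquote_plus(field) for field in raw_line.strip().split(",")]


fields = parse_line(b"398,%7EAnoniem+001%7E,543,480,7525010,1775,0\n")
```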

How to percent-encode url parameters in python?

If I do url = "http://example.com?p=" + urllib.quote(query): it doesn't encode "/" to "%2F" (which breaks OAuth normalization), and it doesn't handle unicode (it throws an exception). Is there a better library? ...
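
Both points can be handled by `quote` itself: passing `safe=""` makes it encode "/" as well. The sketch assumes Python 3, where `urllib.parse.quote` encodes str input as UTF-8 by default; in Python 2 the rough equivalent would be `urllib.quote(query.encode('utf-8'), safe='')`. `encode_param` is a hypothetical name.

```python
from urllib.parse import quote


def encode_param(value):
    """Percent-encode value, including '/', using UTF-8 for non-ASCII text."""
    return quote(value, safe="")


url = "http://example.com?p=" + encode_param("a/b c")
```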

In Python, how do I use urllib to see if a website is 404 or 200?

How to get the code of the headers through urllib? ...

http-status-codes
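
A minimal sketch for the status-code question above, assuming Python 3: a successful response exposes `status`, while error statuses such as 404 arrive as `urllib.error.HTTPError` with a `code` attribute. `http_status` is a hypothetical name, and its `opener` parameter exists only so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url (for example 200 or 404)."""
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; the code is still available here.
        return err.code
```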

How to ignore "Enter username for Private Proxy Access" prompt?

Hello, I'm using urllib.urlopen with some HTTP proxies and sometimes (probably when they require authorization) I get the following prompt printed to the console: Enter username for Private Proxy Access (country) at xxx.xxx.xxx.xxx:xxxx How can I raise an exception when this happens? Here's the example: from urllib import u...

Python won't refresh URL to receive new forex ticker data

Hello, I am trying to save updated Forex ticker data from this website: http://forex.offers4u.biz/TickDBReadDB.php?p=EURUSD Just hit refresh to update the ticker. When I use my little Python script, it saves the text once, but if I run it again, it makes a new file with the same old data. How can I add a "cachebreaker" so that python...
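
A "cachebreaker" is usually just a throwaway query parameter whose value changes on every request. The sketch below appends a millisecond timestamp; it assumes that the stale data really comes from a cache and that the server ignores the extra parameter, neither of which the excerpt confirms. `add_cache_buster` and the `_` parameter name are hypothetical.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="_"):
    """Return url with an extra query parameter set to the current time in ms."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(int(time.time() * 1000))))
    return urlunsplit(parts._replace(query=urlencode(query)))


fresh_url = add_cache_buster("http://forex.offers4u.biz/TickDBReadDB.php?p=EURUSD")
```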
