views:

585

answers:

3

From Python, I would like to retrieve content from a web site via HTTPS with basic authentication. I need the content on disk. I am on an intranet, trusting the HTTPS server. Platform is Python 2.6.2 on Windows.

I have been playing around with urllib2, however did not succeed so far.

I have a solution running, calling wget via os.system():

wget_cmd = r'\path\to\wget.exe -q -e "https_proxy = http://fqdn.to.proxy:port" --no-check-certificate --http-user="username" --http-password="password" -O path\to\output https://fqdn.to.site/content'

I would like to get rid of the os.system(). Is that possible in Python?

+2  A: 

Try this (notice that you'll have to fill in the realm of your server also):

import urllib2
authinfo = urllib2.HTTPBasicAuthHandler()
authinfo.add_password(realm='Fill In Realm Here',
                      uri='https://fqdn.to.site/content',
                      user='username',
                      passwd='password')
proxy_support = urllib2.ProxyHandler({"https" : "http://fqdn.to.proxy:port"})
opener = urllib2.build_opener(proxy_support, authinfo)
fp = opener.open("https://fqdn.to.site/content")
open(r"path\to\output", "wb").write(fp.read())
Martin v. Löwis
+2  A: 

Proxy and https wasn't working for a long time with urllib2. It will be fixed in the next released version of python 2.6 (v2.6.3).

In the meantime you can reimplement the correct support, that's what we did for mercurial: http://hg.intevation.org/mercurial/crew/rev/59acb9c7d90f

tonfa
A: 

You could try this too: http://code.google.com/p/python-httpclient/

(It also supports the verification of the server certificate.)

Bruno