I have a web page containing a login form which loads via HTTP, but it submits the data via HTTPS.

I'm using python-mechanize to log into this site, but it seems that the data is submitted via HTTP.

My code looks like this:

import mechanize
b = mechanize.Browser()
b.open('http://site.com')
form = b.forms().next()  # the login form is unnamed...
print form.action        # prints "https://login.us.site.com"
form['user'] = "guest"
form['pass'] = "guest"
b.form = form
b.submit()

When the form is submitted, the connection is made via HTTP and contains something like:

send: 'POST https://login.us.site.com/ HTTP/1.1\r\nAccept-Encoding: identity\r\nContent-Length: 180\r\nHost: login.us.site.com\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\n'...

Can anyone confirm this and, if possible, post a solution so that the form is submitted via HTTPS?

Later edit:

1) I'm using an HTTP proxy for HTTP/HTTPS traffic (set in the environment on a Linux machine).
2) I've watched the traffic with Wireshark and can confirm that it is sent as plain HTTP: I can see the content of the POST, and mechanize doesn't send the same requests to the proxy as a web browser does. The browser sends CONNECT login.us.site.com:443, while mechanize only POSTs https://login.us.site.com. However, I don't know what happens to the data after it leaves the proxy; perhaps the proxy establishes an SSL connection to the target site?
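
To make the difference in point 2 concrete, here is a minimal sketch of the first request line a client sends to the proxy in each case. The helper name proxy_request_line and its tunnel flag are made up for illustration; they are not part of mechanize or urllib2, and the sketch only builds strings, it does not touch the network.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2


def proxy_request_line(method, url, tunnel):
    """Build the first line a client sends to an HTTP proxy (hypothetical helper).

    tunnel=True  -> ask the proxy for a raw tunnel (what the browser does);
                    the TLS handshake and the real request follow inside it.
    tunnel=False -> hand the proxy the absolute URL in clear text
                    (what the mechanize trace above shows).
    """
    parts = urlsplit(url)
    if tunnel:
        # Default to the scheme's well-known port when the URL has none.
        port = parts.port or (443 if parts.scheme == 'https' else 80)
        return 'CONNECT %s:%d HTTP/1.1' % (parts.hostname, port)
    return '%s %s HTTP/1.1' % (method, url)


browser_style = proxy_request_line('POST', 'https://login.us.site.com/', tunnel=True)
mechanize_style = proxy_request_line('POST', 'https://login.us.site.com/', tunnel=False)
```

In the tunneled case the proxy only ever sees the host and port. In the second case the proxy receives the headers and the form body unencrypted, which matches what Wireshark shows.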

+1  A: 

mechanize uses urllib2 internally, and the latter had a bug: "HTTPS over (Squid) Proxy fails". The bug is fixed in Python 2.6.3, so updating Python should solve your problem.

Denis Otkidach
While that bug does look related, I doubt it's urllib2's fault: first, because I run Python 2.6.4 (up to date in my Ubuntu distribution), and second, because I wrote a test program requesting https://www.paypal.com, which does send CONNECT through the proxy. So it seems to be an issue with mechanize.
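
For anyone who wants to repeat that check, a test along these lines might look like the sketch below. This is not the commenter's actual program. The function name build_debug_opener and the proxy address in the usage note are assumptions; the import fallback is only there so the same code works with urllib2 on Python 2 and urllib.request on Python 3.

```python
try:
    import urllib.request as urlreq  # Python 3
except ImportError:
    import urllib2 as urlreq  # Python 2


def build_debug_opener(proxy_url):
    """Return an opener that sends HTTPS requests through proxy_url.

    Hypothetical helper: debuglevel=1 makes the underlying connection
    print what it sends, so the request made to the proxy is visible.
    """
    proxy = urlreq.ProxyHandler({'https': proxy_url})
    https = urlreq.HTTPSHandler(debuglevel=1)
    return urlreq.build_opener(proxy, https)
```

Usage, with a placeholder proxy address: build_debug_opener('http://proxy.example:3128').open('https://www.paypal.com'). If the library tunnels correctly, the debug output should start with a CONNECT www.paypal.com:443 line and not with a GET of the full https:// URL.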