I am relatively new (as in a few days) to Python. I am looking for an example that shows how to post a form to a website (say www.example.com).

I already know how to use Curl. In fact, I have written C++ code that does exactly the same thing (i.e. POST a form using Curl), but I would like a starting point (a few lines that I can build on) showing how to do this in Python.

A: 
curl -d "birthyear=1990&press=AUD" www.site.com/register/user.php

http://curl.haxx.se/docs/httpscripting.html

Gerard Banasig
Woot! Is it really as simple as that?
morpheous
+1  A: 

Here are examples using urllib and urllib2 (Python 2) for both POST and GET:

POST - If urlopen() is given a second argument (the encoded form data), it sends a POST request.

import urllib
import urllib2

url = 'http://www.example.com'
values = {'var' : 500}

data = urllib.urlencode(values)
response = urllib2.urlopen(url, data)
page = response.read()
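
For readers on Python 3, where urllib2 no longer exists (it was split into urllib.request and urllib.error, and urlencode() moved to urllib.parse), the POST above might look roughly like the sketch below. The helper names build_post_request and post_form are made up for illustration, and the timeout value is an arbitrary assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_post_request(url, values):
    """Return a Request that urlopen() will send as a POST (hypothetical helper)."""
    # In Python 3 the request body must be bytes, so encode the query string.
    data = urlencode(values).encode("ascii")
    # Supplying data makes the request a POST, just as in the urllib2 version.
    return Request(url, data=data)


def post_form(url, values, timeout=10):
    """Send the form and return the raw response body as bytes (hypothetical helper)."""
    with urlopen(build_post_request(url, values), timeout=timeout) as response:
        return response.read()


# Usage (performs a real network request):
# page = post_form('http://www.example.com', {'var': 500})
```

Splitting the request construction from the sending step is only a convenience here: it lets you inspect the method and body without touching the network.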

GET - If urlopen() is given only the URL, it sends a GET request.

import urllib
import urllib2

url = 'http://www.example.com'
values = {'var' : 500}

data = urllib.urlencode(values)
fullurl = url + '?' + data
response = urllib2.urlopen(fullurl)
page = response.read()
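
A rough Python 3 counterpart of the GET example, again only as a sketch: build_get_url and fetch are hypothetical helper names, and the timeout is an assumption. The idea is unchanged: the encoded values go into the query string and urlopen() receives nothing but the URL.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_get_url(url, values):
    """Append the URL-encoded values as a query string (hypothetical helper)."""
    return url + '?' + urlencode(values)


def fetch(url, values, timeout=10):
    """Issue a GET request and return the raw response body as bytes."""
    # No data argument is passed, so urlopen() sends a GET.
    with urlopen(build_get_url(url, values), timeout=timeout) as response:
        return response.read()


# Usage (performs a real network request):
# page = fetch('http://www.example.com', {'var': 500})
```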

You could also use curl if you call it using os.system().
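
If you do want to shell out to curl, note that os.system() only gives you the exit status, not the page. The sketch below swaps in the subprocess module instead so the output can be captured. It assumes Python 3.7 or later and a curl binary on your PATH; build_curl_command and post_with_curl are hypothetical names.

```python
import subprocess
from urllib.parse import urlencode


def build_curl_command(url, values):
    """Build the argument list for a curl form POST (hypothetical helper)."""
    # Passing a list avoids shell quoting problems with '&' in the form data.
    return ["curl", "--silent", "--data", urlencode(values), url]


def post_with_curl(url, values):
    """Run curl and return whatever it wrote to stdout, as bytes."""
    # check=True raises CalledProcessError if curl exits with a non-zero status.
    result = subprocess.run(
        build_curl_command(url, values), capture_output=True, check=True
    )
    return result.stdout


# Usage (requires curl to be installed; performs a real network request):
# page = post_with_curl('http://www.example.com', {'var': 500})
```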

Here are some helpful links:
http://docs.python.org/library/urllib2.html#urllib2.urlopen
http://docs.python.org/library/os.html#os.system

tdedecko
@tdedecko: +1 for the snippet. Ah, so I do not necessarily need curl. I have two questions, though. 1) How do you specify the HTTP method, i.e. POST instead of GET? 2) Presumably, this is similar to the code you would use to 'fetch' a page from a URL into memory (say, before parsing it)?
morpheous
1) I edited the post to include an example of both POST and GET. 2) The response returned from `urlopen()` is a file-like object holding the content returned from the server. You can then parse this content using your favorite parser (e.g. BeautifulSoup) or your own methods. Hope this helps.
tdedecko
A: 

There are two major Python packages for automating web interactions:

  • Mechanize
  • Twill

    Twill has apparently not been updated for a couple of years; it seems to have been at version 0.9 since December 2007. Mechanize shows a changelog entry and a release from just a few days ago: version 0.2.1 on 2010-05-16.

    Of course, you'll find examples on their respective web pages. Twill essentially provides a simple shell-like interpreter, while Mechanize provides a class and API in which you set form values using Python dictionary-like assignment (the __setitem__() method), for example. Both use BeautifulSoup for parsing "real world" (sloppy tag soup) HTML. (This is highly recommended for dealing with HTML that you encounter in the wild, and strongly discouraged for your own HTML, which should be written to pass standards-conforming, validating parsers.)

Jim Dennis