ansaurus

Question

Answer 1

A:

For logging in and browsing, use mechanize, and for extracting data, use BeautifulSoup.

leoluk 2010-09-20 16:13:39

BeautifulSoup? It's too complicated.

2010-09-20 16:28:07

Answer 2

+2 A:

What you describe is a relatively small task in Python, and I would guess there are not many tutorials about it.

Basically, to retrieve a document (whether XML or CSV) from a website you can use urllib2:

import urllib2
data = urllib2.urlopen("http://www.example.org/something.xml").read()
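
On Python 3, urllib2 no longer exists; its functionality moved to urllib.request. A minimal sketch of the same download follows, assuming Python 3. The helper name fetch and the 10-second timeout are illustrative choices, not part of the original answer.

```python
import urllib.request


def fetch(url, timeout=10):
    """Return the raw bytes of the document at `url` (hypothetical helper)."""
    # The context manager closes the connection once the body has been read.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# Usage, with the same placeholder URL as above:
# data = fetch("http://www.example.org/something.xml")
```

Note that read() returns bytes on Python 3, so text formats such as CSV need a data.decode(...) step with the right encoding before further processing.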

To work with XML you could use ElementTree:

import xml.etree.ElementTree as ElementTree
rootelem = ElementTree.fromstring(data)

Now you can inspect the XML tree with the ElementTree API. See the documentation for further information.
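
As a rough illustration of inspecting the tree, the sketch below parses a small made-up document and collects one attribute and one child value per element. The tag names (catalog, item, name) and the sample data are assumptions for the example only; a real feed will use its own structure.

```python
import xml.etree.ElementTree as ElementTree

# Hypothetical sample document standing in for the downloaded data.
data = b"""<?xml version="1.0"?>
<catalog>
  <item id="1"><name>alpha</name></item>
  <item id="2"><name>beta</name></item>
</catalog>"""

rootelem = ElementTree.fromstring(data)

# iter() walks every descendant with the given tag;
# get() reads an attribute and findtext() reads a child element's text.
items = [(item.get("id"), item.findtext("name")) for item in rootelem.iter("item")]

print(rootelem.tag)
print(items)
```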

To work with CSV you could use the csv module:

import csv
csvreader = csv.reader(data.splitlines())  # the reader expects an iterable of lines, not one big string

You can read the values in a simple for loop. Again, see the documentation.
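
A minimal sketch of such a loop, assuming Python 3 and text that has already been decoded to a string. The sample data and column names are made up for illustration.

```python
import csv

# Hypothetical sample standing in for the downloaded (and decoded) text.
data = 'name,price\nalpha,1.50\n"beta, large",2.25\n'

# csv.reader takes an iterable of lines, so split the text first.
rows = list(csv.reader(data.splitlines()))
header, records = rows[0], rows[1:]

# Each record is a list of strings; pair it with the header for readability.
for record in records:
    print(dict(zip(header, record)))
```

Splitting on line breaks is fine for simple files. If quoted fields may contain embedded newlines, wrapping the text in io.StringIO(data, newline="") and passing that to csv.reader is the safer choice.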

If this doesn't answer your question, please describe in more detail what you would like to achieve.

Noya 2010-09-20 21:32:16

Your altruism is legendary. Well done.

jathanism 2010-09-20 22:51:29

Good tutorial on XML/CSV download from websites