views:

29

answers:

3

I'm looking to parse a xml file using Python and I was wondering if there was any way of automating the task over manually walking through all xml nodes/attributes using xml.dom.minidom library.

Essentially what would be sweet is if I could load a xml schema for the xml file I am reading then have that automatically generate some kind of data struct/set with all of the data within the xml.

In C# land this is possible via creating a strongly typed dataset class from a xml schema and then using this dataset to read the xml file in.

Is there any equivalent in Python?

A: 

hey dude - take beautifulSoup - it is a super library. HEAD over to the site scraperwiki.com

the can help you!

Webworker
I originally considered beautifulSoup, but I never found a way of making it do what I mentioned above.
Nick
A: 

You might take a look at lxml.objectify, particularly the E-factory. It's not really an equivalent to the ADO tools, but you may find it useful nonetheless.

Robert Rossney
+2  A: 

lxml is a super-robust xml parsing package. It includes a subpackage, lxml.objectify, that will make an object tree from your xml.

It doesn't generate a class from the schema -- that's probably more a C#/Java thing -- but it does do schema validation so you know what kind of object you're getting back (see "asserting a schema").

ianmclaury
lxml.objectify is pretty close to what I was going for. Thanks!
Nick