I'd like to parse a simple, small XML file using Python; however, work on PyXML seems to have ceased. I'd like to use Python 2.6 if possible. Can anyone recommend an XML parser that will work with 2.6?
Thanks
If it's small and simple then just use the standard library:
from xml.dom.minidom import parse
doc = parse("filename.xml")
This will return a DOM tree implementing the standard Document Object Model API.
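To give an idea of what working with that DOM tree looks like, here is a minimal sketch. The sample document, the element and attribute names, and the read_servers helper are all made up for illustration; parseString is used instead of parse only so the example runs without a file on disk.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document; parse("filename.xml") returns the same kind of object.
XML_TEXT = """<config>
  <server name="alpha" port="8080">primary</server>
  <server name="beta" port="8081">backup</server>
</config>"""


def read_servers(xml_text):
    """Return a list of (name, port, text) tuples, one per <server> element."""
    doc = parseString(xml_text)
    servers = []
    for node in doc.getElementsByTagName("server"):
        name = node.getAttribute("name")
        port = int(node.getAttribute("port"))
        # Text content lives in child text nodes, not on the element itself.
        text = "".join(child.data for child in node.childNodes
                       if child.nodeType == child.TEXT_NODE)
        servers.append((name, port, text.strip()))
    return servers


servers = read_servers(XML_TEXT)
for name, port, role in servers:
    print("%s:%d (%s)" % (name, port, role))
```

The calls used here (getElementsByTagName, getAttribute, childNodes, nodeType) are part of the standard DOM API, so the same approach should carry over to a document loaded with parse.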
If you later need to do complex things like schema validation or XPath querying then I recommend the third-party lxml module, which is a wrapper around the popular libxml2 C library.
Would lxml suit your needs? It's the first tool I turn to for XML parsing.
Here is also a very good example of how to use minidom, along with explanations.
A few years ago, I wrote a library for working with structured XML. It makes XML simpler by making some limiting assumptions.
You could use XML for something like a word processor document, where you have a complicated soup of content with XML tags embedded all over the place; my library would not be a good fit for that.
But if you are using XML for something like a config file, my library is rather convenient. You define classes that describe the structure of the XML you want, and once you have the classes done, there is a method to slurp in XML and parse it. The actual parsing is done by xml.dom.minidom, but then my library extracts the data and puts it in the classes.
The best part: you can declare a "Collection" type that will be a Python list with zero or more other XML elements inside it. This is great for things like Atom or RSS feeds (which was the original reason I designed the library).
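To illustrate the general idea (this is not the library's actual API, just a rough sketch of the same approach using minidom directly), something like the following maps a small feed-like document onto plain classes. The Feed and Entry classes, the helper functions, and the sample XML are all hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical feed-like document, loosely modelled on Atom.
SAMPLE = """<feed>
  <title>Example feed</title>
  <entry><title>First</title><link>http://example.com/1</link></entry>
  <entry><title>Second</title><link>http://example.com/2</link></entry>
</feed>"""


def _children(element, tag):
    """Return the direct child elements of `element` named `tag`."""
    return [n for n in element.childNodes
            if n.nodeType == n.ELEMENT_NODE and n.tagName == tag]


def _text(element):
    """Return the concatenated text directly inside `element`."""
    return "".join(n.data for n in element.childNodes
                   if n.nodeType == n.TEXT_NODE).strip()


def _first_text(element, tag):
    """Return the text of the first direct <tag> child, or an empty string."""
    matches = _children(element, tag)
    return _text(matches[0]) if matches else ""


class Entry(object):
    """One <entry> element: a title and a link."""
    def __init__(self, title, link):
        self.title = title
        self.link = link


class Feed(object):
    """The whole document: a title plus a list of zero or more entries."""
    def __init__(self, title, entries):
        self.title = title
        self.entries = entries  # plain Python list of Entry objects

    @classmethod
    def from_xml(cls, xml_text):
        root = parseString(xml_text).documentElement
        entries = [Entry(_first_text(e, "title"), _first_text(e, "link"))
                   for e in _children(root, "entry")]
        return cls(_first_text(root, "title"), entries)


feed = Feed.from_xml(SAMPLE)
for entry in feed.entries:
    print("%s -> %s" % (entry.title, entry.link))
```

The entries attribute plays the role of the "Collection" described above: it is an ordinary list, and it simply ends up empty when the document has no entry elements.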
Here's the URL: http://home.avvanta.com/~steveha/xe.html
I'd be happy to answer questions if you have any.
For most of my tasks I have used minidom, the lightweight DOM implementation in the standard library. From the official documentation:
from xml.dom.minidom import parse, parseString
dom1 = parse('c:\\temp\\mydata.xml') # parse an XML file by name
datasource = open('c:\\temp\\mydata.xml')
dom2 = parse(datasource) # parse an open file
dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')
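As a small follow-up, here is a sketch of how the mixed content in the last document could be inspected once it is parsed. The variable names (root, texts, tags) are my own, not from the documentation.

```python
from xml.dom.minidom import parseString

dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')
root = dom3.documentElement
root_name = root.tagName

# Text nodes that are direct children of <myxml>, in document order.
texts = [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]
# Tag names of the child elements.
tags = [n.tagName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

print(root_name)
print(texts)
print(tags)
```

Note that the text on either side of the empty element comes back as two separate text nodes, so code that wants "all the text" needs to join them itself.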