Hello

I am using the cElementTree library to parse XML files in Python. Everything is working fine, but I would like to provide full error messages to the user when a value in the XML is not correct.

For example, let's suppose I have the following XML:

<A name="xxxx" href="yyyy"/>

and I want to tell the user if the href attribute doesn't exist or has a value that is not in a given list.

For the moment, I have something like

    if elem.get("href") not in myList:
        raise XMLException(elem, "the 'href' attribute is not valid or does not exist")

where my exception is caught somewhere.

But, in addition, I would like to display the line number of the XML element in the file. It seems that cElementTree doesn't store any information about the line numbers of the XML elements in the tree... :-(

Question: Is there an equivalent XML library that is able to do that? Or is there a way to access the position of an XML element in the XML file?

Thanks

+3  A: 

The equivalent library that you should be using is lxml. lxml is a wrapper around the very fast C libraries libxml2 and libxslt, and is generally considered superior to the built-in ones.

Luckily, it tries to keep to the ElementTree API and extends it in lxml.etree.

Elements parsed by lxml.etree have a sourceline attribute, which is just what you are after.

So using elem.sourceline in the error message above should work.
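
To make that concrete, here is a minimal sketch, assuming lxml is installed. The XMLException class, the check_hrefs helper, the ALLOWED_HREFS set and the sample document are all made up for illustration; only elem.sourceline, etree.fromstring and the usual ElementTree-style calls come from lxml itself.

```python
from lxml import etree

# Hypothetical list of accepted values (stands in for myList in the question).
ALLOWED_HREFS = {"yyyy", "zzzz"}


class XMLException(Exception):
    """Hypothetical validation error that remembers where the element came from."""

    def __init__(self, elem, message):
        # sourceline is filled in by lxml's parser; it is None for
        # elements that were created in code rather than parsed.
        self.line = elem.sourceline
        super().__init__("line %s, <%s>: %s" % (self.line, elem.tag, message))


def check_hrefs(root):
    """Raise XMLException for the first <A> element with a bad or missing href."""
    for elem in root.iter("A"):
        if elem.get("href") not in ALLOWED_HREFS:
            raise XMLException(
                elem, "the 'href' attribute is not valid or does not exist"
            )


# Sample input: the second <A> element has an href that is not in the list.
xml_text = b"""<root>
  <A name="first" href="yyyy"/>
  <A name="second" href="bad"/>
</root>"""

root = etree.fromstring(xml_text)

try:
    check_hrefs(root)
except XMLException as exc:
    # The message starts with the line number of the offending element.
    print(exc)
```

The line number is read once, when the exception is created, so the handler that catches it does not need access to the element or the tree.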

David Raznick
OK, thank you for the answer. lxml works fine and the elements have a sourceline attribute. BUT on my old machine, lxml is relatively slow compared to cElementTree (from 25% to 50% slower, depending on the input file).
ThibThib
See http://codespeak.net/lxml/performance.html. lxml is slower at loading and parsing than cElementTree, but quicker at tree traversal and serialization.
David Raznick