elementtree

ElementTree XPath - Select Element based on attribute.

I am having trouble using the attribute XPath Selector in ElementTree, which I should be able to do according to the Documentation Here's some sample code XML <root> <target name="1"> <a></a> <b></b> </target> <target name="2"> <a></a> <b></b> </target> </root> Python def parse(document): root = et.parse(doc...
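A minimal sketch of attribute-based selection, assuming Python 3's bundled xml.etree.ElementTree (whose path language supports the [@attrib='value'] predicate) and reusing the sample XML from the excerpt. The variable names are illustrative and are not taken from the original question's truncated code.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question excerpt.
XML = """
<root>
  <target name="1"><a></a><b></b></target>
  <target name="2"><a></a><b></b></target>
</root>
"""

root = ET.fromstring(XML)

# Select the direct <target> children whose name attribute equals "1".
matches = root.findall("target[@name='1']")

for target in matches:
    # Print the matched attribute value and the tags of its children.
    print(target.get("name"), [child.tag for child in target])
```

Older standalone ElementTree releases supported a smaller subset of XPath, so whether this predicate is available depends on the version in use.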

Does Python 2.5 include a package to natively transform an XML document?

In my Python app, I have an XML document that I'd like to transform using my XSL file. I'm currently using xml.etree to generate the XML document, but I haven't found anything within Python 2.5 that will allow me to natively transform my XML document. I've already found one library (libxslt) which can execute the transformation, but I ...

Comparing XML in a unit test in Python

I have an object that can build itself from an XML string, and write itself out to an XML string. I'd like to write a unit test to test round tripping through XML, but I'm having trouble comparing the two XML versions. Whitespace and attribute order seem to be the issues. Any suggestions for how to do this? This is in Python, and I'm usi...
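One possible approach, sketched under the assumption that Python 3.8 or later is available: xml.etree.ElementTree.canonicalize() writes attributes in a fixed order, and its strip_text option drops whitespace around text, so two canonical strings can be compared directly. The helper name xml_equal is hypothetical, and the excerpt is cut off before it names the library in use, so this may not match the asker's setup.

```python
import xml.etree.ElementTree as ET

def xml_equal(left, right):
    """Compare two XML strings, ignoring attribute order and surrounding whitespace.

    Hypothetical helper: relies on ET.canonicalize (Python 3.8+).
    Note that strip_text=True also strips leading and trailing whitespace
    from real text content, which may or may not be acceptable.
    """
    return (ET.canonicalize(left, strip_text=True)
            == ET.canonicalize(right, strip_text=True))

original = '<item id="1" kind="x">\n  <name>foo</name>\n</item>'
round_tripped = '<item kind="x" id="1"><name>foo</name></item>'

print(xml_equal(original, round_tripped))
```

In a unittest test case this could be used as self.assertTrue(xml_equal(a, b)).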

How do I get the whole text of an element using ElementTree?

That is, all text and subtags, without the tag of an element itself? Having <p>blah <b>bleh</b> blih</p> I want blah <b>bleh</b> blih element.text returns "blah " and etree.tostring(element) returns: <p>blah <b>bleh</b> blih</p> ...
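A minimal sketch of one way to get that result, assuming Python 3's xml.etree.ElementTree: join the element's own text with the serialized form of each child. The helper name inner_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

def inner_xml(element):
    """Return the element's text and serialized children, without its own tag."""
    parts = [element.text or ""]
    for child in element:
        # tostring() serializes the child together with its tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

p = ET.fromstring("<p>blah <b>bleh</b> blih</p>")
print(inner_xml(p))  # prints: blah <b>bleh</b> blih
```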

Finding top-level xml comments using Python's ElementTree

I'm parsing an XML file using Python's ElementTree, like this: et = ElementTree(file=file("test.xml")) test.xml starts with a few lines of XML comments. Is there a way to get those comments from et? ...

Python XML - build flat record from dynamic nested "node" elements

Hey guys, I need to parse an XML file and build a record-based output from the data. The problem is that the XML is in a "generic" form, in that it has several levels of nested "node" elements that represent some sort of data structure. I need to build the records dynamically based on the deepest level of the "node" element. Some exa...

Creating a doctype with lxml's etree

I want to add doctypes to my XML documents that I'm generating with lxml's etree. However, I cannot figure out how to add a doctype. Hardcoding and concatenating the string is not an option. I was expecting something along the lines of how PIs are added in etree: pi = etree.PI(...) doc.addprevious(pi) But it's not working for me. How ...

Finding the parent tag of a text string with ElementTree/lxml

I'm trying to take a string of text and "extract" the rest of the text in the paragraph/document from the HTML. My current approach is to find the "parent tag" of the string in the HTML that has been parsed with lxml. (If you know of a better way to tackle this problem, I'm all ears!) For example, search the tree for "TEXT S...

Using SimpleXMLTreeBuilder in elementtree

I have been developing an application with Django and elementtree, and while deploying it to the production server I found out it is running Python 2.4. I have been able to bundle elementtree, but now I am getting the error: "No module named expat; use SimpleXMLTreeBuilder instead" Unfortunately I cannot upgrade Python, so I'm stuck ...

How to create "virtual root" with Python's ElementTree?

I am trying to use Python's ElementTree to generate an XHTML file. However, the ElementTree.Element() just lets me create a single tag (e.g., HTML). I need to create some sort of a virtual root or whatever it is called so that I can put the various , DOCTYPES, etc. How do I do that? Thanks ...

HTML inside node using ElementTree

Hi all! I am using ElementTree to parse an XML file. In some fields, there will be HTML data. For example, consider a declaration as follows: <Course> <Description>Line 1<br />Line 2</Description> </Course> Now, supposing _course is an Element variable which holds this Course element. I want to access this course's description, so I ...

Alternative XML parser for ElementTree to ease UTF-8 woes?

I am parsing some XML with the elementtree.parse() function. It works, except for some UTF-8 characters (single-byte characters above 128). I see that the default parser is XMLTreeBuilder, which is based on expat. Is there an alternative parser that I can use that may be less strict and allow UTF-8 characters? This is the error I'm gett...

Alter namespace prefixing with ElementTree in Python

By default, when you call ElementTree.parse(someXMLfile) the Python ElementTree library prefixes every parsed node with its namespace URI in Clark's Notation: {http://example.org/namespace/spec}mynode This makes accessing specific nodes by name a huge pain later in the code. I've read through the docs on ElementTree and namespa...

How to classify users into different countries, based on the Location field

Most web applications have a Location field, in which users may enter a location of their choice. How would you classify users into different countries, based on the location entered? For example, I used the Stackoverflow dump of users.xml and extracted users' names, reputation and location: ['Jeff Atwood', '12853', 'El Cerrito, CA'] ['Jarr...

Need Help using XPath in ElementTree

I am having a heck of a time using ElementTree 1.3 in Python. Essentially, ElementTree does absolutely nothing. My XML file looks like the following: <?xml version="1.0"?> <ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19"> <Items> <Item> <ItemAttributes> <ListPrice> ...

Error importing a python module in Django

In my Django project, the following line throws an ImportError: "No module named elementtree". from elementtree import ElementTree However, the module is installed (i.e., I can run an interactive Python shell and type that exact line without any ImportError), and the directory containing the module is on the PYTHONPATH. But when I a...

ElementTree in Python 2.6.2 Processing Instructions support?

I'm trying to create XML using the ElementTree object structure in python. It all works very well except when it comes to processing instructions. I can create a PI easily using the factory function ProcessingInstruction(), but it doesn't get added into the elementtree. I can add it manually, but I can't figure out how to add it above...

Parsing XML in Python using ElementTree example

Hello, I'm having a hard time finding a good, basic example of how to parse XML in Python using ElementTree. From what I can find, this appears to be the easiest library to use for parsing XML. Here is a sample of the XML I'm working with: <timeSeriesResponse> <queryInfo> <locationParam>01474500</locationParam> <va...

How to extract the first hit elements from an XML NCBI BLAST file?

Hello all, I'm trying to extract only the first hit from an NCBI XML BLAST file. Next, I would like to get only the first HSP. At the final stage, I would like to get these based on best score. To make things clear, here is a sample of the XML file: <?xml version="1.0"?> <!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "http://www.n...

Accessing XMLNS attribute with Python ElementTree?

How can one access NS attributes using ElementTree? With the following: <data xmlns="http://www.foo.net/a" xmlns:a="http://www.foo.net/a" book="1" category="ABS" date="2009-12-22"> When I try root.get('xmlns') I get back None, while category and date are fine. Any help appreciated. ...
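A minimal sketch of what is going on, assuming Python 3's xml.etree.ElementTree: the parser treats xmlns declarations as namespace bindings rather than ordinary attributes, so they do not appear in the element's attributes. The namespace URI ends up inside the tag in Clark notation, and the prefix mappings can be collected with iterparse() and the "start-ns" event. The sample element is self-closed here so that it parses on its own.

```python
import io
import xml.etree.ElementTree as ET

# Element from the question, self-closed so it is a complete document.
XML = ('<data xmlns="http://www.foo.net/a" xmlns:a="http://www.foo.net/a" '
       'book="1" category="ABS" date="2009-12-22"/>')

root = ET.fromstring(XML)

# Ordinary attributes are available; xmlns declarations are not.
print(root.get("category"))
print(root.get("xmlns"))

# The namespace URI is embedded in the tag as {uri}localname.
uri = root.tag[1:].split("}")[0] if root.tag.startswith("{") else None
print(uri)

# "start-ns" events yield (prefix, uri) pairs for each declaration.
namespaces = dict(
    ns for _event, ns in ET.iterparse(io.BytesIO(XML.encode("utf-8")),
                                      events=("start-ns",))
)
print(namespaces)
```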