ansaurus

Question

How to get whole text of an Element in xml.minidom?

Answer 1

+1 A:

Strictly speaking:

from xml.dom.minidom import parse, parseString
tree = parseString("<div id='asd'><pre>skdsk</pre></div>")
root = tree.firstChild
node = root.childNodes[0]
print node.toxml()

In practice, though, I'd recommend looking at the http://www.crummy.com/software/BeautifulSoup/ library. Finding the right childNode in an xhtml document, and skipping "whitespace nodes" is a pain. BeautifulSoup is a robust html/xhtml parser with fantastic tree-search capacilities.

Edit: The example above compresses the HTML into one string. If you use the HTML as in the question, the line breaks and so-forth will generate "whitespace" nodes, so the node you want won't be at childNodes[0].

Jarret Hardie 2009-03-20 15:54:51

ansaurus

tags:

views:

answers:

How to get whole text of an Element in xml.minidom?

related questions