from xml.dom.minidom import parseString
dom = parseString(data)
data = dom.getElementsByTagName('data')

The 'data' variable comes back as an element object, but I can't for the life of me find in the documentation how to grab the text value of the element.

For example:

<something><data>I WANT THIS</data></something>

Anyone have any ideas?

+1  A: 

This should do the trick:

dom = parseString('<something><data>I WANT THIS</data></something>')
data = dom.getElementsByTagName('data')[0].childNodes[0].data

i.e. you need to wade deeper into the DOM structure to get at the text child node and then access its value.

Andy
Note that in the case of an empty string there will be no child Text Node so childNodes[0] will fail.
bobince
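A minimal sketch of guarding against the empty-element case described in the comment above. The helper name `first_text` and its `default` parameter are made up for illustration; they are not part of minidom.

```python
from xml.dom.minidom import parseString

def first_text(element, default=''):
    """Return the data of the first child if it is a text or CDATA node.

    Hypothetical helper: falls back to `default` when the element is empty,
    so there is no IndexError from childNodes[0].
    """
    child = element.firstChild  # None when the element has no children
    if child is None:
        return default
    if child.nodeType not in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
        return default
    return child.data

dom = parseString('<something><data></data><data>I WANT THIS</data></something>')
empty, full = dom.getElementsByTagName('data')

print(repr(first_text(empty)))
print(first_text(full))
```

Using `firstChild` instead of `childNodes[0]` means an empty element yields `None` rather than raising, which makes the check a simple comparison.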
To collect text data properly, one has to traverse childNodes and concatenate the data from every node whose node.nodeType is either TEXT_NODE or CDATA_SECTION_NODE. The ElementTree interface is simpler.
Denis Otkidach
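The traversal described in the comment above could be sketched roughly as follows, with the ElementTree equivalent alongside for comparison. The helper name `get_text` and the sample input containing a CDATA section are assumptions made for illustration.

```python
from xml.dom import Node
from xml.dom.minidom import parseString
import xml.etree.ElementTree as ET

def get_text(element):
    """Concatenate the data of all direct text and CDATA children.

    Hypothetical helper; it does not descend into nested elements.
    """
    return ''.join(
        child.data
        for child in element.childNodes
        if child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    )

# Sample input (assumed): the text is split between plain text and CDATA.
xml = '<something><data>I WANT <![CDATA[THIS]]></data></something>'

# minidom: walk the children and join their data.
dom = parseString(xml)
text_dom = get_text(dom.getElementsByTagName('data')[0])

# ElementTree: the element's text is available directly as .text.
root = ET.fromstring(xml)
text_et = root.find('data').text

print(text_dom)
print(text_et)
```

Note that for an empty element `get_text` returns an empty string, while ElementTree's `.text` is `None`, so the two are not drop-in replacements for each other in that case.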
+2  A: 

So the way to look at it is that "I WANT THIS" is actually another node. It's a text child of "data".

from xml.dom.minidom import parseString
dom = parseString(data)
nodes = dom.getElementsByTagName('data')

At this point, "nodes" is a NodeList and in your example, it has one item in it which is the "data" element. Correspondingly the "data" element also only has one child which is a text node "I WANT THIS".

So you could just do something like this:

print(nodes[0].firstChild.nodeValue)

Note that in the case where you have more than one tag called "data" in your input, you should use some sort of iteration technique on "nodes" rather than index it directly.

Brent Nash
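For the multi-tag case mentioned in the answer above, iterating over the NodeList might look like the sketch below. The sample input and the choice to record an empty string for empty elements are assumptions, not something the original question specifies.

```python
from xml.dom.minidom import parseString

# Sample input (assumed): several "data" tags, one of them empty.
xml = '<something><data>first</data><data>second</data><data/></something>'
dom = parseString(xml)

values = []
for node in dom.getElementsByTagName('data'):
    child = node.firstChild
    # An empty element has no text child, so firstChild is None.
    values.append(child.nodeValue if child is not None else '')

print(values)
```

This reads only the first child of each element; if an element mixes text with CDATA or nested tags, the children would need to be traversed and joined instead.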