Hi.

I have the following XML file:

<book>
 <bookname child="test">
  <text> Works </text>
  <text> Doesn't work </text>
 </bookname>
</book>

This is just one block; the document contains more than one <bookname> tag. I need to iterate through the whole document and remove specific <text> tags. How do I do that?

My approach is to create an ElementTree first and then get an Element instance using ElementTree.getroot(). Then I use Element.clear(). Is this approach OK? I wanted to use Element.remove(), but I can't get it to work. Can anyone provide a sample of the syntax?

Thank you for the help!

+1  A: 

Just call parentNode.remove(childNode). Something like this:

>>> etree.tostring(tree)
'<book> <bookname child="test">  <text> Works </text>  <text> Doesnt work </text>    </bookname></book>'
>>> bookname=tree[0]
>>> text2=bookname[1]
>>> bookname.remove(text2)
>>> etree.tostring(tree)
'<book> <bookname child="test">  <text> Works </text>  </bookname></book>'
>>>

Here I take the bookname node and ask it to remove its second child.

For finding the nodes you want to remove, I'd use XPath.
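
To apply this to a whole document rather than one hand-picked child, something along these lines might work. It is only a sketch: it uses the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra installs, the helper name remove_matching_text is made up for illustration, and it assumes the <text> tags to drop are identified by their whitespace-stripped text.

```python
import xml.etree.ElementTree as ET

XML = """<book>
 <bookname child="test">
  <text> Works </text>
  <text> Doesn't work </text>
 </bookname>
</book>"""


def remove_matching_text(root, unwanted):
    """Remove each <text> child of a <bookname> whose stripped text equals `unwanted`.

    Hypothetical helper for illustration; returns the number of removed elements.
    """
    removed = 0
    for bookname in root.iter("bookname"):
        # Work on a snapshot of the children so removal cannot disturb the loop.
        for text_el in list(bookname.findall("text")):
            if (text_el.text or "").strip() == unwanted:
                # remove() must be called on the direct parent of the element.
                bookname.remove(text_el)
                removed += 1
    return removed


root = ET.fromstring(XML)
count = remove_matching_text(root, "Doesn't work")
remaining = [t.text.strip() for t in root.iter("text")]
print(count, remaining)
```

The important detail is that remove() only works when called on the element's direct parent, which is probably why calling it on the root did not work. With lxml, the lookup could instead be a single XPath query, and each match's parent is available through getparent().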

unbeli
How do I compare text in XPath?
sukhbir
Use text()='whatever'. More at http://www.w3.org/TR/xpath/
unbeli
Ok thanks. Let me try it out.
sukhbir
lxml is wonderfully elegant. I've always loved how Pythonic its usage seems to be. This is the answer I came here to give.
nearlymonolith