In Python 2.6 using ElementTree, what's a good way to fetch the XML (as a string) inside a particular element, like what you can do in HTML and JavaScript with innerHTML?
Here's a simplified sample of the XML node I am starting with:
<label attr="foo" attr2="bar">This is some text <a href="foo.htm">and a link</a> in embedded HTML</label>
I'd like to end up with this string:
This is some text <a href="foo.htm">and a link</a> in embedded HTML
I've tried iterating over the parent node and concatenating the tostring() of the children, but that gave me only the subnodes:
# returns only subnodes (e.g. <a href="foo.htm">and a link</a>)
''.join([et.tostring(sub, encoding="utf-8") for sub in node])
I can hack up a solution using regular expressions, but was hoping there'd be something less hacky than this:
re.sub(r"</\w+?>\s*?$", "", re.sub(r"^\s*?<\w*?>", "", et.tostring(node, encoding="utf-8")))
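For comparison, here is a minimal sketch of one regex-free direction, building on the concatenation attempt above. It assumes that the leading text is stored on the parent as node.text and that tostring() on a child also serializes that child's trailing text (its tail). The helper name inner_xml is hypothetical, not part of ElementTree. It relies on the default us-ascii output of tostring(), so non-ASCII characters would come back as numeric character references.

```python
import xml.etree.ElementTree as et


def inner_xml(node):
    """Return the serialized content of node without its own start/end tags.

    Hypothetical helper for illustration only.
    """
    # Text before the first child is stored on the parent, not on any child.
    parts = [node.text or ""]
    for sub in node:
        # tostring() emits the child element followed by its tail text.
        # The default output is ASCII, so decoding as ASCII is safe.
        parts.append(et.tostring(sub).decode("ascii"))
    return "".join(parts)


node = et.fromstring(
    '<label attr="foo" attr2="bar">This is some text '
    '<a href="foo.htm">and a link</a> in embedded HTML</label>'
)
result = inner_xml(node)
print(result)
```

The only difference from the list comprehension in the attempt above is the extra node.text term in front of the joined children.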