views:

94

answers:

2

How can I convert xml to Python data structure using lxml?

I have searched high and low but can't find anything.

Input example

<ApplicationPack>
  <name>Mozilla Firefox</name>
  <shortname>firefox</shortname>
  <description>Leading Open Source internet browser.</description>
  <version>3.6.3-1</version>
  <license name="Firefox EULA">http://www.mozilla.com/en-US/legal/eula/firefox-en.html&lt;/license&gt;
  <ms-license>False</ms-license>
  <vendor>Mozilla Foundation</vendor>
  <homepage>http://www.mozilla.org/firefox&lt;/homepage&gt;
  <icon>resources/firefox.png</icon>
  <download>http://download.mozilla.org/?product=firefox-3.6.3&amp;amp;os=win&amp;amp;lang=en-GB&lt;/download&gt;
  <crack required="0"/>
  <install>scripts/install.sh</install>
  <postinstall name="Clean Up"></postinstall>
  <run>C:\\Program Files\\Mozilla Firefox\\firefox.exe</run>
  <uninstall>c:\\Program Files\\Mozilla Firefox\\uninstall\\helper.exe /S</uninstall>
  <requires name="autohotkey" />
</ApplicationPack>
+1  A: 
>>> from lxml import etree
>>> treetop = etree.fromstring(anxmlstring)

converts the xml in the string to a Python data structure, and so does

>>> othertree = etree.parse(somexmlurl)

where somexmlurl is the path to a local XML file or the URL of an XML file on the web.

The Python data structure these functions provide (known as an "element tree", whence the etree module name) is well documented here -- all the classes, functions, methods, etc, that the Python data structure in question supports. It closely matches one supported in the Python standard library, by the way.

If you want some different Python data structure, you'll have to walk through the Python data structure which lxml returns, as above mentioned, and build your different data structure yourself based on the information thus collected; lxml can't specifically help you, except by offering several helpers for finding information in the parsed structure it returns, so that collecting said info is a flexible, easy task (again, see the documentation URL above).

Alex Martelli
A: 

It's not entirely clear what kind of data structure you're looking for, but here's a link to a code sample to convert XML to python dictionary of lists via lxml.etree.

ars