Musing over a recently asked question, I started to wonder if there is a really simple way to deal with XML documents in Python. A pythonic way, if you will.
Perhaps I can explain best if i give example: let's say the following - which i think is a good example of how XML is (mis)used in web services - is the response i get from http request to http://www.google.com/ig/api?weather=94043
<xml_api_reply version="1">
<weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0" >
<forecast_information>
<city data="Mountain View, CA"/>
<postal_code data="94043"/>
<latitude_e6 data=""/>
<longitude_e6 data=""/>
<forecast_date data="2010-06-23"/>
<current_date_time data="2010-06-24 00:02:54 +0000"/>
<unit_system data="US"/>
</forecast_information>
<current_conditions>
<condition data="Sunny"/>
<temp_f data="68"/>
<temp_c data="20"/>
<humidity data="Humidity: 61%"/>
<icon data="/ig/images/weather/sunny.gif"/>
<wind_condition data="Wind: NW at 19 mph"/>
</current_conditions>
...
<forecast_conditions>
<day_of_week data="Sat"/>
<low data="59"/>
<high data="75"/>
<icon data="/ig/images/weather/partly_cloudy.gif"/>
<condition data="Partly Cloudy"/>
</forecast_conditions>
</weather>
</xml_api_reply>
After loading/parsing such document, i would like to be able to access the information as simple as say
>>> xml['xml_api_reply']['weather']['forecast_information']['city'].data
'Mountain View, CA'
or
>>> xml.xml_api_reply.weather.current_conditions.temp_f['data']
'68'
From what I saw so far, seems that ElementTree
is the closest to what I dream of. But it's not there, there is still some fumbling to do when consuming XML. OTOH, what I am thinking is not that complicated - probably just thin veneer on top of a parser - and yet it can decrease annoyance of dealing with XML. Is there such a magic? (And if not - why?)
PS. Note I have tried BeautifulSoup
already and while I like its approach, it has real issues with empty <element/>
s - see below in comments for examples.