I am having a heck of a time using ElementTree 1.3 in Python. Essentially, ElementTree does absolutely nothing.

My XML file looks like the following:

<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>

All I want to do is extract the ListPrice.

This is the code I am using...

from elementtree import ElementTree as ET

>> fp = open("output.xml","r")
>> element = ET.parse(fp).getroot()
>> e = element.findall('ItemSearchResponse/Items/Item/ItemAttributes/ListPrice/Amount')
>> for i in e:
>>    print i.text
>>
>> e
>>

Absolutely no output. I also tried

>> e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')

No difference.

What am I doing wrong?

+3  A: 

You have two problems.

1) element holds the root Element (the return value of getroot()), not an ElementTree, so search paths are resolved relative to the root and must not begin with the root tag ItemSearchResponse.

2) Your search string needs to use namespaces if you keep the namespace in the XML.
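
To see the symptom in isolation, here is a minimal sketch. It assumes Python 3 and the standard-library xml.etree.ElementTree rather than the standalone elementtree package used in the question, and it parses a trimmed copy of the document from a string (the XML and NS_URI names are mine, for illustration only). It shows that the child tags carry the namespace, which is why the plain path matches nothing.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the document from the question (assumption: inlined
# here instead of being read from output.xml).
NS_URI = "http://webservices.amazon.com/AWSECommerceService/2008-08-19"
XML = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML)

# The path without a namespace matches nothing, so findall returns an empty list.
plain = root.findall("Items/Item/ItemAttributes/ListPrice/Amount")
print(plain)

# The direct children of the root are stored with the namespace in braces.
child_tags = [child.tag for child in root]
print(child_tags)
```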

To fix problem #1:

You need to change:

element = ET.parse(fp).getroot()

to:

element = ET.parse(fp)

To fix problem #2:

You can remove the xmlns attribute from the XML document so it looks like this:

<?xml version="1.0"?>
<ItemSearchResponse>
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>

With this document you can use the following search string:

e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')

The full code:

from elementtree import ElementTree as ET
fp = open("output.xml","r")
element = ET.parse(fp)
e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')
for i in e:
  print i.text
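
If editing the file by hand is not practical, roughly the same effect can be had in memory after parsing. The following is only a sketch under assumptions: Python 3 with the standard-library xml.etree.ElementTree, a trimmed inline copy of the document, and a helper named strip_namespaces that I made up for illustration (it is not part of ElementTree).

```python
import xml.etree.ElementTree as ET

XML = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""


def strip_namespaces(root):
    """Hypothetical helper: drop the '{uri}' prefix from every tag, in place."""
    for el in root.iter():
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(XML))

# With the prefixes gone, the plain search string from above matches.
amounts = [a.text for a in root.findall("Items/Item/ItemAttributes/ListPrice/Amount")]
print(amounts)
```

Note that this throws the namespace information away, so it only makes sense when the document uses a single namespace and you do not need to write it back out.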

Alternate fix to problem #2:

Otherwise you need to specify the xmlns inside the search string for each element.

The full code:

from elementtree import ElementTree as ET
fp = open("output.xml","r")
element = ET.parse(fp)

namespace = "{http://webservices.amazon.com/AWSECommerceService/2008-08-19}"
e = element.findall('{0}Items/{0}Item/{0}ItemAttributes/{0}ListPrice/{0}Amount'.format(namespace))
for i in e:
    print i.text


Both print:

2260
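
A shorter way to write the same search, assuming a version of ElementTree whose findall accepts a namespaces mapping (the standard-library xml.etree.ElementTree in Python 3 does), is to bind a prefix to the URI. The prefix "aws" and the inline XML below are my own choices for the sketch, not something from the document.

```python
import xml.etree.ElementTree as ET

XML = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""

# Arbitrary prefix mapped to the namespace URI; it only has to match
# the prefix used in the search path, not anything in the XML itself.
NS = {"aws": "http://webservices.amazon.com/AWSECommerceService/2008-08-19"}

root = ET.fromstring(XML)
path = "aws:Items/aws:Item/aws:ItemAttributes/aws:ListPrice/aws:Amount"
prices = [a.text for a in root.findall(path, NS)]
print(prices)
```

Only the ListPrice amount is matched; the offer price sits under a different path and is left out.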

Brian R. Bondy
Thank you so much. Was about to bang my head against a wall repeatedly.
Ryan Rosario
No problem, they should give an example with namespaces in their documentation for find and findall.
Brian R. Bondy
+1  A: 

ElementTree qualifies element names with their namespace, so every element in your XML has a name like {http://webservices.amazon.com/AWSECommerceService/2008-08-19}Items.

So make the search include the namespace, e.g.

search = '{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Items/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Item/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}ItemAttributes/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}ListPrice/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Amount'
element.findall( search )

gives the element corresponding to 2260
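
Since the long search string is tedious to type, it can be built from the plain path. This is just a sketch, assuming Python 3 and the standard-library xml.etree.ElementTree; the qualify helper is a name I invented for illustration, and it only handles simple slash-separated tag names.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://webservices.amazon.com/AWSECommerceService/2008-08-19"
XML = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""


def qualify(path, uri):
    """Hypothetical helper: prefix each step of a simple path with '{uri}'."""
    return "/".join("{%s}%s" % (uri, step) for step in path.split("/"))


search = qualify("Items/Item/ItemAttributes/ListPrice/Amount", NS_URI)
root = ET.fromstring(XML)
texts = [el.text for el in root.findall(search)]
print(texts)
```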

Mark
I think you mean: 2260
Brian R. Bondy
Yes, laziness on my part. I just saw Python print the Amount element and its address; I did not go the extra step and check what text the Element had.
Mark
+2  A: 
from xml.etree import ElementTree as ET
tree = ET.parse("output.xml")
namespace = tree.getroot().tag[1:].split("}")[0]
amount = tree.find(".//{%s}Amount" % namespace).text
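
A slightly more defensive variant of the same idea, as a sketch only: it assumes Python 3 and the standard-library module, parses an inline copy of the document instead of output.xml, and uses a helper name (namespace_of) that I made up. It returns an empty string when the root has no namespace, and it narrows the search to ListPrice so the offer's Amount cannot be picked up by accident.

```python
import xml.etree.ElementTree as ET

XML = """<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing>
            <Price>
              <Amount>1853</Amount>
            </Price>
          </OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""


def namespace_of(element):
    """Hypothetical helper: return the namespace URI of an element's tag, or ''."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


root = ET.fromstring(XML)
ns = namespace_of(root)

# Search anywhere below the root, but only for an Amount directly under ListPrice.
node = root.find(".//{%s}ListPrice/{%s}Amount" % (ns, ns))
list_price = node.text if node is not None else None
print(list_price)
```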

Also, consider using lxml. It's way faster.

from lxml import etree as ET
Gonsalu