+1  Q: 

python parsing xml

Hi, I have an XML file which I want to parse. It looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
        ...
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
        ...
    </SHOPITEM>
</SHOP>

My parsing code is:

from lxml import etree

ifile = open('sample-file.xml', 'r')
file_data = etree.parse(ifile)

for item in file_data.iter('SHOPITEM'):
    print(item)

But the items are printed only when the XML container element

<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">

looks like

<SHOP>

How can I parse the XML document without worrying about the namespace declarations on this container element?

+1  A: 

The lxml documentation explains how lxml.etree handles namespaces. In general, you should work with them rather than try to avoid them. In this case, write:

for item in file_data.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'):
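To see why the bare 'SHOPITEM' matches nothing, it can help to print the tags the parser actually stores. This is only a minimal sketch: SAMPLE is a trimmed inline copy of the document from the question (the elided parts are left out), and the fallback import to the standard library's ElementTree is an assumption added so the snippet runs without lxml installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: fall back to the standard library, which stores
    # namespaced tags in the same '{uri}name' form.
    import xml.etree.ElementTree as etree

# Trimmed, hypothetical copy of the sample file, as bytes so the
# encoding declaration is accepted by fromstring().
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
    </SHOPITEM>
</SHOP>"""

root = etree.fromstring(SAMPLE)

# The default namespace becomes part of every element's tag.
print(root.tag)
print([child.tag for child in root])

# The bare name finds nothing; the namespace-qualified name finds both items.
bare = list(root.iter('SHOPITEM'))
qualified = list(root.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'))
print(len(bare), len(qualified))
```

Because the default xmlns applies to every unprefixed element, child elements such as ID need the same qualified form.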

If you need to refer to the namespace frequently, set up a local variable:

xhtml_ns = '{http://www.w3.org/1999/xhtml}'
...
for item in file_data.iter(xhtml_ns + 'SHOPITEM'):
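Putting it together, here is a hedged end-to-end sketch of reading the ID of each item. SAMPLE is again a trimmed, hypothetical inline copy of the question's document, the names XHTML_URI and XHTML_NS are illustrative, and the ElementTree fallback import is an assumption so the snippet runs without lxml. The second lookup uses a prefix map with findall() as an alternative to string concatenation; the prefix 'x' is arbitrary.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the standard library accepts the same calls used below.
    import xml.etree.ElementTree as etree

# Trimmed, hypothetical copy of the sample file (bytes, because it
# carries an encoding declaration).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl">
    <SHOPITEM>
        <ID>2332</ID>
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
    </SHOPITEM>
</SHOP>"""

XHTML_URI = 'http://www.w3.org/1999/xhtml'
XHTML_NS = '{' + XHTML_URI + '}'

root = etree.fromstring(SAMPLE)

# Variant 1: qualify each tag by prepending the namespace variable.
ids = [item.find(XHTML_NS + 'ID').text
       for item in root.iter(XHTML_NS + 'SHOPITEM')]
print(ids)

# Variant 2: map a prefix of your choosing to the namespace URI.
ns = {'x': XHTML_URI}
ids_via_prefix = [item.find('x:ID', namespaces=ns).text
                  for item in root.findall('x:SHOPITEM', namespaces=ns)]
print(ids_via_prefix)
```

Both variants give the same list of IDs; the prefix map tends to read better once paths get longer.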
Marcelo Cantos