python parsing xml | ansaurus

tags:

views:

106

answers:

1

+1 Q:

python parsing xml

hi i have xml file whitch i want to parse, it looks something like this

<?xml version="1.0" encoding="utf-8"?>
<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl"&gt;
    <SHOPITEM>
        <ID>2332</ID>
        ...
    </SHOPITEM>
    <SHOPITEM>
        <ID>4433</ID>
        ...
    </SHOPITEM>
</SHOP>

my parsing code is

from lxml import etree

ifile = open('sample-file.xml', 'r')
file_data = etree.parse(ifile)

for item in file_data.iter('SHOPITEM'):
   print item

but item is print only when xml container

<SHOP xmlns="http://www.w3.org/1999/xhtml" xmlns:php="http://php.net/xsl"&gt;

looks like

<SHOP>

how can i parse xml document without worrying about this container definition?

+1 A:

See here for an explanation of how lxml.etree handles namespaces. In general, you should work with them rather than try to avoid them. In this case, write:

for item in file_data.iter('{http://www.w3.org/1999/xhtml}SHOPITEM'):

If you need to refer the namespace frequently, setup a local variable:

xhtml_ns = '{http://www.w3.org/1999/xhtml}'
...
for item in file_data.iter(xhtml_ns + 'SHOPITEM'):

Marcelo Cantos 2010-09-03 09:27:40

related questions

Load an XmlNodeList into an XmlDocument without looping?

Does System.Xml use MSXML?

Using an XML catalog with Python's lxml?

Why Are People Still Creating RSS Feeds?

Pretty printing XML files on Emacs

Application configuration files

What is the best XML editor?

How much extra overhead is generated when sending a file over a web service as a byte array?

XPATHS and Default Namespaces

How to parse XML in VBA

Small modification to an XML document using StAX

how to use xpath in python

Best binary XML format for JavaME

How can I split an XML document into thirds (or, even better, n pieces)?

Test serialization encoding

Is it "bad practice" to be sensitive to linebreaks in XML documents?

HTML comments break down

Authoritative source on XML-sig

Best way to get InnerXml of an XElement?

HTML version choice

SQL 2005 For XML Explicit - Need help formatting

Any experiences with Protocol Buffers?

XML Editing/Viewing Software

XML Processing in Python

Converting CSV File to XML in Java