ansaurus

Question

Answer 1

+21 A:

The lxml package supports xpath. It seems to work pretty well, although I've had some trouble with the self:: axis. There's also Amara, but I haven't used it personally.

James Sulak 2008-08-12 11:40:13

lxml definitely makes easy xml work easy with python.

Jweede 2009-07-17 17:07:09

amara's pretty nice, and one doesn't always need xpath.

gatoatigrado 2009-12-05 05:35:47

Answer 2

+2 A:

PyXML works well.

You didn't say what platform you're using, however if you're on Ubuntu you can get it with sudo apt-get install python-xml. I'm sure other Linux distros have it as well.

If you're on a Mac, xpath is already installed but not immediately accessible. You can set PY_USE_XMLPLUS in your environment or do it the Python way before you import xml.xpath:

if sys.platform.startswith('darwin'):
    os.environ['PY_USE_XMLPLUS'] = '1'

In the worst case you may have to build it yourself. This package is no longer maintained but still builds fine and works with modern 2.x Pythons. Basic docs are here.

David Joyner 2008-08-12 19:34:44

Answer 3

+3 A:

The latest version of elementtree supports XPath pretty well. Not being an XPath expert I can't say for sure if the implementation is full but it has satisfied most of my needs when working in Python. I've also use lxml and PyXML and I find etree nice because it's a standard module.

NOTE: I've since found lxml and for me it's definitely the best XML lib out there for Python. It does XPath nicely as well (though again perhaps not a full implementation).

jkp 2008-08-14 09:48:59

ElementTree's XPath support is currently minimal at best. There are huge gaping holes in functionality, such as the lack of attribute selectors, no non-default axes, no child indexing, etc.Version 1.3 (in alpha) adds some of these features, but it's still an unashamedly partial implementation.

Alabaster Codify 2009-01-09 23:26:59

Answer 4

+28 A:

libxml2 has a number of advantages:

Compliance to the spec
Active development and a community participation
Speed. This is really a python wrapper around a C implementation.
Ubiquity. The libxml2 library is pervasive and thus well tested.

Downsides include:

Compliance to the spec. It's strict. Things like default namespace handling are easier in other libraries.
Use of native code. This can be a pain depending on your how your application is distributed / deployed. RPMs are available that ease some of this pain.
Manual resource handling. Note in the sample below the calls to freeDoc() and xpathFreeContext(). This is not very Pythonic.

If you are doing simple path selection, stick with ElementTree ( which is included in Python 2.5 ). If you need full spec compliance or raw speed and can cope with the distribution of native code, go with libxml2.

Sample of libxml2 XPath Use


import libxml2

doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//*")
if len(res) != 2:
    print "xpath query: wrong node set size"
    sys.exit(1)
if res[0].name != "doc" or res[1].name != "foo":
    print "xpath query: wrong node set value"
    sys.exit(1)
doc.freeDoc()
ctxt.xpathFreeContext()

Sample of ElementTree XPath Use



from elementtree.ElementTree import ElementTree
doc = ElementTree(file='tst.xml')
for e in mydata.findall('/foo/bar'):
    print e.get('title').text

Ryan Cox 2008-08-26 13:06:39

I've since found lxml and I think it beats all other Python XML libs out there hands down.

jkp 2009-01-12 22:52:48

+1 for jkp, lxml is far better

flybywire 2010-02-01 13:01:07

Answer 5

+6 A:

Use LXML. LXML uses the full power of libxml2 and libxslt, but wraps them in more "Pythonic" bindings than the Python bindings that are native to those libraries. As such, it gets the full XPath 1.0 implementation. Native ElemenTree supports a limited subset of XPath, although it may be good enough for your needs.

2009-11-13 23:11:17

Answer 6

+3 A:

Another option is py-dom-xpath http://code.google.com/p/py-dom-xpath/, it works seamlessly with minidom and is pure Python so works on appengine.

import xpath
xpath.find('//item', doc)

Sam 2010-01-23 09:30:19

Answer 7

A:

Another library is 4Suite: http://sourceforge.net/projects/foursuite/

I do not know how spec-compliant it is. But it has worked very well for my use. It looks abandoned.

codeape 2010-08-23 12:57:47

Answer 8

A:

You can use:

PyXML:

from xml.dom.ext.reader import Sax2
from xml import xpath
doc = Sax2.FromXmlFile('foo.xml').documentElement
for url in xpath.Evaluate('//@Url', doc):
  print url.value

libxml2:

import libxml2
doc = libxml2.parseFile('foo.xml')
for url in doc.xpathEval('//@Url'):
  print url.content

shk 2010-08-23 13:00:01

ansaurus

tags:

views:

answers:

how to use xpath in python

related questions