ansaurus

Question

Parsing namespaced XML using libxml-ruby

Answer 1

+2 A:

Either of these will work:

/gesmes:Envelope/Cube/Cube - direct path from root
//Cube[@time] - all cube nodes (at any level) with a time attribute

OK, this is tested and working:

arrNS = ["xmlns:http://www.ecb.int/vocabulary/2002-08-01/eurofxref", "gesmes:http://www.gesmes.org/xml/2002-08-01"]
doc.find("//xmlns:Cube[@time]", arrNS)

Zack 2009-11-03 17:29:12

Neither of these actually works, they return no nodes. I tried the first one myself initially to no avail. Interestingly, if I remove all the namespacing and use a root tag of 'test' then '/test/Cube/Cube' does indeed work as expected. Any ideas?

Olly 2009-11-03 20:59:09

See edit above for working code. Took a fair amount of trial and error to get

Zack 2009-11-03 22:14:55

Aha! Thanks for this. I actually figured out a solution which I've just posted, but your solution saves me a line of code :)

Olly 2009-11-03 22:16:53

Answer 2

A:

So I figured this out. The root node defines two namespaces, one with a prefix, one without:

xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01"
xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref"

When a prefix is defined, you can reference the prefixed names directly. Using the XML from the original question, this XPath:

/gesmes:Envelope/gesmes:subject

will return the gesmes:subject node, whose content is "Reference rates".

Because the Cube nodes are not prefixed, they belong to the default namespace, so we first need to register a prefix for that namespace. This is how I achieved this:

doc = XML::Document.file('eurofxref-hist-test.xml')
context = XML::XPath::Context.new(doc)
context.register_namespace('euro', 'http://www.ecb.int/vocabulary/2002-08-01/eurofxref')

Once this is defined, finding the Cube nodes with time attributes is trivial:

context.find("//euro:Cube[@time]").each {|node| .... }

Olly 2009-11-03 22:15:01
