Which Ruby library can be used to select an attribute using XPath and then use that attribute as the starting point for other XPath queries?

Example:

<root>
  <add key="A" value="B" />
  <add key="C" value="D" />
  <add foo="E" bar="F" />
</root>

Desired code:

get_pair "//*/@key", "../@value"
get_pair "//*/@foo", "../@bar"

Expected output:

"A" "B"
"C" "D"
"E" "F"

Pseudo implementation:

def get_pair(key, value)
  xml_doc.select[key].each do |a|
    puts [a, a.select[value]]
  end
end
+2  A: 

Your starting point would be REXML.

The "challenge" here is how to treat an attribute node as a child node, and this can be done by using singleton methods, then everything else follows naturally:

require "rexml/document"
include REXML  # so that we don't have to prefix everything with REXML::...

def get_pair(xml_doc, key, value)
  XPath.each(xml_doc, key) do |node| 
    if node.is_a?(Attribute)
      def node.parent
        self.element
      end
    end
    puts "\"#{node}\" \"#{XPath.first(node, value)}\""
  end
end

xml_doc = Document.new <<EOF
  <root>
    <add key="A" value="B" />
    <add key="C" value="D" />
    <add foo="E" bar="F" />
  </root>
EOF

get_pair xml_doc, "//*/@key", "../@value"
get_pair xml_doc, "//*/@foo", "../@bar"

produces:

"A" "B"
"C" "D"
"E" "F"

Cheers, V.

vladr
+1  A: 

I would also suggest looking at Hpricot ... it is a very expressive HTML and XML parsing library, inspired by jQuery.

Toby Hede
+1  A: 

And if you will be parsing a decent amount of data in any area where performance matters, then you will need libxml-ruby. REXML and Hpricot are good, but I recently switched to libxml-ruby on my own server for some parsing work because it was about 1200% faster.
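Speed-ups like this depend heavily on the documents involved, so it may be worth timing your own data before switching. Below is a minimal sketch using only Ruby's standard Benchmark module and REXML. The helper names `sample_xml` and `average_parse_seconds` are hypothetical, and the generated document is only a stand-in for real input; the same harness could wrap another parser's parse call for comparison.

```ruby
require "benchmark"
require "rexml/document"

# Hypothetical helper: builds a synthetic document with n <add> elements.
def sample_xml(n)
  body = Array.new(n) { |i| %(<add key="k#{i}" value="v#{i}" />) }.join
  "<root>#{body}</root>"
end

# Hypothetical helper: average wall-clock seconds to parse xml with REXML.
def average_parse_seconds(xml, runs: 3)
  total = Benchmark.realtime do
    runs.times { REXML::Document.new(xml) }
  end
  total / runs
end

seconds = average_parse_seconds(sample_xml(1_000))
puts format("REXML parse: %.4f s per run", seconds)
```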

Squeegy
A: 

REXML, which comes with Ruby, will do what you want:

require 'rexml/document'
include REXML
xml = Document.new('<root><add key="A" value="B" /><add key="C" value="D" /><add foo="E" bar="F" /></root>')
xml.root.each_element_with_attribute('key'){|e| puts "#{e.attribute('key')} #{e.attribute('value')}"}
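The one-liner above hard-codes the `key` and `value` attribute names. As a rough sketch of how it might be generalized, the hypothetical helper `attribute_pairs` below (not part of REXML) selects every element that carries the first attribute and reads both attributes from that element, which sidesteps navigating from an attribute node back to its parent.

```ruby
require "rexml/document"

# Hypothetical helper: returns [key, value] string pairs for every element
# that has the attribute named key_attr. The value is nil if value_attr
# is missing on that element.
def attribute_pairs(xml_doc, key_attr, value_attr)
  REXML::XPath.match(xml_doc, "//*[@#{key_attr}]").map do |element|
    [element.attributes[key_attr], element.attributes[value_attr]]
  end
end

xml_doc = REXML::Document.new(<<~XML)
  <root>
    <add key="A" value="B" />
    <add key="C" value="D" />
    <add foo="E" bar="F" />
  </root>
XML

attribute_pairs(xml_doc, "key", "value").each { |k, v| puts %("#{k}" "#{v}") }
attribute_pairs(xml_doc, "foo", "bar").each { |k, v| puts %("#{k}" "#{v}") }
```

The trade-off is that the caller passes attribute names rather than arbitrary XPath expressions, so this is less general than the `get_pair` interface in the question.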
mletterle
A: 

Apparently Nokogiri is the fastest Ruby XML parser.

See http://www.rubyinside.com/nokogiri-ruby-html-parser-and-xml-parser-1288.html

Was using it today and it's great.

For your example:

require 'nokogiri'

doc = Nokogiri::XML(your_xml)
# each, not map: the block is run only for its printing side effect
doc.xpath("/root/add").each do |add|
  puts [add['key'], add['value']]
end

Edit: Unsurprisingly, it turns out that the claim that Nokogiri is faster is not uncontroversial.

However, we have found it more stable than libxml in our production environment (libxml was occasionally crashing; just swapping in Nokogiri has solved the issue).

DanSingerman
It's described as "slightly slower than libxml-ruby" in the http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/ comments section.
Andrew Grimm