Admittedly, I'm a Nokogiri newbie and I must be missing something...

I'm simply trying to print the author > name node out of this XML:

<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns:gd="http://schemas.google.com/g/2005" xmlns:docs="http://schemas.google.com/docs/2007" xmlns="http://www.w3.org/2005/Atom" gd:etag="">
  <category term="http://schemas.google.com/docs/2007#document" scheme="http://schemas.google.com/g/2005#kind"/>
  <author>
    <name>Matt</name>
    <email>Darby</email>
  </author>
  <title>Title</title>
</entry>

I'm trying to use this, but it prints nothing. Seemingly no expression (even '*') returns anything.

  Nokogiri::XML(@xml_string).xpath("//author/name").each do |node|
    puts node
  end
+1  A: 

For some reason, using remove_namespaces! makes the above bit work as expected.

    xml = Nokogiri::XML(@xml_string)
    xml.remove_namespaces!
    xml.xpath("//author/name").each do |node|
      puts node.text
    end

    => "Matt"
Matt Darby
@Matt Darby: The reason is that all your elements are in the `http://www.w3.org/2005/Atom` namespace, which is the default namespace declared on `entry`. You must declare a binding between this URI and some prefix, say `atom`, and then the XPath expression should be `/*/atom:author/atom:name`
Alejandro
+3  A: 

Alejandro already answered this in his comment (+1) but I'm adding this answer too because he left out the Nokogiri code.

Selecting elements in some namespace using Nokogiri with XPath

The elements you are trying to select are in the default namespace, which in this case is http://www.w3.org/2005/Atom. Note the xmlns="http://www.w3.org/2005/Atom" attribute on the entry element. Your XPath expression instead matches only elements that are not in any namespace. This is why your code worked once you removed the namespaces.

You need to define a namespace context for your XPath expression and point your XPath steps at elements in that namespace. AFAIK there are a few different ways to accomplish this with Nokogiri; one of them is shown below:

xml.xpath("//a:author/a:name", {"a" => "http://www.w3.org/2005/Atom"})

Note that here we define a namespace-to-prefix mapping and use this prefix (a) in the XPath expression.

jasso