I have an example Perl script in which I am trying to load a file, validate it against a schema, and then interrogate various nodes.

#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

my $filename = 'source.xml';
my $xml_schema = XML::LibXML::Schema->new(location=>'library.xsd');
my $parser = XML::LibXML->new ();
my $doc = $parser->parse_file ($filename);

eval {
    $xml_schema->validate ($doc);
};

if ($@) {
    print "File failed validation: $@";
}

eval {
    print "Here\n";
    foreach my $book ($doc->findnodes('/library/book')) {
        my $title = $book->findnodes('./title');
        print $title->to_literal(), "\n";
    }
};

if ($@) {
    print "Problem parsing data : $@\n";
}

Unfortunately, although the XML file validates fine, the script does not find any $book items and therefore prints nothing.

If I remove the schema from the XML file and the validation from the Perl script, then it works fine.

I am using the default namespace. If I change the XML file to use a prefixed namespace instead (xmlns:lib="http://libs.domain.com"), prefix every element in the XML file with lib, and change the XPath expressions to include the namespace prefix (/lib:library/lib:book), then it again works fine.

Why, and what am I missing?

XML:

<?xml version="1.0" encoding="utf-8"?>
<library xmlns="http://lib.domain.com" 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://lib.domain.com .\library.xsd">
    <book>
     <title>Perl Best Practices</title>
     <author>Damian Conway</author>
     <isbn>0596001738</isbn>
     <pages>542</pages>
     <image src="http://www.oreilly.com/catalog/covers/perlbp.s.gif" width="145" height="190"/>
    </book>
    <book>
     <title>Perl Cookbook, Second Edition</title>
     <author>Tom Christiansen</author>
     <author>Nathan Torkington</author>
     <isbn>0596003137</isbn>
     <pages>964</pages>
     <image src="http://www.oreilly.com/catalog/covers/perlckbk2.s.gif" width="145" height="190"/>
    </book>
    <book>
     <title>Guitar for Dummies</title>
     <author>Mark Phillips</author>
     <author>John Chappell</author>
     <isbn>076455106X</isbn>
     <pages>392</pages>
     <image src="http://media.wiley.com/product_data/coverImage/6X/07645510/076455106X.jpg" width="100" height="125"/>
    </book>
</library>

XSD:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns="http://lib.domain.com" xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" targetNamespace="http://lib.domain.com">
    <xs:attributeGroup name="imagegroup">
     <xs:attribute name="src" type="xs:string"/>
     <xs:attribute name="width" type="xs:integer"/>
     <xs:attribute name="height" type="xs:integer"/>
    </xs:attributeGroup>
    <xs:element name="library">
     <xs:complexType>
      <xs:sequence>
       <xs:element maxOccurs="unbounded" name="book">
        <xs:complexType>
         <xs:sequence>
          <xs:element name="title" type="xs:string"/>
          <xs:element maxOccurs="unbounded" name="author" type="xs:string"/>
          <xs:element name="isbn" type="xs:string"/>
          <xs:element name="pages" type="xs:integer"/>
          <xs:element name="image">
           <xs:complexType>
            <xs:attributeGroup ref="imagegroup"/>
           </xs:complexType>
          </xs:element>
         </xs:sequence>
        </xs:complexType>
       </xs:element>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
</xs:schema>
A:

From the XML::LibXML docs:

"A common mistake about XPath is to assume that node tests consisting of an element name with no prefix match elements in the default namespace. This assumption is wrong - by XPath specification, such node tests can only match elements that are in no (i.e. null) namespace. [...] The recommended way is to use the XML::LibXML::XPathContext module"

So, from the perspective of XPath, there is no "default" namespace: an element in any non-null namespace is matched only by a name test that carries a prefix bound to that namespace. That is why /library/book matches nothing once xmlns="http://lib.domain.com" is declared, even though validation succeeds. The XML::LibXML::XPathContext module lets you register a prefix of your choosing for any namespace URI and use it in your XPath expressions; the prefix does not have to appear in the XML file itself.
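
As a minimal sketch of that approach (not a drop-in replacement for the script in the question), the following registers a prefix for the document's namespace and queries through the context object. The XML is inlined and trimmed so the example runs without source.xml, schema validation is left out, and the prefix name lib is an arbitrary choice for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Trimmed, inlined copy of the question's data (assumption: same default namespace).
my $xml = <<'XML';
<library xmlns="http://lib.domain.com">
    <book><title>Perl Best Practices</title></book>
    <book><title>Perl Cookbook, Second Edition</title></book>
</library>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Unprefixed name tests only match elements in no namespace, so this finds nothing.
my @unprefixed = $doc->findnodes('/library/book');

# Bind a prefix of our choosing to the namespace URI used by the document.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(lib => 'http://lib.domain.com');

my @titles;
foreach my $book ($xpc->findnodes('/lib:library/lib:book')) {
    # Relative expressions take the context node as the second argument.
    push @titles, $xpc->findvalue('./lib:title', $book);
}

print scalar(@unprefixed), " match(es) without a prefix\n";
print "$_\n" for @titles;
```

The XML file keeps its default namespace unchanged; only the XPath expressions gain a prefix, and every step in the path needs it, including child elements such as title.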

runrig