I am trying to validate an XML file against a schema using the XML::LibXML::Schema CPAN module. In the same script I am using the XML::DOM CPAN module to parse the XML. I want my script to take an XML file, validate it against an XSD, and then parse it.

When I run the script, it exits after the XSD validation step and does not parse the XML. I want it to parse the XML file if it is valid and generate the DOM structure. I would really appreciate it if someone could share some insights.

#usr/bin/perl -w
use XML::LibXML;

my $schema = XML::LibXML::Schema->new(location =>'export.xsd');
my $parser = XML::LibXML->new;

my $xml    = 'Export.xml';
my $doc    = $parser->parse_file($xml);

eval { $schema->validate( $doc ) };
print $@ if $@;

print "$xml is valid\n";

use XML::DOM;
#use strict;

my $parser = new XML::DOM::Parser;
my $doc = $parser->parsefile ("Export.xml");

my $productOfferingnodes = $doc->getElementsByTagName("productOfferings")->item(0);
my @productOffering = $productOfferingnodes->getChildNodes();
    {
     foreach  my $productOffering(@productOffering) 
     {
        if ($productOffering->getNodeType == ELEMENT_NODE)
      {
       print $productOffering->getNodeName; 
                     }
             }
     }

Error Message:

Schemas parser error : Failed to parse the XML resource 'export.xsd'.

A: 

Okay, here are some insights:

  • You're printing "<xml> is valid" whether or not it actually was validated correctly.
  • You have a race condition when you check the result of $@. The proper way to do this is:

eval { blah; 1 } or do { die "Encountered error: $@"; }
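
To make the idiom above concrete, here is a minimal sketch in core Perl. The helper name try_step is made up for illustration and is not part of any module. The trailing 1 inside the eval block means the eval returns true only when nothing died, so success is decided by the return value rather than by inspecting $@, which on older Perls could be reset (for example by a destructor) before you looked at it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): runs a code ref and returns
# (1, undef) on success or (0, $error_text) if the code died.
sub try_step {
    my ($code) = @_;

    # The trailing 1 is reached only if $code did not throw.
    my $ok = eval { $code->(); 1 };
    return (1, undef) if $ok;

    # Fall back to a generic message in case $@ was emptied.
    my $error = length $@ ? $@ : "unknown error\n";
    return (0, $error);
}

my ($ok, $err) = try_step(sub { die "schema rejected document\n" });
print $ok ? "step succeeded\n" : "step failed: $err";
```

With this shape, printing "is valid" belongs in the success branch only, which also addresses the first bullet.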

What error message are you receiving after performing the validation? Is it perhaps Can't locate XML/DOM/Parser.pm in @INC...? You're not doing use XML::DOM::Parser; before calling its constructor.

I want it parse the xml file if it is valid

But you haven't proven that it is: You're getting the error "Schemas parser error : Failed to parse the XML resource 'export.xsd'." Perhaps that indicates that export.xsd is not valid? You shouldn't be trying to call parsefile if the validation failed.

Ether
Schemas parser error : Failed to parse the XML resource 'export.xsd'.
Rachel
+1  A: 

It looks like export.xsd is not a valid XML file. Did you try validating it with another tool? Verify that you have a valid XML file, then worry about the rest of the problem. First things first!
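
If no other tool is at hand, a rough well-formedness check can be done with XML::LibXML itself. This is only a sketch: the helper name xml_wellformed_error is made up for illustration, and it tests whether the file exists and parses as XML, not whether it is a correct schema. Pass the file name on the command line, e.g. export.xsd.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns undef if $path exists and is well-formed XML,
# otherwise returns a description of what went wrong.
sub xml_wellformed_error {
    my ($path) = @_;
    return "file not found: $path" unless -e $path;

    # load_xml dies on malformed input; the trailing 1 marks success.
    my $ok = eval { XML::LibXML->load_xml(location => $path); 1 };
    return if $ok;
    return "$@" || 'unknown parse error';
}

# Check every file named on the command line.
for my $path (@ARGV) {
    my $error = xml_wellformed_error($path);
    print defined $error ? "$path: NOT OK: $error\n" : "$path: well-formed\n";
}
```
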

brian d foy
+2  A: 

Your code is messy. Your script begins with a line which is intended to be the shebang line but is not. You re-define two variables in this short script. You check if validation failed and merrily go on your way even if it did. These are likely not the cause of your problems, but they do make diagnosing the problem harder.

I tried to refactor your code. The code below passes perl -c. Further, I tried it using sample XML and XSD files. As explained on that page, with a missing element, validation failed. When the missing information was added, validation succeeded and expected output was produced.

#!/usr/bin/perl

use strict; use warnings;

use XML::LibXML;
use XML::DOM;

my $xml = 'Export.xml';
my $xsd = 'export.xsd';

if ( my $error = validate_xml_against_xsd($xml, $xsd) ) {
    die "Validation failed: $error\n";
}

my @offerings = get_product_offerings( $xml );
print "$_\n" for @offerings;

sub get_product_offerings {
    my ($xml) = @_;

    my $parser = XML::DOM::Parser->new;
    my $doc = $parser->parsefile($xml);

    my $nodes = $doc->getElementsByTagName("book")->item(0);

    return map {
        $_->getNodeType == ELEMENT_NODE
                         ? $_->getNodeName
                         : ()
    } $nodes->getChildNodes;
}

sub validate_xml_against_xsd {
    my ($xml, $xsd) = @_;

    my $schema = XML::LibXML::Schema->new(location => $xsd);
    my $parser = XML::LibXML->new;

    my $doc = $parser->parse_file($xml);
    eval { $schema->validate( $doc ) };

    if ( my $ex = $@ ) {
        return $ex;
    }
    return;
}

Output:

author
title
genre
price
pub_date
review

By the way, the error message when validation failed was informative: Validation failed: Element 'review': This element is not expected. Expected is (pub_date ).
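
One more observation, offered as a sketch rather than a definitive fix: XML::LibXML::Schema->new itself throws when the schema cannot be loaded, and in the question's script that call sits outside any eval, which would explain why the script exits before it ever reaches the XML::DOM code. The variant below wraps each stage separately so the caller can tell a broken schema from an invalid document. The helper name check_against_schema and the stage labels are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns an empty list on success, or
# (stage, message) where stage is 'schema', 'parse' or 'validate'.
sub check_against_schema {
    my ($xml, $xsd) = @_;
    my ($schema, $doc);

    # Stage 1: the XSD itself must load.
    eval { $schema = XML::LibXML::Schema->new(location => $xsd); 1 }
        or return ('schema', "$@");

    # Stage 2: the document must be well-formed.
    eval { $doc = XML::LibXML->load_xml(location => $xml); 1 }
        or return ('parse', "$@");

    # Stage 3: validate dies when the document violates the schema.
    eval { $schema->validate($doc); 1 }
        or return ('validate', "$@");

    return;
}

# Usage: perl script.pl Export.xml export.xsd
if (@ARGV == 2) {
    my ($stage, $message) = check_against_schema(@ARGV);
    die "Failed at stage '$stage': $message\n" if defined $stage;
    print "$ARGV[0] is valid\n";
}
```
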

Sinan Ünür
You should post your paypal account.
Ether
@Ether No, I should learn how to earn reps more efficiently: http://stackoverflow.com/questions/1604124/what-does-the-in-a-za-z0-9-mean/1604128#1604128
Sinan Ünür
Yeah, SO should have a tip jar :) I think they talked about how much that sucked for other sites in one of the first podcasts.
brian d foy