I've found this on http://www.perlmonks.org/?node_id=606909

looking by qualified name ...
In this case you can call findnodes method on any node, you don't need the XML::LibXML::XPathContext with its prefix => namespace mapping: $doc->findnodes('//*/info/fooTransaction/transactionDetail/*[name() = "histFile:transactionSummary"]/*');

How do I have to change my XPath expression to get my script working without XPathContext?

#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use XML::LibXML;


my $parser = XML::LibXML->new;
$parser->recover_silently( 1 );

my $doc = $parser->parse_file( 'http://www.heise.de/' );

my $xc = XML::LibXML::XPathContext->new( $doc->getDocumentElement );
$xc->registerNs( 'xmlns', 'http://www.w3.org/1999/xhtml' );

my $nodes = $xc->findnodes( '//xmlns:h2/xmlns:a' );
for my $node ( $nodes->get_nodelist ) {
    say $_->getName, '=', $_->getValue for $node->attributes;
}
+1  A: 

Follow the same model as given in the article. If you want to test the textual name of the node, instead of considering what URI the node's namespace is mapped to, then call name and do a string comparison.

//*[name() = "xmlns:h2"]/*[name() = "xmlns:a"]

For that expression to match anything, though, there would need to be nodes in the document literally named xmlns:h2. You'd need to have a document like this:

<xmlns:h2>
  <xmlns:a>header</xmlns:a>
</xmlns:h2>

The page you've linked to doesn't look like that, though. It uses ordinary HTML node names like h2 and a, not xmlns:h2. Those simple names are indeed in the XHTML namespace (the one your script registers under the prefix xmlns), but only because that namespace is declared as the default namespace for the document. Since the nodes aren't named with a namespace prefix, don't include that prefix in your name strings:

//*[name() = "h2"]/*[name() = "a"]

A further change you could make, in case some nodes use the xmlns prefix while others don't, is to use local-name instead of name; local-name returns the node's name with any namespace prefix stripped.

//*[local-name() = "h2"]/*[local-name() = "a"]
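
To compare the three expressions side by side, here is a minimal sketch that runs them against a small inline document instead of the live page. The sample markup, the hash keys, and the variable names are made up for illustration; only the XPath expressions come from the text above. It assumes a version of XML::LibXML recent enough to provide load_xml.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use 5.012;
use XML::LibXML;

# Hypothetical stand-in for the real page: XHTML with a default namespace.
my $xhtml = <<'END_XML';
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h2><a href="/one.html" title="first">First headline</a></h2>
    <h2><a href="/two.html" title="second">Second headline</a></h2>
  </body>
</html>
END_XML

my $doc = XML::LibXML->load_xml( string => $xhtml );

# The candidate expressions, keyed by a short (made-up) label.
my %xpath = (
    prefixed_name => '//*[name() = "xmlns:h2"]/*[name() = "xmlns:a"]',
    plain_name    => '//*[name() = "h2"]/*[name() = "a"]',
    local_name    => '//*[local-name() = "h2"]/*[local-name() = "a"]',
);

# findnodes is called on the document itself; no XPathContext is involved.
my %count;
for my $label ( sort keys %xpath ) {
    $count{$label} = $doc->findnodes( $xpath{$label} )->size;
    say "$label: $count{$label} match(es)";
}

# Print the attributes of every matched link, as the original script does.
my @attributes;
for my $node ( $doc->findnodes( $xpath{local_name} ) ) {
    push @attributes, map { $_->nodeName . '=' . $_->value } $node->attributes;
}
say for @attributes;
```

Against this sample, the prefixed expression should find nothing, while the name() and local-name() versions should both find the two links. The real page may of course behave differently if its markup differs from this sketch.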
Rob Kennedy
I tried this, but I suppose this is not the way it works: `my $parser = XML::LibXML->new; my $doc = $parser->parse_file( 'http://www.heise.de/' ); my $nodes = $doc->findnodes( '//*[name() = "xmlns:h2"]' ); say $_->nodeName for $nodes->get_nodelist;`
sid_com
Reading a foreign language is not the same as reading one's first language: "Since the nodes aren't named with a namespace prefix, don't include that prefix in your name strings"
sid_com
The names of the nodes in your Web site are just `a` and `html`. The names in the XML you're parsing do not include the namespace prefix `xmlns`. The original XPath expression checks whether the `name()` function returns the string `xmlns:a`, but that is not the node's name, so the comparison fails. The name is just `a`; it includes no prefix for the XML namespace.
Rob Kennedy