This isn't really a full answer but...
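The important thing to note is that xml:lang is not an attribute whose name simply happens to contain a colon. It is the attribute 'lang' in the 'xml' namespace, which is not quite the same thing. The xml prefix is (in some ways) 'built-in': it is always bound to http://www.w3.org/XML/1998/namespace and never needs to be declared.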
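To make the namespace point concrete, here is a minimal sketch that asks XPath for the local name and namespace URI of the attribute. It assumes XML::LibXML is installed; the helper name describe_xml_lang and the inline sample document are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the local name and namespace URI of the
# xml:lang attribute on the root <html> element of the given XML string.
sub describe_xml_lang {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    return (
        $doc->findvalue('local-name(/html/@xml:lang)'),
        $doc->findvalue('namespace-uri(/html/@xml:lang)'),
    );
}

# Note that the sample never declares the xml prefix.
my ($local, $uri) = describe_xml_lang(q{<html xml:lang='en-uk'><body/></html>});
print "local name:    $local\n";
print "namespace URI: $uri\n";
```

With a namespace-aware parser the local name should come back as 'lang' and the namespace URI as the fixed XML namespace, even though the document contains no xmlns:xml declaration.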
Secondly, I think you probably mean:
'/html[boolean(string(normalize-space(@xml:lang))) = true()]'
since true and false are booleans in XPath, not strings.
Now, I've run the following script in Perl, using XML::LibXML, and it works just fine:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new;
my $xml = $parser->parse_file('test.html');
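# Select the root html element only if its xml:lang attribute is non-blank.
my ($node) = $xml->findnodes('/html[boolean(string(normalize-space(@xml:lang))) = true()]');
# findnodes returns an empty list when nothing matches, so guard against undef.
die "no html element with a non-blank xml:lang\n" unless defined $node;
print $node->textContent, "\n";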
using this as my input:
<?xml version='1.0'?>
<html xml:lang='en-uk'>
<head><title>boo</title></head>
<body><p>boo</p></body>
</html>
That prints out the expected output ("boo\nboo\n").
I wonder if you are using a parser that isn't fully namespace aware. Also, what do you mean by 'works'? Are you trying to find out if an html element has an xml:lang attribute?
If you are, this would probably be a better expression:
'/html[@xml:lang]'
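The two expressions are not quite interchangeable, and a rough sketch like the one below shows where they differ. It assumes XML::LibXML again; the helper count_matches and the three sample documents are invented for the comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: number of nodes an XPath expression selects
# in the document parsed from the given XML string.
sub count_matches {
    my ($xml_string, $xpath) = @_;
    my $doc   = XML::LibXML->load_xml(string => $xml_string);
    my @nodes = $doc->findnodes($xpath);
    return scalar @nodes;
}

my $exists    = '/html[@xml:lang]';
my $non_blank = '/html[boolean(string(normalize-space(@xml:lang))) = true()]';

# Illustrative inputs: a real value, a blank value, and no attribute at all.
my @samples = (
    [ 'value present' => q{<html xml:lang='en-uk'/>} ],
    [ 'blank value'   => q{<html xml:lang='  '/>}    ],
    [ 'no attribute'  => q{<html/>}                  ],
);

for my $sample (@samples) {
    my ($label, $xml) = @$sample;
    printf "%-14s exists=%d non-blank=%d\n",
        $label,
        count_matches($xml, $exists),
        count_matches($xml, $non_blank);
}
```

The shorter form matches whenever the attribute is present, even if its value is empty or whitespace, while the longer form additionally requires a non-blank value. Which one you want depends on what 'works' is supposed to mean.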
Nic Gibson
2009-04-20 13:00:33