I believe this question might have been previously attempted in 2006 on a different site. My current XML/RDF writer (XML::LibXML 1.70) declares each element's namespace with an xmlns attribute on the element itself instead of using a prefix. This shuts out people using non-namespace-aware parsers who just do a look_down for foaf:Person. Does anyone know of an easy way in Perl to get prefixed element names, preferably with XML::LibXML, or otherwise through a different means?

Nodes like this:

  <Person xmlns="http://xmlns.com/foaf/0.1/" rdf:ID="me"/>

And, this:

  <name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll</name>

Should really look like:

  <foaf:Person rdf:ID="me"/>
  <foaf:name>Evan Carroll</foaf:name>

Any ideas? I believe it is technically correct either way, but I'd much rather not depend on other people knowing this. I didn't know it myself yesterday.

+2  A: 

The short answer is that if you already have a namespaceURI and prefix declared you can specify the qualified name (i.e. prefix:localName) as an element name and this will make XML::LibXML avoid redeclaring the namespace. So, modifying the code from the last question gives the following, which does use the desired namespace prefixes:

#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
my $doc = XML::LibXML::Document->new( '1.0', 'UTF-8' );
# Root element in the RDF namespace.
my $foaf = $doc->createElementNS( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'RDF' );
$doc->setDocumentElement( $foaf );
# Declare every prefix on the root; the third argument says whether the
# namespace is also activated for the root element itself (1) or only declared (0).
$foaf->setNamespace( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#' , 'rdf', 1 );
$foaf->setNamespace( 'http://www.w3.org/2000/01/rdf-schema#' , 'rdfs', 0 );
$foaf->setNamespace( 'http://xmlns.com/foaf/0.1/' , 'foaf', 0 );
$foaf->setNamespace( 'http://webns.net/mvcb/' , 'admin', 0 );
# Passing the qualified name (prefix:localName) reuses the declaration above.
my $node = $doc->createElementNS( 'http://xmlns.com/foaf/0.1/', 'foaf:Person');
$foaf->appendChild($node);
$node->setAttributeNS( 'http://www.w3.org/1999/02/22-rdf-syntax-ns#', 'ID', 'me');
my $node2 = $doc->createElementNS( 'http://xmlns.com/foaf/0.1/', 'foaf:name');
$node2->appendTextNode('Evan Carroll');
$node->appendChild($node2);
print $doc->toString;

It is worth reviewing what's going on, though. XML Namespaces exist to allow multiple vocabularies to be used together in the same XML document. To achieve this, the concept of a namespace URI (nsURI) is introduced, and a mechanism for indicating which nsURI applies to which elements and attributes in an XML document is retrofitted onto XML. This relies on the fact that names starting with 'xml' are reserved, which allows a special attribute name (xmlns) to be used without the risk of collision.

The general idea is that it is possible to link each vocabulary used in an XML document with a unique nsURI (which is treated as an opaque string). The head element in the XHTML vocabulary is fully defined by {'http://www.w3.org/1999/xhtml':'head'}, and this is clearly different from the head in a (hypothetical) anatomy-ML {'my-made-up-URI':'head'}. The issue is how to embed the nsURI(s) in an XML document and how to link these to the element names.
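As a rough illustration of that pairing, the sketch below parses a small document containing two 'head' elements and prints each as {nsURI}localName. The wrapper element 'doc', the prefixes 'h' and 'a', and the URI 'urn:example:anatomy' (standing in for the made-up anatomy vocabulary) are all assumptions chosen for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# 'urn:example:anatomy' is a made-up namespace URI, for illustration only.
my $xml = <<'XML';
<doc xmlns:h="http://www.w3.org/1999/xhtml" xmlns:a="urn:example:anatomy">
  <h:head/>
  <a:head/>
</doc>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Every child element of <doc>, whatever namespace it is in.
my @heads = $doc->findnodes('/doc/*');

# Same local name, different nsURI: two distinct element types.
my @expanded = map { sprintf '{%s}%s', $_->namespaceURI, $_->localname } @heads;
print "$_\n" for @expanded;
```

Both elements have the local name 'head', but a namespace-aware parser reports them with different nsURIs, so they never compare as the same element type.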

One way to make the link between a nsURI and an element name is to add the xmlns attribute to the element. For example:

<name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll</name>

says that 'name' is in the 'http://xmlns.com/foaf/0.1/' namespace. Namespace declarations are inherited by children, so 'age' is in the same namespace:

<name xmlns="http://xmlns.com/foaf/0.1/">Evan Carroll<age years='21'/></name>
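The inheritance can be checked with a short script. This is only a sketch that parses the snippet above with XML::LibXML and asks each node for its namespace; the variable names are mine. It also shows that the unprefixed 'years' attribute does not pick up the default namespace.

```perl
use strict;
use warnings;
use XML::LibXML;

my $ns  = 'http://xmlns.com/foaf/0.1/';
my $xml = qq{<name xmlns="$ns">Evan Carroll<age years='21'/></name>};

my $doc  = XML::LibXML->load_xml( string => $xml );
my $name = $doc->documentElement;

# Look the child up by namespace URI and local name.
my ($age) = $name->getChildrenByTagNameNS( $ns, 'age' );

my $name_ns  = $name->namespaceURI;
my $age_ns   = $age->namespaceURI;                            # inherited from <name>
my $years_ns = $age->getAttributeNode('years')->namespaceURI; # undef: no namespace

print "name:  $name_ns\n";
print "age:   $age_ns\n";
printf "years: %s\n", defined $years_ns ? $years_ns : '(no namespace)';
```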

This can work well and be quite compact. However, it doesn't work for attributes (an unprefixed attribute never picks up the default namespace) and it can get messy if lots of sibling nodes need a different namespace from their common parent. To deal with both of these problems the namespace prefix (nsPrefix) is introduced, which gives the colon a special meaning. The idea is to link the nsURI to a short string that is used within the current document. The prefix has no special meaning outside the document and shouldn't be specified by the vocabulary (though it sometimes is; that's a discussion for elsewhere). It's particularly common for all nsURIs to be declared on the root element. The syntax is to declare the namespace thus:

xmlns:prefix="http://xmlns.com/foaf/0.1/"

and use it in attribute and element names by prepending the nsPrefix to the name:

<prefix:name prefix:attribute='value'/>
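To see how a parser takes a qualified name apart, here is a minimal sketch that combines the declaration and the element above into one document and inspects it with XML::LibXML. Putting the xmlns:prefix declaration on the element itself is my choice, made to keep the example short.

```perl
use strict;
use warnings;
use XML::LibXML;

my $ns  = 'http://xmlns.com/foaf/0.1/';
my $xml = qq{<prefix:name xmlns:prefix="$ns" prefix:attribute='value'/>};

my $doc = XML::LibXML->load_xml( string => $xml );
my $el  = $doc->documentElement;

my $qname  = $el->nodeName;                      # qualified name, as written
my $prefix = $el->prefix;                        # the part before the colon
my $local  = $el->localname;                     # the part after the colon
my $uri    = $el->lookupNamespaceURI('prefix');  # what the prefix is bound to

# Prefixed attributes are addressed by nsURI and local name as well.
my $attr = $el->getAttributeNS( $ns, 'attribute' );

print "$qname = {$uri}$local (prefix '$prefix'), attribute = $attr\n";
```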

Because the exact value of an nsPrefix is not supposed to matter, APIs generally don't make accessing or setting prefixes very easy (XPath is a good example). Namespaces also add constraints on the document, and violating them should be treated as an error; using a prefix that isn't declared is one example. Such a document can still be well-formed according to the XML specification (remember, namespaces are retrofitted). You can describe it as 'not namespace well-formed'.
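The XPath case is worth a quick sketch. With XML::LibXML you register your own prefix for the nsURI on an XPathContext, and it does not need to match the prefix used in the document. The sample document and the prefix 'f' below are assumptions made for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $xml = <<'XML';
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:ID="me">
    <foaf:name>Evan Carroll</foaf:name>
  </foaf:Person>
</rdf:RDF>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

# Bind a prefix of our own choosing to the FOAF nsURI for XPath purposes.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs( f => 'http://xmlns.com/foaf/0.1/' );

# Matches on nsURI + local name, even though the document says 'foaf:'.
my $found = $xpc->findvalue('//f:Person/f:name');

# An unprefixed XPath name only matches elements in no namespace.
my @unprefixed = $xpc->findnodes('//name');

print "found: $found\n";
print "unprefixed matches: ", scalar(@unprefixed), "\n";
```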

Parsing a document that uses namespaces with a parser that doesn't know anything about namespaces is obviously easier if you know the namespace prefixes in advance. But this is quite a brittle solution, as namespace prefixes can change in odd places when an XML document is repeatedly processed. Most parsers are namespace aware.
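For comparison, a namespace-aware lookup does not care which of the two serializations from the question it is given. The sketch below assumes minimal complete documents built around the question's fragments and finds 'name' by nsURI and local name in both.

```perl
use strict;
use warnings;
use XML::LibXML;

my $rdf  = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
my $foaf = 'http://xmlns.com/foaf/0.1/';

# The same data written with a prefix and with a default namespace.
my %serialization = (
    prefixed => qq{<rdf:RDF xmlns:rdf="$rdf" xmlns:foaf="$foaf">}
              . qq{<foaf:Person rdf:ID="me"><foaf:name>Evan Carroll</foaf:name></foaf:Person>}
              . qq{</rdf:RDF>},
    default  => qq{<rdf:RDF xmlns:rdf="$rdf">}
              . qq{<Person xmlns="$foaf" rdf:ID="me"><name>Evan Carroll</name></Person>}
              . qq{</rdf:RDF>},
);

my %names;
for my $style ( sort keys %serialization ) {
    my $doc = XML::LibXML->load_xml( string => $serialization{$style} );

    # Look up by nsURI and local name; the prefix plays no part.
    my ($name) = $doc->getElementsByTagNameNS( $foaf, 'name' );
    $names{$style} = $name->textContent;
    print "$style: $names{$style}\n";
}
```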

Andrew Walker
Thanks a ton. I really appreciate your answers.
Evan Carroll