Is there a way to set Java's XPath to have a default namespace prefix for expressions? For example, instead of "/html:html/html:head/html:title/text()", the query could be "/html/head/title/text()".

While using the namespace prefix works, there has to be a more elegant way.

Sample code snippet of what I'm doing now:

Node node = ... // DOM of a HTML document
XPath xpath = XPathFactory.newInstance().newXPath();

// set to a NamespaceContext that simply maps the prefix "html"
// to the namespace URI "http://www.w3.org/1999/xhtml"
xpath.setNamespaceContext(new HTMLNameSpace());

String expression = "/html:html/html:head/html:title/text()";
String value = xpath.evaluate(expression, node);
+1  A: 

I haven't actually tried this, but according to the NamespaceContext documentation, the prefix "" (empty string) is bound to the default namespace.


I was a little bit too quick on that one. The XPath evaluator does not invoke the NamespaceContext to resolve the "" prefix, if no prefix is used at all in the XPath expression "/html/head/title/text()". I'm now going into XML details, which I am not 100% sure about, but using an expression like "/:html/:head/:title/text()" works with Sun JDK 1.6.0_16 and the NamespaceContext is asked to resolve an empty prefix (""). Is this really correct and expected behaviour or a bug in Xalan?

jarnbjo
Per the XPath 1.0 spec (http://www.w3.org/TR/1999/REC-xpath-19991116#node-tests), a node test can use a "QName", which is defined by the Namespace spec (http://www.w3.org/TR/REC-xml-names/#NT-QName). The prefix of a QName is an NCName, which must start with a letter or underscore (http://www.w3.org/TR/REC-xml-names/#NT-NCName). All of which is to say that the JDK evaluator is broken -- although as a practical matter, unlikely to get fixed.
kdgregory
+2  A: 

Unfortunately, no. There was some talk about defining a default namespace for JxPath a few years ago, but a quick look at the latest docs doesn't indicate that anything happened. You might want to spend some more time looking through the docs, though.

One thing that you could do, if you really don't care about namespaces, is to parse the document without them. Simply omit the call that you're currently making to DocumentBuilderFactory.setNamespaceAware().
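A minimal sketch of that approach, assuming the document is available as a string of well-formed XHTML. The class name NoNamespaceXPath and the helper titleOf are made up for illustration. Namespace awareness is off by default in DocumentBuilderFactory, so the explicit setNamespaceAware(false) call is only there to make the intent visible:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the original question.
class NoNamespaceXPath {

    // Parses the given XHTML without namespace awareness and returns the title text.
    static String titleOf(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Off by default; stated explicitly so the xmlns declaration is
        // treated as an ordinary attribute and elements have no namespace.
        factory.setNamespaceAware(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // No NamespaceContext is needed because the expression uses no prefixes.
        return xpath.evaluate("/html/head/title/text()", doc);
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Hello</title></head><body/></html>";
        System.out.println(titleOf(xhtml));
    }
}
```

This only makes sense when the document uses a single namespace; with several namespaces, elements that share a local name can no longer be told apart.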

Also, note that your prefix can be anything you want; it doesn't have to match the prefix in the instance document. So you could use h rather than html, and minimize the visual clutter of the prefix.
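As a rough illustration of the short-prefix idea, here is a sketch of a NamespaceContext that binds h to the XHTML namespace. The names ShortPrefixXPath, XhtmlContext and titleOf are hypothetical, and the sketch assumes a namespace-aware parse:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of the original question.
class ShortPrefixXPath {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Maps the single prefix "h" to the XHTML namespace URI.
    static final class XhtmlContext implements NamespaceContext {
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return "h".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
        }

        public String getPrefix(String namespaceURI) {
            return XHTML_NS.equals(namespaceURI) ? "h" : null;
        }

        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            if (prefix == null) {
                return Collections.<String>emptyIterator();
            }
            return Collections.singleton(prefix).iterator();
        }
    }

    // Parses with namespace awareness on and queries using the "h" prefix.
    static String titleOf(String xhtml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new XhtmlContext());
        return xpath.evaluate("/h:html/h:head/h:title/text()", doc);
    }

    public static void main(String[] args) throws Exception {
        // The document declares a default namespace; the query still uses "h".
        String xhtml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Hello</title></head><body/></html>";
        System.out.println(titleOf(xhtml));
    }
}
```

The prefix in the expression is resolved only through the NamespaceContext, so the same query should match whether the document uses a default namespace or some other prefix for the XHTML namespace URI.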

kdgregory
Thanks for the suggestions. I ended up turning off namespace awareness since it was not necessary for this simple case (i.e. only ever working with one namespace).
Rob