I am using the following code to run an XPath query against some XML that I read from a stream.

DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(false);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse(inputStream);
inputStream.close();

XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//FOO_ELEMENT");

Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}

I have checked the stream by converting it to a string, and the content is all there, so the stream is not empty.

This is just annoying me now. I have tried various bits of code and I still keep getting 'null' printed at the "System.out.println" line. What am I missing here?

NOTE: I want to see the text inside the element.

+3  A: 

Not an expert in the Java XPath impl tbh, but this might help.

The javadocs say that the result of getNodeValue() will be null for most types of node.

It's not totally clear what you expect to see in the output: element name, attributes, or text? I'll guess text. In any XPath impl I have used, if you want the text content of a node, your XPath has to select its text node:

//FOO_ELEMENT/text()

Then the node's value is the text content of the node.

The getTextContent() method will return the text content of the node you've selected with the XPath, plus the text of any descendant nodes, as per the javadoc. The XPath above instead selects exactly the text nodes that are direct children of any FOO_ELEMENT in the document.
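A minimal sketch of the text() approach, assuming the XML is available as a string (the class name TextNodeDemo, the helper textValues, and the sample document are made up for illustration). Because the XPath selects text nodes rather than elements, getNodeValue() should return their content instead of null:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of the original question.
class TextNodeDemo {
    // Collects the value of every text node that is a direct child of a FOO_ELEMENT.
    static List<String> textValues(String xml) throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(false);
        DocumentBuilder builder = domFactory.newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // text() selects text nodes, for which getNodeValue() is defined.
        NodeList nodes = (NodeList) xpath.evaluate(
                "//FOO_ELEMENT/text()", doc, XPathConstants.NODESET);

        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String xml = "<root><FOO_ELEMENT>hello</FOO_ELEMENT>"
                + "<FOO_ELEMENT>world</FOO_ELEMENT></root>";
        System.out.println(textValues(xml));
    }
}
```

If this still prints nothing for your document, it may be worth checking that the element name in the XPath matches the document exactly, including case.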

Java EE Docs for Node <-- old docs, see comments for current docs.

Brabster
This has to be it. Also check out http://java.sun.com/javase/6/docs/api/org/w3c/dom/Node.html for the JDK 6 JavaDoc.
Eddie
Yeah sorry shoulda checked the version d'oh!
Brabster
that's not really working.
Vidar
What does "not really working" mean? You are selecting a set of elements, and clearly getting that set because otherwise you'd have a NullPointerException. Per the docs for Node, getNodeValue() returns null when called on an Element.
kdgregory
Sorry I meant the "//FOO_ELEMENT/text()" - does not work for me.
Vidar
"The javadocs say that the result of getNodeValue() will be null for most types of node" - how strange - I wonder what this method is actually for, then??
Vidar
It's to do with the model. It works for text nodes, but the elements in your document have children which contain their text content.
Brabster
+2  A: 

In addition to what Brabster suggested, you may want to try

System.out.println(nodes.item(i).getTextContent());

or

System.out.println(nodes.item(i).getNodeName());

depending on what you're intending to display.

See http://java.sun.com/javase/6/docs/api/org/w3c/dom/Node.html
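To make the difference between these accessors concrete, here is a small sketch that calls all three on the same element. The class NodeAccessorsDemo, the helper describe, and the sample XML are assumptions made for illustration only:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class, not part of the original question.
class NodeAccessorsDemo {
    // Returns { getNodeName(), getNodeValue(), getTextContent() }
    // for the first FOO_ELEMENT in the document.
    static String[] describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // XPathConstants.NODE returns only the first matching node.
        Node element = (Node) xpath.evaluate("//FOO_ELEMENT", doc, XPathConstants.NODE);

        return new String[] {
            element.getNodeName(),    // the tag name
            element.getNodeValue(),   // null for Element nodes
            element.getTextContent()  // text of the element and its descendants
        };
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String[] parts = describe("<root><FOO_ELEMENT>hello</FOO_ELEMENT></root>");
        System.out.println("name  = " + parts[0]);
        System.out.println("value = " + parts[1]);
        System.out.println("text  = " + parts[2]);
    }
}
```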

Eddie
Your code examples work! It never occurred to me I was querying the node incorrectly - sorry, bit of a crap mistake - but cheers Eddie.
Vidar
Watch out for getTextContent() - it will return the text content of the node you've selected AND any child nodes too. Might not be what you want.
Brabster
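The getTextContent() caveat can be sketched as follows, assuming a FOO_ELEMENT with mixed content (the class DirectTextDemo, its helpers, and the BAR child element are invented for this example). One possible way to get only the element's own text is to walk its direct children and keep the text nodes; this sketch ignores CDATA sections:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demo class, not part of the original thread.
class DirectTextDemo {
    // Parses the XML and returns the first FOO_ELEMENT.
    static Node firstFoo(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getElementsByTagName("FOO_ELEMENT").item(0);
    }

    // Text of the element AND all of its descendants.
    static String allText(String xml) throws Exception {
        return firstFoo(xml).getTextContent();
    }

    // Text held only by the element's direct text-node children.
    static String directText(String xml) throws Exception {
        NodeList children = firstFoo(xml).getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                sb.append(child.getNodeValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Sample mixed-content document invented for this sketch.
        String xml = "<root><FOO_ELEMENT>outer<BAR>inner</BAR></FOO_ELEMENT></root>";
        System.out.println(allText(xml));     // includes the text inside BAR
        System.out.println(directText(xml));  // only FOO_ELEMENT's own text
    }
}
```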