Hi guys,

I guess this question of mine is pretty basic, but since I've never done it before and haven't come across anything good on the internet while searching for it, here goes...

I have an XML file that I want my Java code to read. A sample of the XML follows:

<getLabel labelId="BLAH">

<dataObject name="packageInfo">
  <map>
    <entry><string>shipment_id</string><string>143486104007</string></entry>
    <entry><string>package_id</string><string>1</string></entry>
    <entry><string>station_id</string><string>308</string></entry>
    <entry><string>include_invoices</string><string>true</string></entry>
  </map>
</dataObject>

<dataObject name="pcsp">
  <map>
    <entry><string>shipment_id</string><string>143486104007</string></entry>
    <entry><string>package_id</string><string>1</string></entry>
    <entry><string>shipper_id</string><string>8429098020</string></entry>
    <entry><string>scale_weight</string><string>3.01</string></entry>
    <entry><string>bill_weight</string><string>4.0</string></entry>
    <entry><string>package_type</string><string>BOX</string></entry>
    <entry><string>station_id</string><string>308</string></entry>
  </map>
</dataObject>
</getLabel>

Although the size of the data within the XML is not fixed, my ultimate goal is to read the shipment_id and the package_id within the 'pcsp' dataObject section of the XML and store the values, '143486104007' and '1', in variables.

Is there an easy way to do that in Java without having to use any external API/XMLReader? If not, is there an easy-to-use, open source external API that would help my cause?

Regards p1nG

A: 

Unfortunately, I don't know of any way to do it with just Java, but there are a ton of external libraries that do it. If you're looking for DOM-style parsing, check out Xerces from Apache. Xerces will also do SAX parsing if that's what you're looking for. If you are looking to do something like automatic object creation, check out XStream.

Rereading your question, SAX would probably be the easiest, as you're only looking to store a couple of the values. To be honest, I think XStream is probably overkill: it would require a substantial amount of work to get all of the mappings right, whereas writing a simple SAX handler subclass that handles the specific tags would be the easiest and quickest route. But that's just my opinion.
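
To make the SAX suggestion concrete, here is a minimal sketch rather than the answerer's own code. It uses the SAX parser that ships with Java SE (javax.xml.parsers), and the class name PcspSaxReader and its read method are hypothetical names chosen for illustration. It assumes every entry holds exactly two string children, the key followed by the value, as in the sample XML.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the key/value pairs of one named dataObject.
class PcspSaxReader extends DefaultHandler {
    private final String wantedName;
    private final Map<String, String> values = new HashMap<>();
    private boolean inWanted = false;   // true while inside the wanted dataObject
    private StringBuilder text = null;  // non-null only while inside a <string>
    private String pendingKey = null;   // first <string> of the current <entry>

    PcspSaxReader(String wantedName) {
        this.wantedName = wantedName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("dataObject")) {
            inWanted = wantedName.equals(atts.getValue("name"));
        } else if (inWanted && qName.equals("entry")) {
            pendingKey = null;
        } else if (inWanted && qName.equals("string")) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver one text node in several chunks, so accumulate.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("dataObject")) {
            inWanted = false;
        } else if (qName.equals("string") && text != null) {
            if (pendingKey == null) {
                pendingKey = text.toString();          // first string: the key
            } else {
                values.put(pendingKey, text.toString()); // second string: the value
            }
            text = null;
        }
    }

    // Parses the XML text and returns the entries of the named dataObject.
    static Map<String, String> read(String xml, String dataObjectName) {
        try {
            PcspSaxReader handler = new PcspSaxReader(dataObjectName);
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.values;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

With the sample XML from the question held in a String, PcspSaxReader.read(xml, "pcsp").get("shipment_id") would give 143486104007 and get("package_id") would give 1. To read from a file instead, SAXParser.parse also accepts a java.io.File.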

Chris Thompson
+1  A: 

I'd use the Streaming API for XML (StAX), since it is both faster than the usual XML parser libraries and easy to use. Here is a nice example of how to get started with it: http://www.ibm.com/developerworks/xml/library/x-stax1.html
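
As a rough illustration of what that could look like for the XML in the question, here is a sketch using the javax.xml.stream cursor API (part of the JDK since Java 6). The class name PcspStaxReader and its read method are hypothetical, and the code assumes each entry contains a key string followed by a value string, as in the sample.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls the key/value pairs of one named dataObject.
class PcspStaxReader {

    static Map<String, String> read(String xml, String dataObjectName) {
        Map<String, String> values = new HashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inWanted = false; // true while inside the wanted dataObject
            String key = null;        // first <string> of the current <entry>

            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("dataObject")) {
                        // A null namespace means "match the attribute by local name only".
                        inWanted = dataObjectName.equals(reader.getAttributeValue(null, "name"));
                    } else if (inWanted && name.equals("entry")) {
                        key = null;
                    } else if (inWanted && name.equals("string")) {
                        // getElementText() reads up to the matching </string>.
                        String text = reader.getElementText();
                        if (key == null) {
                            key = text;
                        } else {
                            values.put(key, text);
                        }
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("dataObject")) {
                    inWanted = false;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return values;
    }
}
```

Because the application pulls events in its own loop instead of receiving callbacks, the state (inWanted, key) can live in local variables. If only the two values are needed, the loop could also stop as soon as both have been seen.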

StudiousJoseph
Plus, as of JDK 1.6, the reference implementation is bundled with the JDK as well, if that matters.
StaxMan
+2  A: 

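org.xml.sax.XMLReader as well as an implementation of that interface (actually a modified Apache Xerces, last I checked) are included with Java SE. Just use org.xml.sax.helpers.XMLReaderFactory.createXMLReader() to create an XMLReader. (On Java 9 and later, XMLReaderFactory is deprecated in favor of SAXParserFactory; SAXParserFactory.newInstance().newSAXParser().getXMLReader() returns an XMLReader as well.)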

Laurence Gonsalves
+2  A: 

You could try using XPath; it seems to fit your problem. For example:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("a.xml");
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xpath = xPathFactory.newXPath();
XPathExpression expr = xpath.compile("//dataObject[@name='pcsp']/map/entry/string[text()='shipment_id']/../string[2]/text()");
NodeList result = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.out.println(result.item(0).getNodeValue());

prints 143486104007.

A little explanation on the xPath expression used:

  • //dataObject[@name='pcsp'] - select all dataObject tags whose name attribute equals 'pcsp'
  • /map/entry - select the child map and then its child entry tags
  • /string[text()='shipment_id'] - select the child string tag whose inner text equals 'shipment_id'
  • /.. - select the parent of the previous node (in this case, the entry that contains the shipment_id string)
  • /string[2] - select the second child string tag
  • /text() - select the inner text of the previous tag (143486104007)

The evaluation returns a NodeList of all the nodes that match the expression (in this case, just 1).

The classes are all included in the JDK:

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

You'll also need to add some exception handling to the code.

Now, this only works if you need to get certain tags, which is what I understood you wanted. If you want to read all the values in the XML, then you're better off using something like SAX or DOM.
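
For completeness, here is one way the snippet above might be wrapped up with the exception handling in place and both values read. It is a sketch under a few assumptions: PcspXPathReader, parse and lookup are hypothetical names, the XML is parsed from a String rather than from a.xml, and the key is substituted into the expression with String.format, which is only reasonable for trusted keys that contain no quote characters.

```java
import java.io.IOException;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper wrapping the XPath lookup from the answer.
class PcspXPathReader {

    // Same path as above, with the key left as a placeholder.
    private static final String TEMPLATE =
            "//dataObject[@name='pcsp']/map/entry/string[text()='%s']/../string[2]/text()";

    // Parses the XML text into a DOM document once, so several lookups can reuse it.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns the value stored under the given key in the 'pcsp' dataObject,
    // or an empty string if there is no such entry.
    static String lookup(Document doc, String key) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, Object) returns the string value of the first match.
            return xpath.evaluate(String.format(TEMPLATE, key), doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Bad XPath expression for key " + key, e);
        }
    }
}
```

Usage would then be along the lines of: Document doc = PcspXPathReader.parse(xml); String shipmentId = PcspXPathReader.lookup(doc, "shipment_id"); String packageId = PcspXPathReader.lookup(doc, "package_id");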

Andrei Fierbinteanu