Somewhat related to http://stackoverflow.com/questions/1537207/how-to-convert-xml-to-java-util-map-and-vice-versa, only even more generic.

I have an XML document, and I'd like to convert it to a very generic set of key/value pairs (in Java, that is). The basic idea is that we can parse pretty much every XML document and pass it on directly to a JSP file, which can read the values and display them.

Say one has an XML structure as follows:

<root>
  <items>
   <item id="10">Some item here</item>
  </items>
  <things>
   <thing awesome="true">
    <orly-owl hoot="woot" />
   </thing>
  </things>
</root>

The output would be a set of Map objects containing values, lists, and other maps. Here's how, ideally, it'd be read in a (pseudo) JSP file:

<c:forEach var="item" items="${root.items}">
  ${item.id}
  ${item.text}
</c:forEach>
<c:forEach var="thing" items="${root.things}">
  Is it awesome? ${thing.awesome}
  orly? ${thing['orly-owl'].hoot}
</c:forEach>

Basically, it'd be an xml parser of sorts that has a simple set of rules.

For each XML element:

  Does it have subnodes?
    Add an entry to the map with the node name as key and a List (of maps) as value.
  Does it have attributes or a value?
    Add an entry to the map with the attribute name as key and the attribute value as value.

...or something to that degree. I don't really have the data structure in mind properly yet.
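
The rules above can be sketched with the JDK's built-in DOM parser. This is only a minimal sketch under some assumptions: the class name XmlToMap is made up, element text is stored under an invented "text" key, every child element goes into a List (even when it occurs once), and a child list simply replaces an attribute of the same name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration, not an existing library class.
class XmlToMap {

    /** Parses an XML string and converts its root element into nested maps and lists. */
    static Map<String, Object> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return toMap(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Recursively turns one element into a map of attributes, child lists and text. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> toMap(Element element) {
        Map<String, Object> map = new LinkedHashMap<>();

        // Rule: each attribute becomes a key/value entry.
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            map.put(attribute.getNodeName(), attribute.getNodeValue());
        }

        StringBuilder text = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                // Rule: each child element is appended to a List stored under its name.
                String name = child.getNodeName();
                Object existing = map.get(name);
                if (!(existing instanceof List)) {
                    // Assumption: a child list silently replaces a same-named attribute.
                    existing = new ArrayList<Map<String, Object>>();
                    map.put(name, existing);
                }
                ((List<Map<String, Object>>) existing).add(toMap((Element) child));
            } else if (child instanceof Text) {
                // Covers both plain text and CDATA sections.
                text.append(child.getNodeValue());
            }
        }

        // Assumption: element text is exposed under the made-up key "text".
        String trimmed = text.toString().trim();
        if (!trimmed.isEmpty()) {
            map.put("text", trimmed);
        }
        return map;
    }
}
```

With this shape, the item from the first example would be reached as root.items[0].item[0], because the items wrapper is itself a one-element list.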

So my question is: Is there a ready-made parser that can do this or something like this?

The ones I've found and tried out today all map to a fixed object hierarchy, i.e. you have to create a root object with a List of Item objects, each with its own properties. This isn't bad per se (and it can be auto-generated based on a (to be written / designed) DTD), but it's my current assignment to try out both options. I tried the first; it'll work once those mapping XML files make sense to me and the error messages start telling me what I'm doing wrong. I haven't been able to figure out how to do the second, though (read: write a recursive XML parser (DOM or SAX) that recurses recursively).

Coherency may be absent in this question; it's five o'clock.


Edit, after thinking it through some more: it will work (that is, sending Objects to JSP that can contain values, Maps and Lists), but it'll be terribly problematic while parsing, as in the following example:

<root thing="thine mother">
  <thing mabob="yus" />
  <thing mabob="nay" />
  <items>
    <item id="1" />
  </items>
</root>

In this particular instance, there are two same-named thing elements under the root. Same-named elements should go into a List. However, at the same level there's an items element, a singular element that should go in as a map entry. Add to that a third 'thing', this time an attribute on the root element, and the whole thing's buggered.

Without analyzing the structure beforehand (and setting a flag like 'there are both same-named and unique-named elements under this particular element'), you cannot tell which children belong in a List and which in a single map entry. And the last thing I want to do is to force the XML to follow a particular structure.
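
To make that "analyze beforehand" step concrete, here is a rough sketch of what such a pre-pass could look like with the JDK's DOM classes. The class and method names are invented for illustration. It only inspects the direct children of one element and reports whether repeated and unique names are mixed, and which child names clash with an attribute.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical pre-pass for illustration, not part of any library.
class ChildNameAnalysis {

    /** Parses an XML string and returns its root element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Counts how often each element name occurs directly under the parent. */
    static Map<String, Integer> childNameCounts(Element parent) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                counts.merge(child.getNodeName(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** True when the parent has both repeated and single-occurrence child names. */
    static boolean mixesRepeatedAndUnique(Element parent) {
        boolean repeated = false;
        boolean unique = false;
        for (int count : childNameCounts(parent).values()) {
            if (count > 1) {
                repeated = true;
            } else {
                unique = true;
            }
        }
        return repeated && unique;
    }

    /** Child element names that are also used as an attribute name on the parent. */
    static Set<String> namesSharedWithAttributes(Element parent) {
        Set<String> shared = new TreeSet<>();
        for (String name : childNameCounts(parent).keySet()) {
            if (parent.hasAttribute(name)) {
                shared.add(name);
            }
        }
        return shared;
    }
}
```

For the example above, the root would report thing twice and items once, so both flags trip: repeated and unique names are mixed, and thing is shared with an attribute.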

My colleague actually suggested running the XML through an XSL so that it'd be 'flatter' (more like database rows), or having the xml output have a maximum depth of one. Not an option, really.

Anyways. Thanks for the suggestions all, it seems this isn't a very plausible solution to the problem - at least not without screwing up basic rules and conventions of XML and Common Sense.

On to the next ideas - having JSP render a Document directly using the XML JSTL library.

+1  A: 

JDOM can certainly provide you with Lists built from the elements. The library has been around for quite some time and is pretty easy to use. http://jdom.org/

Mondain
+1  A: 

Java Architecture for XML Binding (JAXB) should be on your short list. Here's a brief tutorial introduction.

trashgod
+1  A: 

The apache-commons Digester could do this; it is a wrapper around a SAX parser that lets you create rules for unmarshalling data into objects.

OTOH if you want to know how to do recursive parsing you could check out this article for an interesting approach (using a recursive transition network). The idea is you create a network of objects that shows the relationship between the xml elements, and you keep track of where you are in this network as you parse using a stack.
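
As a much simpler illustration of the "keep track of where you are using a stack" part (this is not the recursive transition network from the article, just a plain SAX handler), something along these lines might work. The class name and the node layout (name, attributes, children, text) are assumptions made for this sketch; keeping attributes and children under separate keys avoids name clashes between them.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration: builds a generic tree of maps while parsing.
class StackingHandler extends DefaultHandler {

    private final Deque<Map<String, Object>> nodes = new ArrayDeque<>();
    private final Deque<StringBuilder> texts = new ArrayDeque<>();
    private Map<String, Object> root;

    /** Parses an XML string and returns the node map of the root element. */
    static Map<String, Object> parse(String xml) {
        try {
            StackingHandler handler = new StackingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    @Override
    @SuppressWarnings("unchecked")
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Map<String, Object> node = new LinkedHashMap<>();
        node.put("name", qName);

        Map<String, String> attributes = new LinkedHashMap<>();
        for (int i = 0; i < atts.getLength(); i++) {
            attributes.put(atts.getQName(i), atts.getValue(i));
        }
        node.put("attributes", attributes);
        node.put("children", new ArrayList<Map<String, Object>>());

        // The top of the stack is the parent of the element that just opened.
        if (nodes.isEmpty()) {
            root = node;
        } else {
            ((List<Map<String, Object>>) nodes.peek().get("children")).add(node);
        }
        nodes.push(node);
        texts.push(new StringBuilder());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so accumulate per open element.
        if (!texts.isEmpty()) {
            texts.peek().append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Closing an element pops it, which makes its parent current again.
        nodes.pop().put("text", texts.pop().toString().trim());
    }
}
```

Because children are always kept in one ordered list, same-named and unique-named siblings need no special treatment; the cost is that lookups by name have to filter that list.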

Nathan Hughes
+1  A: 

It seems like the JSTL XML bindings will do exactly what you want.
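
The JSTL XML tags select nodes with XPath expressions, so the same idea can be tried out in plain Java first. The following is a rough sketch using the JDK's javax.xml.xpath package; the class and method names are made up for illustration, and it is not how JSTL itself is implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration: reads values straight from a DOM Document.
class XPathLookup {

    /** Parses an XML string into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns the string value of the first node matched by the expression. */
    static String value(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    /** Returns the text content of every node matched by the expression. */
    static List<String> values(Document doc, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }
}
```

With this approach nothing has to be decided up front about lists versus single values: /root/thing/@mabob matches however many thing elements exist, and /root/@thing addresses the attribute separately.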

And the reason that you're unlikely to find anything that exactly meets your requirements using lists and maps is because XML does not neatly translate into lists and maps (mostly because of the question "how do you treat attributes differently than content?").

Anon
"how do you treat attributes differently than content?" I figured I'd treat them the same. Of course, this is only relevant to my particular situation, in which content is usually just plain text.
Cthulhu