views:

719

answers:

4

The problem: Let's say there is an XML file that contains both the data and the hierarchy of certain elements of interest to the application:

  <root>
    <node title="lvl1Node">
     <node title="lvl2Node">
      <node title="lvl3Node"></node>
     </node>
    </node>
    <node title="lvl1Node2"></node>
    <node title="lvl1Node3">
     <node title="lvl2Node2">
      <node title="lvl3Node2">
       <node title="lvl4Node"></node>
      </node>
     </node>
    </node>
  </root>

Now let's say your application needs to provide an API to retrieve these nodes. You need to write a method that returns the nodes without losing information about their hierarchy.

My question is how would you go about it ? What kind of data type would you use. A tree data type is the obvious answer but it not provided in the standard Collections API and writing it yourself is always a last resort (programmers are lazy, reinventing the wheel etc).

I also thought of an ArrayList where each item is either an Object (for node without subnodes) or an Arraylist (for node with subnodes) but I like generics and this feels too much like a hack. Is there a cleverer way to do it ?

A: 

If each item has a title, then you can use a Map instead of an List. Makes accessing each item more intuitive (Using the name of the node instead of an index).

CookieOfFortune
If I use a map how do I know which node is a subnode and who's its parent ? I lose information on hierarchy and in addition I'll have to worry about unique names (titles).
javito
You can place a map in the map to maintain the hierarchy.
CookieOfFortune
A: 

The last time I needed a Tree structure - I cheated and used the Tree Model. Its not pretty (involves pulling in Swing) but it works really well! Its a shame that they didn't put this in the Collections API

Fortyrunner
+1  A: 

The first question you should be asking yourself is how do you need to access the data? Depth-first iterations? Search for specific values?

At first glance, this is a tree of nodes where each node can have zero or more children.

So it goes something like this:

class Node {
  Node parent;
  List<Node> children;
}

It's like a linked list, but each node can branch out into an arbitrary number of children. If you need to find items directly by id the best bet is to keep a seperate hashmap index.

Mark Renouf
...which looks suspiciously like the DOM.
Steven Huwig
I don't know what exactly you mean by DOM but sometimes there is a need to parse a document in one place and extract data from it and make it available elsewhere in your code. This is actually the best solution since it simulates the structure and stores the data as well.
javito
I mean that there is already an API to access XML in that way, and it's called the Document Object Model. http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JAXPDOM.html
Steven Huwig
The benefit here is you can populate your object with fields of specific types, and use something like XStream or JAXB to automate the XML<->Object conversion. That way you are writing loads of boilerplate type conversion to/from Strings, etc.
Mark Renouf
Isn't a good idea to abstract away the access from the actual data ? With tweakt's data type I can return a Node and the client code does not need to know where it came from.ps: I am not dealing with XML per se, in my app the data come from a jcr repository.
javito
+1  A: 

Err. What exactly do you need that isn't covered by DOM?

John Nilsson