I've noticed that different XML schemas define child elements differently. Some define them directly under the parent nodes like so:

<parent>
    <foo />
    <foo />
    ...
    <foo />
    <bar />
    <bar />
    ...
    <bar />
</parent>

whereas others wrap the child nodes in container nodes, like so:

<parent>
    <foos>
        <foo />
        <foo />
        ...
        <foo />
    </foos>
    <bars>
        <bar />
        <bar />
        ...
        <bar />
    </bars>
</parent>

I haven't had any issues serializing/deserializing into either format as necessary, and I can't think of any reasons to prefer one over the other.

What (if any) are the pros/cons of each approach?

+1  A: 

Well, the second one is more "intuitive" because it is more hierarchical. For example:

<animal>
 <snakes>
   <boa />
   <python />
 </snakes>
 <monkeys>
   <red_ass_monkey />
   <yellow_monkey />
 </monkeys>
</animal>

I would definitely choose the second one when your objects are hierarchically linked, because it is more logical. Also, if you ever have to edit the XML by hand (just in case), you will find your way around more easily: once things get complicated, knowing that a python is a snake, which is an animal, beats searching through one long flat list.

Timotei Dolean
+1  A: 

One possibility for such a structure would be if you needed to group some <foo/>s together into a repeating set:

<foos id="1">
    <foo />
    <foo />
    ...
    <foo />
</foos>
<foos id="2">
    <foo />
    <foo />
    ...
    <foo />
</foos>
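As a rough illustration of how a reader might consume such grouped containers, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The `<parent>` wrapper, the number of `<foo>` children, and the function name `foos_by_group` are assumptions made for the example; the two groups need a single root element to form a well-formed document.

```python
import xml.etree.ElementTree as ET

# The two groups need a single root to be well-formed XML, so a
# hypothetical <parent> wrapper is assumed here.
DOC = """
<parent>
    <foos id="1"><foo /><foo /></foos>
    <foos id="2"><foo /><foo /><foo /></foos>
</parent>
"""


def foos_by_group(xml_text):
    """Map each container's id attribute to the list of its <foo> children."""
    root = ET.fromstring(xml_text)
    return {
        group.get("id"): group.findall("foo")
        for group in root.findall("foos")
    }


groups = foos_by_group(DOC)
print({group_id: len(items) for group_id, items in groups.items()})
```

With the flat layout, the same grouping would have to be carried by an attribute or child element on every `<foo>` instead.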

akf
+1 good point. Unless I wanted to add an attribute/element, there's no way to distinguish groups of similarly named nodes
micahtan
+1  A: 

The first scheme is extensible and not that hard to implement: you just iterate over all childNodes of parent and check whether each namespace and element name matches something you can read. However, splitting can sometimes be favorable, especially when processing the bars depends on all of the foos.
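A minimal sketch of that iteration in Python, using the standard library's `xml.etree.ElementTree` rather than a DOM `childNodes` list. The namespace URI, the `<unknown />` element, and the helper names are assumptions made for the example, not part of the question.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"  # assumed namespace, not from the question

# Hypothetical document in the first (flat) layout; <unknown /> stands in
# for an element added by a later version of the schema.
DOC = """
<parent xmlns="http://example.org/ns">
    <foo />
    <bar />
    <foo />
    <unknown />
</parent>
"""


def split_tag(tag):
    """Split ElementTree's "{namespace}local" form into its two parts."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def count_children(xml_text):
    """Count the foo and bar children, skipping anything unrecognized."""
    counts = {"foo": 0, "bar": 0}
    for child in ET.fromstring(xml_text):
        namespace, local = split_tag(child.tag)
        if namespace == NS and local in counts:
            counts[local] += 1
    return counts


print(count_children(DOC))
```

Elements the reader does not recognize are simply skipped, which is what makes the flat layout easy to extend.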

To borrow an example:

<zoo xmlns="http://example.org/zoo" xmlns:z="http://example.org/zoo">
<cages>
  <cage name="open-air" />
  <cage name="glass-cage" />
</cages>
<animals>
  <monkey name="Orlan" cage="open-air"/>
  <monkey name="Jeremey" cage="glass-cage"/>
  <snake name="spssshs" cage="glass-cage"/>
  <panda xmlns="http://china.cn/zoo" z:name="Ying Ying" z:cage="open-air"/>
</animals>
</zoo>

So, separating cages and animals makes sense. However, if you had grouped the animals into monkeys and snakes, you would need to add lots of extra processing logic for pandas.
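To make this concrete, here is a hedged Python sketch that reads the zoo document in a single streaming pass with `xml.etree.ElementTree.iterparse`. It relies on the `<cages>` container being complete before `<animals>` starts. The function names and the rule "every element that ends inside `<animals>` is an animal" are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

ZOO_NS = "http://example.org/zoo"
CAGES = "{%s}cages" % ZOO_NS
CAGE = "{%s}cage" % ZOO_NS
ANIMALS = "{%s}animals" % ZOO_NS

DOC = b"""<zoo xmlns="http://example.org/zoo" xmlns:z="http://example.org/zoo">
<cages>
  <cage name="open-air" />
  <cage name="glass-cage" />
</cages>
<animals>
  <monkey name="Orlan" cage="open-air"/>
  <monkey name="Jeremey" cage="glass-cage"/>
  <snake name="spssshs" cage="glass-cage"/>
  <panda xmlns="http://china.cn/zoo" z:name="Ying Ying" z:cage="open-air"/>
</animals>
</zoo>"""


def zoo_attr(element, name):
    """Read an attribute that is either unqualified or in the zoo namespace."""
    return element.get(name) or element.get("{%s}%s" % (ZOO_NS, name))


def animals_per_cage(stream):
    """Stream the document once, assigning each animal to a declared cage."""
    occupants = {}
    section = None  # container currently being read, if any
    for event, element in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            if element.tag in (CAGES, ANIMALS):
                section = element.tag
            continue
        if element.tag in (CAGES, ANIMALS):
            section = None
        elif section == CAGES and element.tag == CAGE:
            occupants[element.get("name")] = []
        elif section == ANIMALS:
            # Any element name or namespace is accepted here, so the panda
            # needs no special case. An undeclared cage raises KeyError.
            cage = zoo_attr(element, "cage")
            occupants[cage].append(zoo_attr(element, "name"))
    return occupants


print(animals_per_cage(io.BytesIO(DOC)))
```

Because the animals are not split into per-species containers, a new species only needs the two zoo-namespace attributes to be picked up.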

phihag
When processing bars depends on foos, how does adding the container node help? In your example couldn't you process the cage nodes first w/o wrapping them in the cages node?
micahtan
@micahtan Yes, but only if you load the complete document before doing any parsing, so you effectively exclude anyone using serial-access parsers. Additionally, even if you load the full document first anyway, processing becomes more complicated and slower: you would typically traverse the whole tree twice, and you would need two comparisons against the namespace, too.
phihag
@phihag - Couldn't I define the XSD to use a sequence to force all <cage> elements to the beginning, in which case you still get the benefit of serial parsing w/o having to wrap the <cage> elements inside a <cages> element?
micahtan
@micahtan Of course you can demand that, but that would 1. increase coding complexity (you have to peek at the next element until it is !cage), 2. make it hard to add zookeepers (before or after cages?), 3. create incompatibilities, because some implementations will accept out-of-order cages by mistake (and not check any schemas), and 4. pose additional problems for automated native2xml conversion, which you'd have to tell to respect that order.
phihag
@phihag, I understand your point, but the minute you run into a scenario like akf posts above, you're back to iterating over all the <cages> container nodes. +1 your initial comment and the continuing discussion -- extensibility is limited by your original node grouping and doesn't allow for changes very gracefully.
micahtan