views:

123

answers:

10
+1  Q: 

Is this valid XML?

I know that the syntax is valid, but my question is whether it is logically valid:

<parent>
    <name>John</name>
    <child>Mary</child>
    <child>Lucy</child>
    <child>Hannah</child>
</parent>

or is this the proper way to do it:

<parent>
    <name>John</name>
    <child>
        <name>Mary</name>
    </child>
    <child>
        <name>Lucy</name>
    </child>
    <child>
        <name>Hannah</name>
    </child>
</parent>

Is there some document online that definitively says what's right and wrong?

+5  A: 

It depends on what you are going to do with it. The 2nd version is better if there's a chance you will need to store more data about each person in the future.
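As a rough illustration of the flexibility point, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The helper name `child_names` and the `<dob>` element are assumptions made up for this example; the idea is only that code reading the nested layout keeps working when more data is added per child.

```python
import xml.etree.ElementTree as ET

FLAT = """<parent>
    <name>John</name>
    <child>Mary</child>
    <child>Lucy</child>
</parent>"""

# Same data in the nested layout, with an extra (hypothetical) <dob> element.
NESTED = """<parent>
    <name>John</name>
    <child>
        <name>Mary</name>
        <dob>1970-01-01</dob>
    </child>
    <child>
        <name>Lucy</name>
    </child>
</parent>"""


def child_names(xml_text):
    """Return the child names from either layout (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    names = []
    for child in root.findall("child"):
        name_el = child.find("name")
        if name_el is not None:
            # Nested layout: the name is its own element, so extra
            # siblings such as <dob> do not disturb this lookup.
            names.append(name_el.text)
        else:
            # Flat layout: the element's text is the only place for data.
            names.append(child.text.strip())
    return names


print(child_names(FLAT))
print(child_names(NESTED))
```

In the flat layout there is nowhere to put a date of birth without changing what the text of `<child>` means, which is the practical cost being described.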

David
so, you're saying that both ways are valid, but 2nd should be preferred?
Milan Babuškov
If your data storage needs aren't ever going to change, then the 1st option would work fine. The 2nd option is more flexible, so it might be better, since data needs almost always change in the future.
David
+16  A: 

I prefer the latter, as it makes clear that it's the NAME of the child that is Mary, and not the CHILD ITSELF that is Mary.

I think that using attributes is even better, like so:

<parent name="John">
    <child name="Mary" />
    <child name="Lucy" />
    <child name="Hannah" />
</parent>

because it makes it clear that the name is just a characteristic of the parent/child entity.

Shoko
I prefer this, because otherwise the parent's name has the same logical weight as their children.
Skilldrick
+1 for suggesting to make the XML less bloated :)
OregonGhost
I have seen XML parsers in some applications in the past that did not like an element having no text node. They consider it non-standard (SAP was that way, I think - I have no idea if it still is), so a significant element with no text node can sometimes be taken as an empty tag, even if it has attributes. So this solution makes the XML less bloated, but might not fit all scenarios.
Yishai
I like the way @Shoko made the parent name into an attribute. But could you tell me when it's ideal to make something an attribute rather than an element? I find it very confusing sometimes. Sorry for hijacking the thread :)
Helen Neely
@Yishai: To me that reads: "this solution makes the XML less bloated, but might not fit braindead parsers"
Martinho Fernandes
It also uses fewer bytes in transmission, but if transmission size is really an overriding concern, XML (or certainly XML with lots of metadata like the examples on this page) is probably not the best place to be anyway. An EDI-like syntax might be smaller, like "John|Mary|Lucy|Hannah", and if you accept binary and a dictionary you can get really really really tiny... +1
martinr
IMHO, if you are going with XML because of its extensibility and ease of change, then by using attributes you are shooting yourself in the foot: you cannot add subelements to an attribute. OTOH, if the messaging is frozen, or you use version numbers in messages, restrict schema changes sensibly, and control all consumers of an XML format, attributes might be a good way to go.
martinr
@Martinho, yes, basically. Unfortunately you don't always get to choose the parsers you work with, so it is something to think about.
Yishai
@Helen when the text isn't intended for humans to read (which requires extra markup in some locales), when the difference between a space and a carriage return doesn't matter (attributes are subject to white space normalisation), and when the value is never going to be presented if the XML is styled with CSS.
Pete Kirkham
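The white space normalisation mentioned in the comment above can be observed with a small Python sketch. It assumes the standard library's `xml.etree.ElementTree` (backed by expat); the name "Mary Ann" with a line break in it is an invented test value.

```python
import xml.etree.ElementTree as ET

# A literal line break inside an attribute value versus inside element text.
as_attribute = ET.fromstring('<child name="Mary\nAnn" />')
as_element = ET.fromstring("<child><name>Mary\nAnn</name></child>")

# The parser normalises the line break in the attribute to a space,
# while the element text keeps it as written.
attr_value = as_attribute.get("name")
text_value = as_element.find("name").text

print(repr(attr_value))
print(repr(text_value))
```

If line breaks or tabs in a value are significant, that is one concrete reason to prefer an element over an attribute.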
+2  A: 

The second is preferable. This makes it clear that name is a property of a child and does not identify the child itself.

Think of it in terms of classes:

This

class Parent {
    string Name;
    List<Child> Children;
}

class Child {
    string Name;
}

is preferable to

class Parent {
    string Name; 
    List<string> Children;
}

The second option also gives you the flexibility to expand in the future (add a birthday element, for example).

The more subjective debate is whether to use elements or attributes for properties like name, etc.

Finally, add a children element with the child elements contained there.
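To make the class analogy concrete, here is a minimal Python sketch that loads the nested layout (with a `children` wrapper) into objects shaped like the classes above. The function name `parse_parent` and the use of dataclasses are choices made for this illustration only.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List


@dataclass
class Child:
    name: str


@dataclass
class Parent:
    name: str
    children: List[Child] = field(default_factory=list)


def parse_parent(xml_text: str) -> Parent:
    """Build a Parent from XML with a <children> wrapper (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    children = [
        Child(name=child.findtext("name"))
        for child in root.findall("./children/child")
    ]
    return Parent(name=root.findtext("name"), children=children)


DOC = """<parent>
    <name>John</name>
    <children>
        <child><name>Mary</name></child>
        <child><name>Lucy</name></child>
        <child><name>Hannah</name></child>
    </children>
</parent>"""

parent = parse_parent(DOC)
print(parent)
```

Adding a birthday later would mean one new field on `Child` and one extra `findtext` call, without touching how names are read.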

Jason
+5  A: 

The second one seems to make more sense from an extensibility point of view. What happens if you need to add birthday for a child in the first one? Yes, you could add an XML attribute but you'll end up sooner or later getting stuck on adding a complex type or even basic enum to it.

Also - it may be better to group the child elements under a single 'children' element:

<parent>
    <name>John</name>
    <children>
      <child>
        <name>Mary</name>
        <dob>1970-01-01</dob>
      </child>
      <child>
        <name>Lucy</name>
        <dob>1971-01-01</dob>
      </child>
      <child>
        <name>Hannah</name>
        <dob>1974-01-01</dob>
      </child>
    </children>
</parent>

One more thing: you probably wouldn't group the children under one single parent element, but I've left that in-line with your original.
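As a sketch of what the extra `dob` element buys you, the following Python snippet reads the dates with the standard library and finds the oldest child. The variable names are invented for this example, and it assumes the dates are written in the ISO 8601 `YYYY-MM-DD` form used above.

```python
import xml.etree.ElementTree as ET
from datetime import date

DOC = """<parent>
    <name>John</name>
    <children>
        <child><name>Mary</name><dob>1970-01-01</dob></child>
        <child><name>Lucy</name><dob>1971-01-01</dob></child>
        <child><name>Hannah</name><dob>1974-01-01</dob></child>
    </children>
</parent>"""

root = ET.fromstring(DOC)

# Map each child's name to a real date object parsed from <dob>.
births = {
    child.findtext("name"): date.fromisoformat(child.findtext("dob"))
    for child in root.findall("./children/child")
}

# The earliest date of birth belongs to the oldest child.
oldest = min(births, key=births.get)
print(oldest)
```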

Wim Hollebrandse
I'm not going to downvote you for not representing your dates in ISO8602 format, but I should.
Robert Rossney
I was thinking of that, but couldn't be bothered. True story.
Wim Hollebrandse
Edited to be conformant Robert. ;-)
Wim Hollebrandse
And presumably you mean ISO 8601...
Wim Hollebrandse
+1  A: 

It is XML, there is no right or wrong. Both your answers are correct, however this is equally valid:

<parent>
  <name>John</name>
  <children>
    <child>Mary</child>
    <child>Lucy</child>
    <child>Hannah</child>
  </children>
</parent>

Which way should you choose? It depends on the task. I do not think any of the ways presented are the most flexible (what if the children have children?)

DanDan
+2  A: 

There is no right answer to this question, of course, but if the child is more complex (or could grow to be more complex) than a single string of text, then the second option is preferable.

In terms of what you usually see, generally in either case all the child elements would be grouped under a children element. In certain visualization environments, it can help to just close away all the children while other elements retain the focus.

Yishai
+1  A: 

Both are right. There is no definition of how to structure your elements - it's entirely up to you!

Some people try to minimize the number of nodes. Those people might write the XML like this:

<parent name="John">
    <child name="Mary" />
    <child name="Lucy" />
    <child name="Hannah" />
</parent>

But the point of XML is that you should always make it as easy as possible for human beings to read and understand. Screw the computers, they will always understand your XML, so make it human readable!

qualbeen
An attribute is a node in the XML infoset.
Pete Kirkham
+2  A: 

I would use an alternative between Shoko and DanDan or Wim Hollebrandse:

<parent name="John">
  <children>
    <child name="Mary" />
    <child name="Lucy" />
    <child name="Hannah" />
  </children>
</parent>

because I like having the child elements grouped as a "set" of children.

Aif
+1  A: 

There are no standards for how to map data to an XML Schema. There are some common practices, one of which is to use striped XML, so that nested elements take type/relation/type/relation roles alternately:

<!-- striped style, RDF etc -->
<person>
    <name>John</name>
    <children>
        <person>
            <name>Mary</name>
        </person>
        <person>
            <name>Lucy</name>
        </person>
        <person>
            <name>Hannah</name>
        </person>
    </children>
</person>

This is very regular, but somewhat more verbose.
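One consequence of that regularity is that a single recursive function can walk any depth of the tree. The Python sketch below assumes the striped `person`/`children` layout; the helper name `to_tree` and the grandchild "Zoe" are invented for the example.

```python
import xml.etree.ElementTree as ET

# Striped layout with one hypothetical grandchild ("Zoe") added.
DOC = """<person>
    <name>John</name>
    <children>
        <person>
            <name>Mary</name>
            <children>
                <person>
                    <name>Zoe</name>
                </person>
            </children>
        </person>
        <person>
            <name>Lucy</name>
        </person>
    </children>
</person>"""


def to_tree(person):
    """Convert a <person> element into a nested dict, recursively."""
    return {
        "name": person.findtext("name"),
        # A person without a <children> element simply yields an empty list.
        "children": [to_tree(p) for p in person.findall("./children/person")],
    }


family = to_tree(ET.fromstring(DOC))
print(family)
```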

It's generally a bad idea to put human readable text into attributes to save space:

<person name="fred"/>

This precludes the use of ruby markup, which is necessary for some forms of internationalisation, and makes the text more complicated to render using CSS. If you're only concerned with compact representation and ASCII text, XML might not be the best format to be working with.

Pete Kirkham
+1  A: 

It's worth mentioning that the term "valid" has a specific meaning in XML.

An XML document is valid if and only if it conforms to its DTD or schema. Basically, the universe of strings of text is divided into two categories: those that are well-formed XML, and those that aren't. The universe of well-formed XML documents also is divided into three categories: valid XML documents (which conform to their DTD/schema), invalid XML documents (which don't), and those whose validity cannot be determined (because they don't have a DTD/schema).
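The first of those distinctions, well-formed versus not, is easy to check in code. Here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`; the helper name `is_well_formed` is made up for this example. Note that this parser does not validate, so checking validity against a DTD or schema would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


GOOD = "<parent><children><child>Mary</child></children></parent>"
# The closing tag of <children> is missing its slash, so the tags do not nest.
BAD = "<parent><children><child>Mary</child><children></parent>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```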

As far as your actual question goes, you can only judge the design of an XML document on the basis of its fitness to the purpose for which it is to be used. Are you going to be transforming it with XSLT? Querying it with XPath? Processing it with Linq-to-XML? Processing it with a SAX reader? Deserializing the data in it into objects? Editing it in Notepad? Validating it against a schema? Transporting it over a slow network? All of those things (and there are many more) should influence the design of your XML. There is no one right answer.
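As one small example of how intended use shapes the design, this Python sketch runs the same lookup ("find the child named Lucy") against the element style and the attribute style. It assumes the limited XPath subset supported by the standard library's `xml.etree.ElementTree`; the variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

ELEMENT_STYLE = """<parent>
    <name>John</name>
    <child><name>Mary</name></child>
    <child><name>Lucy</name></child>
</parent>"""

ATTRIBUTE_STYLE = """<parent name="John">
    <child name="Mary" />
    <child name="Lucy" />
</parent>"""

# Element style: the predicate tests the text of a <name> sub-element.
from_elements = ET.fromstring(ELEMENT_STYLE).findall("./child[name='Lucy']")

# Attribute style: the predicate tests the name attribute instead.
from_attributes = ET.fromstring(ATTRIBUTE_STYLE).findall("./child[@name='Lucy']")

print(from_elements[0].findtext("name"))
print(from_attributes[0].get("name"))
```

Both layouts are queryable; the path expressions just differ, so the choice depends on the tools and consumers you expect.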

Robert Rossney