There are two ways to specify lists of values in XML. Variant 1:

<Database Name="myDatabase">
    <Table Name="myTable1" />
    <Table Name="myTable2" />
    <Table Name="myTable3" />
    ...
</Database>

Variant 2:

<Database Name="myDatabase" Tables="myTable1 myTable2 myTable3 ..." />

Clearly, Variant 1 is cleaner and can be extended more easily, but in many cases Variant 2 is more readable and "user-friendly".

When using Variant 2, what should be used as the separator? The XML Schema standard seems to prefer whitespace, whereas some real-world examples use commas instead.

Is there a particular reason to choose one over the other (assuming that the values contain neither whitespace nor commas)?

+2  A: 

As a programmer, I seem to ignore whitespace when looking at the list. It looks like something is wrong with it.

I would go for commas; they seem more natural to me as separators.

On the other hand, XML (like other markup languages) prefers whitespace as a separator, as in

<Database Name="myDatabase" Tables="myTable1 myTable2 myTable3 ..." />

There's no comma separating the different attributes (Name, Tables, etc.).

Neither option has a clear advantage in terms of parsing, since in both cases you need to get rid of extra whitespace between the items: (myTable1 , myTable2,myTable3) or (myTable1 myTable2 myTable3).
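To illustrate the parsing point, here is a minimal Python sketch that accepts either separator. The function name `split_list` is made up for this example, and it relies on the question's assumption that the values themselves contain neither whitespace nor commas.

```python
import re

def split_list(value: str) -> list[str]:
    """Split a list-valued attribute on commas and/or whitespace.

    Hypothetical helper: assumes no item contains a comma or whitespace.
    """
    # Any run of commas and whitespace counts as one separator;
    # empty strings from leading/trailing separators are dropped.
    return [item for item in re.split(r"[,\s]+", value) if item]

comma_form = split_list("myTable1 , myTable2,myTable3")
space_form = split_list("  myTable1   myTable2 myTable3 ")
```

Either way the cleanup is a single regular expression, so the choice of separator makes little difference to the parsing code.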

Sakin
+1  A: 

Your Variant 1 allows much more effective extraction of structural information by XPath and parsing.

Variant 2 may be slightly easier reading for a human, but I'd definitely go for #1.
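As a rough illustration of the difference, here is a sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The sample documents are the ones from the question with the `...` elision left out; the variable names are my own.

```python
import xml.etree.ElementTree as ET

variant1 = """<Database Name="myDatabase">
    <Table Name="myTable1" />
    <Table Name="myTable2" />
    <Table Name="myTable3" />
</Database>"""

variant2 = '<Database Name="myDatabase" Tables="myTable1 myTable2 myTable3" />'

# Variant 1: a path expression selects the table elements directly.
root1 = ET.fromstring(variant1)
names1 = [table.get("Name") for table in root1.findall("./Table")]

# Variant 2: the path only reaches the attribute; splitting is a separate step.
root2 = ET.fromstring(variant2)
names2 = root2.get("Tables", "").split()

# Selecting one particular table with a predicate works only for Variant 1.
hit = root1.find("./Table[@Name='myTable2']")
```

Both variants yield the same list of names, but only Variant 1 lets the query language itself address an individual table.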

If you do choose to use the attribute, I doubt the choice of separator matters much.

Except you probably want to avoid using the '<' character, which is invalid in attribute values...
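A quick Python check of that last point, as a sketch: a raw '<' inside an attribute value makes the document fail to parse, while the escaped form round-trips. The value `a<b` is invented for the example, and `quoteattr` from the standard library is just one way to do the escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

raw = "a<b"  # hypothetical value containing '<'

# A literal '<' inside an attribute value is not well-formed XML.
try:
    ET.fromstring('<Database Tables="a<b" />')
    parsed_ok = True
except ET.ParseError:
    parsed_ok = False

# quoteattr escapes the value and wraps it in quotes.
escaped = quoteattr(raw)
root = ET.fromstring(f"<Database Tables={escaped} />")
round_trip = root.get("Tables")
```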

Don Roby
+2  A: 

Splitting the value should ideally be done in a programming language (even though XPath and XSLT do provide functions for it). White space seems like the natural separator in XML. CSS uses a semicolon, and a comma is another common separator, but neither is nearly as XML-friendly as white space, because the W3C XML Schema recommendation has a construct for lists (<xs:list>) in which the allowed values are separated by white space.
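For what splitting in a programming language might look like, here is a small Python sketch that mimics how an xs:list value is broken into items: only the four XML white space characters (space, tab, carriage return, line feed) act as separators. The name `parse_xs_list` is hypothetical, and this is an approximation of the schema behaviour, not a validator.

```python
import re

# XML white space is limited to space, tab, carriage return and line feed.
XML_WS = re.compile(r"[ \t\r\n]+")

def parse_xs_list(value: str) -> list[str]:
    """Split a value the way an xs:list is itemized (hypothetical helper)."""
    stripped = value.strip(" \t\r\n")
    # An empty or all-white-space value is an empty list, not [""].
    return XML_WS.split(stripped) if stripped else []

tables = parse_xs_list(" myTable1\tmyTable2\n myTable3 ")
```

Python's plain `str.split()` would work for typical input too, but it also splits on other Unicode white space, which XML does not treat as a separator.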

To make that entire issue pointless, go with variant 1. As mentioned by someone else, it is far more accessible, and it will also make using it for programming that much easier.

Dustin
+1  A: 

Neither!

I think you should use Variant 3:

<Database Name="myDatabase"> 
    <Tables>myTable1 myTable2 myTable3</Tables>
    ... 
</Database> 

Seriously!

If you have a bunch of metadata that belong to the entity represented by the <Database> element, then there's little justification for putting "some" of that data in an attribute while "other" data is represented as additional child elements, as indicated by ....

I think the main criterion for the shape of your XML has to be utility:

  • If you have other data pertaining to each table - for example, the names and types of the columns in the table - then you will want the table things to be elements, so that you can attach all that data to a particular table, and you can apply a schema to that data. That means Variant 1.

  • If the tables are the sole "leaf" nodes in the representation - in other words if there's no subsidiary data defined for each table, and there's no additional data defined for the Database - then storing the list of tables as an xs:list in an XML attribute (Variant 2) seems fine to me.

  • If the tables are leaf nodes, but there are other leaf nodes at the same level, then you should use Variant 3.

The objection that some people have to "the way it looks" seems quite unfounded and, to me, similar to the common rationalization for why infant boys need to be circumcised. There's an irrational bias there.

Cheeso
I want to stab you with a fork :)
pst
except.... *I'm serious!* I updated my answer with some additional detail.
Cheeso