Possible Duplicate:
Should I use Elements or Attributes in XML?

I have never been able to figure out when to use xml attributes. I always use elements. I just read this w3schools article. The article states that it is bad practice to use attributes because:

  • attributes cannot contain multiple values (child elements can)
  • attributes are not easily expandable (for future changes)
  • attributes cannot describe structures (child elements can)
  • attributes are more difficult to manipulate by program code
  • attribute values are not easy to test against a DTD

The only exception that it states is when you are assigning an id to a tag.

Is this correct? Why do attributes even exist then? Was it a design error with xml? Is there something I am missing here?

The only reason I could think of for using attributes would be for one-to-one relationships, e.g. name. But it would have to be a one-to-one relationship to something that is a primitive (or string), because you would not want to have to break it up into several different sections in the future. For example:

<date> May 23, 2001 </date>

to:

<date>
   <month> May </month>
   <d> 23 </d>
   <yr> 2001 </yr>
</date>

Because this would not be possible with an attribute.

Bonus question: In the date example would it be possible to do something like this:

<date>
   <default> May 23, 2001 </default>
   <month> May </month>
   <d> 23 </d>
   <yr> 2001 </yr>
</date>

To give future applications more (or different) information while still offering existing apps the same format? Or would you have to do this:

<date> May 23, 2001 </date>
<NEWdate>
   <month> May </month>
   <d> 23 </d>
   <yr> 2001 </yr>
</NEWdate>
+1  A: 

All those points from the w3schools article are absolutely valid and correct. I agree - I hardly ever use attributes in my XML documents.

The only time I would use them might be when I need to identify an entity, e.g.

<Customer Id="123123">
 ....
</Customer>

But even here, it's a toss-up. You could just as easily put that ID into an <ID>123123</ID> element.

Furthermore, in my case, since the WCF DataContractSerializer doesn't support XML attributes (for performance reasons), that's one more reason not to use them (much).

marc_s
+9  A: 

Attributes are good when you want to attach information to other information, perhaps to describe how the information should be interpreted. For example:

<speed unit="mph">65</speed>
Guffa
that is interesting but why not make it a separate element?
sixtyfootersdude
You could do it with elements -- the machine doesn't care. This is a stylistic issue, for the benefit of humans.
Drew Wills
@sixtyfootersdude: Then you would have to nest it a level deeper by adding unit and value elements in the speed element. Alternatively add it at the same level with a name like speedUnit, but then it's still not as clearly attached to the speed element.
Guffa
Why split information from how it should be interpreted? <speed>65mph</speed> is a perfectly good XML element.
Dour High Arch
Other than that 65mph puts the unit of measure into the value and makes the 65 useless as a number (without parsing). If you put the unit into either an element or an attribute, you can cleanly handle kph, mph, and ft/s without needing a weird parser to post-process the data after loading.
Matthew Whited
@Drew: the machine could care if there are millions of records over a slow connection. Not everyone has Gbps internet to their house, some places still have to deal with dialup.
Matthew Whited
@dourhigharch: the largest difference is that your example makes it very difficult to transform with XSLT or query against with XPath. Consider trying to construct an XPath expression that looks for speeds greater than 65 mph or a transform that converts from mph to kph.
D.Shawley
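A minimal sketch of the point made in the comments above, using Python's standard xml.etree.ElementTree. The <speeds> wrapper, the sample values, and the helper name speeds_over are assumptions made for illustration. With the unit held in an attribute, the element text converts directly to a number:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <speeds> wrapper is an assumption.
DOC = """
<speeds>
  <speed unit="mph">65</speed>
  <speed unit="mph">80</speed>
  <speed unit="kph">100</speed>
</speeds>
"""

MPH_TO_KPH = 1.609344


def speeds_over(root, limit_mph):
    """Return the mph readings above limit_mph, as floats."""
    # ElementTree's limited XPath can filter on the attribute value,
    # but it cannot compare numbers, so that part happens in Python.
    return [
        float(el.text)
        for el in root.findall("speed[@unit='mph']")
        if float(el.text) > limit_mph
    ]


root = ET.fromstring(DOC)
fast = speeds_over(root, 65)
fast_kph = [round(v * MPH_TO_KPH, 1) for v in fast]
```

With <speed>65mph</speed> the same query would first need string handling to split the number from the unit.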
+1  A: 

The points you list about elements are correct, and I would add the following:

  • elements generally make prettier (more readable) diffs when you need to compare revisions of a file

But sometimes using an element to model a data point is overkill -- particularly when you have a lot of small, heterogeneous data points within a single parent element. Using attributes for simple things can improve readability. Some will probably argue that XML isn't readable or meant to be read/edited by humans... but I do it all the time.

Consider this example (basic hyperlink):

<a href="http://www.htmlhelp.com/" title="Help Information" target="_top">Web Design Group</a>

Would you like it if you had to write or read it this way instead?

<a>
    <href>http://www.htmlhelp.com/</href>
    <title>Help Information</title>
    <target>_top</target>
    <text>Web Design Group</text>
</a>

To me that looks like a lot of noise.

Drew Wills
A: 

"Why do attributes even exist then?"

To allow for more concise XML code, just to save you typing. And, of course, any XML file containing attributes

<element attr1="val1" attr2="val2" ... attrN="valN">
   <nestedElement>
     ...
   </nestedElement>
</element>

can be easily converted to an "attributeless" one:

<element>
   <attributes>
     <attr1>val1</attr1>
     <attr2>val2</attr2>
     ...
     <attrN>valN</attrN>
   </attributes>
   <nestedElement>
     ...
   </nestedElement>
</element>
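
The conversion described above could be sketched roughly as follows in Python with the standard xml.etree.ElementTree. The function name attributes_to_elements is hypothetical, and the <attributes> wrapper simply follows the layout shown in this answer; it is not a standard convention.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Move every attribute into a child of an <attributes> wrapper, recursively."""
    # Snapshot the children first so the wrapper inserted below is not revisited.
    for child in list(element):
        attributes_to_elements(child)
    if element.attrib:
        wrapper = ET.Element("attributes")
        for name, value in element.attrib.items():
            ET.SubElement(wrapper, name).text = value
        element.attrib.clear()
        element.insert(0, wrapper)
    return element


root = ET.fromstring('<element attr1="val1" attr2="val2"><nestedElement/></element>')
converted = ET.tostring(attributes_to_elements(root), encoding="unicode")
```

The reverse direction does not always work, since an attribute cannot hold nested structure.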
Igor Korkhov
"Terseness in XML markup is of minimal importance." - Extensible Markup Language (XML) 1.0 (Fifth Edition)
Pete Kirkham
+1… @Pete: That is opinion and not fact (even if it is in the whitepaper). There are many reasons that space is still an issue. Mobile devices, embedded hardware, slow data links. Try dealing with bloated XML over a single wire serial connection between an Ethernet interface and a flash memory chip using only a 100MHz processor and 4MB of RAM.
Matthew Whited
It is the opinion of the people who designed XML, who are authoritative in the design of XML. Attributes in XML do not exist to provide a terser representation, whether or not they can also be used to that effect. To say why they are in XML, you have to look at SGML and think about applying markup to text rather than representing data structures tersely.
Pete Kirkham
A: 

This question has made me scratch my head too. For me, it's a matter of semantics. It seems more natural to me to write

<page size="a4"/>

than

<page>
  <size>a4</size>
</page>
Jaú
+1  A: 

Attributes are just that: attributes of the element. If you need to nest multiple elements, then you use elements. In your date example I usually just use attributes, because it is smaller.

<date month="12" day="31" year="2009"/>

It is much easier to deal with, smaller to store and send over the wire, and arguably easier for a human to read as well. A date will never have multiple days, months or years, so there is no reason to make them elements.

fuzzy lollipop
That is a good point. Attributes must be a one to one relationship.
sixtyfootersdude
A: 

Think of a block of contact information...

<!-- attribute version -->
<person name="Matt" age="27">
    <phone type="mobile" value="1234567890" />
    <phone type="work" value="1234560987" />
    <address type="home" 
             city="NoWhere" 
             state="OH" 
             street="123 Lost Ave." 
             zipcode="12345" />
</person>

<!-- element version -->
<person>
    <name>Matt</name>
    <age>27</age>
    <phone>
        <type>mobile</type>
        <value>1234567890</value>
    </phone>
    <phone>
        <type>work</type>
        <value>1234560987</value>
    </phone>
    <address>
        <type>home</type>
        <city>NoWhere</city>
        <state>OH</state>
        <street>123 Lost Ave.</street>
        <zipcode>12345</zipcode>
    </address>
</person>

... you could expand these out into elements. However, if you are processing hundreds, and possibly millions, of records, the extra overhead from the end tags can bloat the files. This could cause problems on memory/processor-constrained systems and/or slow data links. Littering your XML with elements can also make it much more difficult to read and understand your XML visually. While the visual experience of data may not matter for transfer and storage, it can be very important for configuration and maintenance.
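
A rough way to see the overhead is to measure both forms of one record. This is only a sketch in Python: the helper name encoded_size is hypothetical, and the two strings are the mobile phone entry from the example above written without indentation.

```python
# One phone record in each style, taken from the contact example.
ATTRIBUTE_FORM = '<phone type="mobile" value="1234567890" />'
ELEMENT_FORM = "<phone><type>mobile</type><value>1234567890</value></phone>"


def encoded_size(xml_text, copies=1):
    """Bytes needed to store `copies` repetitions of xml_text as UTF-8."""
    return len(xml_text.encode("utf-8")) * copies


# Extra bytes the element form costs per record, and for a million records.
per_record_saving = encoded_size(ELEMENT_FORM) - encoded_size(ATTRIBUTE_FORM)
saving_for_a_million = per_record_saving * 1_000_000

print(encoded_size(ATTRIBUTE_FORM), encoded_size(ELEMENT_FORM), saving_for_a_million)
```

The difference per record is small, but it is multiplied by the number of records.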

Another problem that can come out of using elements for everything is when you try to use data from outside of your code base; you have a much more difficult time knowing if the elements can repeat or if they should only contain a simple piece of information. Yes, you can constrain this with XSD and DTD, but that is typically more difficult than just making the XML easy to understand.

As for your bonus question... Versioning of XML schemas would depend on the platform you are developing against and how strict your code and platform are about the schema. XML (and binary files) can be very flexible... that's really why XML is eXtensible.

Matthew Whited
bonus: that is what I thought as well. Just wondered if there was some standard.
sixtyfootersdude
Once you start playing at the level of abstraction between different data formats across different apps and versions, it gets pretty difficult to tweak data in a "standard" way. But that's why we have XSLT.
Matthew Whited
A: 

I generally use attributes for the minimum set of fields that make a node unique. In other words, they represent the primary key. This makes some things easier if you need to correlate XML with a relational database.

knipknap
A: 

Don't forget that attributes are parsed as part of the start tag. This means while you're parsing, you get those values right away, you don't have to wait for the close tag. Plus, you don't invoke all the parsing events (if you're doing stream parsing) for all the element tags.
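
As a small illustration of the stream-parsing point, here is a sketch using Python's standard xml.sax module. The EventLog class and log_events helper are hypothetical names. The handler records each start and end callback, so the two styles can be compared:

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Record each start/end callback the parser makes."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs is already complete here, before any content is read.
        self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        self.events.append(("end", name))


def log_events(xml_text):
    """Parse xml_text and return the list of recorded callbacks."""
    handler = EventLog()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


attribute_events = log_events('<name first="Tom" last="Jones"/>')
element_events = log_events("<name><first>Tom</first><last>Jones</last></name>")
```

In the attribute form both values arrive with the first start callback; in the element form they only become available after further start, character and end callbacks.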

I prefer to use attributes for metadata about the contained element. For example, I like to express dates as <date format="dd-MMM-yyyy">20-Jan-2010</date>. If you've got unambiguous data elements, go ahead and just make them attributes. <name first="Tom" last="Jones"/> works for many cases.

TMN