What are the pros and cons of DTDs and XML Schemas (I'm not even sure what the official name of the latter is!)? Which is better? Why do we need two ways to do the same thing? Seems dumb, IMHO. :)

Thanks, LES

Edit: I found this in an article I was reading, which is what prompted me to ask the question:

Why W3C XML Schema Language?

The W3C XML Schema Language is not the only schema language. In fact, the XML specification describes document-type definitions (DTDs) as the way to express a schema. In addition, pre-release versions of the JAXB Reference Implementation worked only with DTDs -- that is, not with schemas written in the XML Schema Language. However, the XML Schema Language is much richer than DTDs. For example, schemas written in the XML Schema Language can describe structural relationships and data types that can't be expressed (or can't easily be expressed) in DTDs. There are tools available to convert DTDs to the W3C XML Schema Language, so if you have DTD-based schemas that you used with an earlier version of the JAXB Reference Implementation, you can use these tools to convert the schemas to XML Schema Language. http://java.sun.com/developer/technicalArticles/WebServices/jaxb/#binsch

I guess I would like examples that illustrate why XML-Schema is better (if it indeed is).

Thanks again, LES

+1  A: 

From http://weblogs.asp.net/rchartier/archive/2006/03/21/440782.aspx

  • DTDs are not namespace aware.

  • DTDs have #define, #include, and #ifdef -- or, less C-oriented, the ability to define shorthand abbreviations, external content, and some conditional parsing.

  • A DTD describes the entire XML document (even if it leaves "holes"); a schema can define portions.

  • XSD has a type system.

  • XSD has a much richer language for describing what element or attribute content "looks like." This is related to the type system.

  • You can put a DTD inline in an XML document; you cannot do this with XSD. This means DTDs are more secure (you only have to protect one bytestream -- the xml/dtd -- and not multiple).

  • The official definition of "valid XML" requires a DTD. Since this may be impractical, if not impossible, you often have to settle for schema-valid, which is not quite the same.

For my part, it's pretty straightforward to write a validator for some XML if you have an XSD. I haven't seen this with a DTD, although I'm sure it exists.
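
To make the "validator from an XSD" point concrete, here is a minimal sketch using the javax.xml.validation API that ships with the JDK. The schema, the element names (note, to, body), and the class name are made up for illustration; they are not from the article quoted above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdValidatorSketch {
    // Hypothetical schema: a <note> must contain <to> followed by <body>.
    static final String NOTE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "<xs:element name='body' type='xs:string'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML string validates against the XSD string.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validate() throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Invalid document (a malformed schema would also land here).
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A note with both children, in order.
        System.out.println(isValid(NOTE_XSD, "<note><to>LES</to><body>hi</body></note>"));
        // A note missing the required <to> element.
        System.out.println(isValid(NOTE_XSD, "<note><body>hi</body></note>"));
    }
}
```

In real use the schema would more likely be loaded from a file or URL than from a string constant; the string form just keeps the sketch self-contained.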

Robert Harvey
You can put XSD inline with XML. Just use the right namespace, and nest the schema inside the document. WSDL files do this commonly. DTDs are also subject to DoS attacks; see http://en.wikipedia.org/wiki/Billion_laughs
lavinio
@lavinio, I think you're right about inlining XSD; I have seen files before that do this. Interesting article about the Billion Laughs attack.
Robert Harvey
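
On the Billion Laughs point: one common mitigation in Java is to refuse any document that carries a DOCTYPE at all. The sketch below assumes the JDK's bundled (Xerces-derived) parser, which recognizes the disallow-doctype-decl feature; other parser implementations may reject that feature URI. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NoDoctypeSketch {
    // Xerces-specific feature; assumed to be supported by the parser in use.
    static final String DISALLOW_DOCTYPE =
        "http://apache.org/xml/features/disallow-doctype-decl";

    // Returns true if the document parses, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Any <!DOCTYPE ...> now becomes a fatal error, so entity
            // definitions (and their expansion) are never processed.
            factory.setFeature(DISALLOW_DOCTYPE, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Plain document without a DOCTYPE.
        System.out.println(parses("<r>ok</r>"));
        // Document that declares an entity in an inline DTD.
        System.out.println(parses("<!DOCTYPE r [<!ENTITY a 'x'>]><r>&a;</r>"));
    }
}
```

The trade-off is that this also gives up DTD validation and entity shorthand entirely, so it only fits when the documents you accept are not supposed to carry a DTD.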
thanks for answering.
LES2
@Robert Harvey: Is there any DTD feature that cannot be done in XSD?
dma_k
+1  A: 

There is also RELAX NG, another powerful language for validating XML documents, along with Schematron and other technologies from DSDL. RELAX NG is very simple and has a human-readable form, RELAX NG Compact, which lets you write schemas in a style similar to BNF grammars.

Sergei Stolyarov
+2  A: 

A few years ago, there were reasons to use DTD over XML Schema (it was more common or better supported by XML tools). Today, however, I see no reason not to use XML Schema instead of DTD: XML Schema is much more powerful.

However, XML Schema is far from perfect (just try to read the spec or a book on XML Schema...) and many alternatives have been developed since then (Schematron, Examplotron, RELAX NG). These may have technical advantages over XML Schema, but XML Schema is so much more pervasive today that I see very few cases where an alternative would make sense.

Pascal Sartoretti
+1  A: 

XML Schema can perform more complex validations. A DTD can only check that an element holds character data; it cannot tell an integer from a string. XML Schema can perform more detailed validations, such as checking that an element's content is a string starting with an uppercase letter or a positive integer. Finally, XML Schema uses XML syntax, which makes it a natural choice for developing web services.
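
As a rough illustration of those two checks (a string starting with an uppercase letter, and a positive integer), here is a sketch of a schema using a pattern facet and the built-in xs:positiveInteger type, run through the JDK's javax.xml.validation API. The element names (person, name, age), the type name CapitalizedName, and the class name are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class XsdFacetSketch {
    // Hypothetical schema: <name> must start with A-Z, <age> must be >= 1.
    static final String PERSON_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='CapitalizedName'>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:pattern value='[A-Z].*'/>"   // XSD patterns match the whole value
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "<xs:element name='person'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='name' type='CapitalizedName'/>"
      + "<xs:element name='age' type='xs:positiveInteger'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML string satisfies PERSON_XSD.
    static boolean isValid(String xml) {
        try {
            Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(PERSON_XSD)))
                .newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;   // a facet or type check failed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Capitalized name, positive age.
        System.out.println(isValid("<person><name>Ada</name><age>36</age></person>"));
        // Lowercase first letter violates the pattern facet.
        System.out.println(isValid("<person><name>ada</name><age>36</age></person>"));
        // Zero is not a positive integer.
        System.out.println(isValid("<person><name>Ada</name><age>0</age></person>"));
    }
}
```

A DTD for the same document could only declare both elements as (#PCDATA), so all three inputs above would pass it.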

Aruna
A: 

I was searching through the XML files on my PC's C: drive (5539 of them!!) representing a ton of different apps and other software and it's interesting that DTD seems to still have about 50% of the XML "mindshare".

By now we all know the pros and cons of Schemas vs. DTDs, but if you search the web you can find articles from 8 years ago saying that DTD is obsolete and Schemas will totally own the future (i.e., today's past). I think we were all supposed to have green-energy flying cars by now, too.

I was really hoping this whole DTD vs. XML Schemas thing was going to get cleanly resolved like Blu-ray vs. HD DVD. Oh well.

Peter Nelson
A: 

It is easy to define XML that can't be expressed in XML Schema:

<root> <tag1/> <tag2/> <tag1/> </root>

Go ahead and try to develop an xsd for that! It can't be done. The DTD, on the other hand, is simple.

XML Schema has too many limitations to be useful in the real world. DTDs have well-known weaknesses but, hey, at least you can get your job done.
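
For reference, the DTD side of this example might look like the sketch below, with the DTD placed inline in the document and checked by the JDK's validating parser. This only demonstrates the DTD; the declarations (both tags declared EMPTY) and the class name are my assumptions about what was intended.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class InlineDtdSketch {
    // Assumed DTD for the example: tag1, then tag2, then tag1 again.
    static final String DOCTYPE =
        "<!DOCTYPE root ["
      + "<!ELEMENT root (tag1, tag2, tag1)>"
      + "<!ELEMENT tag1 EMPTY>"
      + "<!ELEMENT tag2 EMPTY>"
      + "]>";

    // Prepends the inline DTD to the body and reports whether it validates.
    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);   // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity problems are reported as recoverable errors, so the
            // handler has to rethrow them to make parse() fail.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The document from the example above.
        System.out.println(isValid("<root> <tag1/> <tag2/> <tag1/> </root>"));
        // Same document with the trailing tag1 missing.
        System.out.println(isValid("<root> <tag1/> <tag2/> </root>"));
    }
}
```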

Jim Beale