I've been considering converting my current HTML5 documents to polyglot HTML5 ones. I figure that even if they only ever get served as text/html, the extra checks that come with writing them as XML would help keep my coding habits tidy and valid.

Is there anything particularly thrilling in the HTML5-only space that would make this an unwise choice?

Secondly, the specs are a bit hazy on how to validate a polyglot document. I assume the basics are:

  1. No errors when run through the W3C Validator as HTML5
  2. No errors when run through an XML parser

But are there any other rules I'm missing?
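
For the second check, a minimal sketch of an XML well-formedness test using Python's standard library could look like this. The helper name `is_well_formed` and the two sample documents are made up for illustration, and this only covers well-formedness; it does not replace running the page through an HTML5 validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical polyglot-style page: explicit end tags, "/>" on void elements.
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta charset="utf-8"/><title>Test</title></head>
<body><p>Hello<br/>world</p></body>
</html>"""

# Acceptable to an HTML parser, but not well-formed XML (<br> and <p> are never closed).
HTML_ONLY = (
    "<!DOCTYPE html><html><head><title>Test</title></head>"
    "<body><p>Hello<br>world</body></html>"
)

print(is_well_formed(POLYGLOT))   # True
print(is_well_formed(HTML_ONLY))  # False
```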

Thirdly, seeing as it is a polyglot, does anyone know of any caveats to serving it as application/xhtml+xml to supporting browsers and as text/html to non-supporting ones?
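
To make the third question concrete, here is a rough sketch of the kind of content negotiation I have in mind: pick the media type from the request's Accept header. The function name `pick_media_type` is hypothetical, and the parsing is deliberately simplified (it ignores wildcards and does not compare q-values between types).

```python
def pick_media_type(accept_header: str) -> str:
    """Choose a Content-Type for a polyglot page from an Accept header.

    Returns application/xhtml+xml only when the client lists it explicitly
    with a non-zero q-value; otherwise falls back to text/html.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "application/xhtml+xml":
            continue
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"


print(pick_media_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(pick_media_type("text/html, */*"))
```

A response chosen this way would normally also need a `Vary: Accept` header so that caches keep the two variants apart.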

Edit: After a bit of experimenting I found that named entities like &nbsp; break in XHTML5 (no DTD). That XML parser is a bit of a double-edged sword. I guess I've answered my third question already.
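
The entity problem can be reproduced outside the browser. The sketch below assumes Python's standard XML parser behaves like the browser's XML parser in this respect: with no DTD, `&nbsp;` is undefined, while the numeric reference `&#160;` is fine. The helper `to_numeric_refs` is a hypothetical, regex-based workaround; it would also rewrite matches inside script or CDATA content, so treat it as a starting point only.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML defines without any DTD; these can stay as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True


def to_numeric_refs(markup: str) -> str:
    """Replace HTML named entities with numeric character references."""
    def swap(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names alone
        return f"&#{name2codepoint[name]};"

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, markup)


print(parses_as_xml("<p>a&nbsp;b</p>"))              # False
print(parses_as_xml("<p>a&#160;b</p>"))              # True
print(to_numeric_refs("<p>a&nbsp;b &amp; c</p>"))    # <p>a&#160;b &amp; c</p>
```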

A: 

This sounds like a very difficult thing to do. One of the downfalls of XHTML was that it wasn't possible to steer successfully between the competing demands of XML and vintage HTML.

I think if you write HTML5 and validate it successfully, you will have as tidy and valid a document as anyone would need.

Ned Batchelder
+1  A: 

Given that the W3C's documentation on the differences between HTML and XHTML isn't even finished, it's probably not worth your time to try to do polyglot. Not yet, anyway. Give it another couple of years.

In any event, you should invest the extra time in XML compliance only in the narrow circumstance where you are actively planning to parse your HTML as XML for some specific purpose. There are no benefits to doing it purely for consumption by web browsers, only drawbacks.

Warren
+1  A: 

Work on defining how to create HTML5 polyglot documents is currently ongoing, but see http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html for an early draft. It's certainly possible to do, but it does require a good deal of coding discipline, and you will need to decide whether it's worth the effort. Although I create HTML4.01/XHTML1.0 polyglot documents, I create them using an XML tool chain that guarantees XML well-formedness, and I have specialized code to ensure compatibility with HTML non-void elements and valid XML characters. Direct hand coding would be very difficult.
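
As an illustration of the non-void element issue (this is not my actual tool chain, just a sketch): an XML serializer will happily emit `<div/>` or `<script src="..."/>`, which an HTML parser treats as an unclosed start tag. A simple lint pass can flag those. The function name, the regex, and the void-element list below are assumptions for the example, and the regex will misfire on attribute values that contain `<` or `>`.

```python
import re

# Elements HTML treats as void; assumed list for this sketch.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}

# Matches tags written in self-closing form, e.g. <div/> or <img src="x" />.
SELF_CLOSED = re.compile(r"<([A-Za-z][A-Za-z0-9]*)(?:\s[^<>]*)?/>")


def find_self_closed_non_void(markup: str) -> list[str]:
    """List names of non-void elements written as <name ... />."""
    return [
        name.lower()
        for name in SELF_CLOSED.findall(markup)
        if name.lower() not in VOID_ELEMENTS
    ]


sample = '<div/><br/><script src="a.js"/><p>x</p>'
print(find_self_closed_non_void(sample))  # ['div', 'script']
```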

One known current issue in HTML5 is the srcdoc attribute on the iframe element. Because the value of the attribute contains markup, certain characters need to be escaped. The HTML5 draft spec describes how to do this for the HTML serialization, but not (the last time I looked) how to do it in the XHTML serialization.
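
One conservative way to sidestep the gap, offered as an assumption rather than anything taken from the draft spec, is to escape every markup-significant character in the attribute value, so the same text is a legal attribute value under both XML and HTML rules. The helper `iframe_with_srcdoc` is hypothetical; `html.escape` is from Python's standard library.

```python
import html
import xml.etree.ElementTree as ET


def iframe_with_srcdoc(inner_markup: str) -> str:
    """Build an iframe tag whose srcdoc value escapes &, <, >, " and '."""
    escaped = html.escape(inner_markup, quote=True)
    return f'<iframe srcdoc="{escaped}"></iframe>'


inner = '<p class="note">Fish & chips</p>'
tag = iframe_with_srcdoc(inner)
print(tag)

# An XML parser decodes the attribute back to the original markup.
round_tripped = ET.fromstring(tag).get("srcdoc")
print(round_tripped == inner)  # True
```

This escapes more than the HTML serialization strictly requires, which is the point: the over-escaped form should be harmless in HTML and necessary in XML.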

Alohci
Thanks for the guide! I've never liked iframes. They always seemed like a "Yo dawg, I heard you like web pages, so I put a web page in your web page so you can surf while you surf".
Tim