So which one should I start with, HTML or XHTML? I am a beginner and want a solid foundation in markup languages, but as I started learning I found that some people use HTML and some use XHTML.

+13  A: 

XHTML is pretty much like HTML but non-sloppy. I really can't think of a reason besides laziness not to use it.

David Hedlund
Someone does not agree: http://hixie.ch/advocacy/xhtml
ntd
Most if not all highly seasoned developers would not agree with you. Novice types, well yes, they'd surely agree.
drlouie - louierd
Here's a reason why you shouldn't really use XHTML: http://stackoverflow.com/questions/1770193/ies-xhtml-compatibility
drlouie - louierd
@ntd, he just argues that sending XHTML as text/html may be a bad thing; he doesn't actually say -using- XHTML is bad.
Daniel
Thank you for the advice
Zai
@Daniel: actually that's exactly what he says. He explains that XHTML as text/html is bad and IE6 doesn't support XHTML+HTML. You lose several conveniences with XHTML (like HTML entities; XHTML entities are different). There is very little reason to use XHTML. So much so that XHTML 2 died and XHTML 5 exists only for compatibility reasons for those already using XHTML.
cletus
HTML is a well-defined language just like XHTML. Nothing sloppy about it. Either one can be implemented with mistakes and become invalid "tag soup."
darkporter
@cletus not just IE6. No version of IE supports it.
Alexandre Jasmin
HTML5. There's a reason right there.
roryf
+1  A: 

HTML 4.01 would be your best bet, since learning in stages lets you see a clearer picture of what's really happening behind the scenes and deep within the markup. Once you have a clear and thorough understanding of HTML 4.01, you can then move to XHTML 1.0.

drlouie - louierd
So they should learn the poor habits that HTML allows then move to the better standard?
ChaosPandion
Think of it any way you'd like. The reality of the matter is that if you don't know where we are coming from, you won't be well suited to know where we are going.
drlouie - louierd
What "poor habits" does HTML allow?
darkporter
HTML 4 doesn't teach poor habits. It teaches the markup that has the widest possible support. XHTML is a self-indulgent and pointless distraction 99% of the time.
cletus
@mastermind: can you explain why HTML 4.01 is better for understanding what's happening behind the scenes? I think most people would picture the tree-structured DOM better as XML than as HTML where some tags are often omitted. To a beginner, the determination of which tags can be omitted seems very arbitrary.
Alohci
The most important thing is to use validation, even as a beginner
Casebash
+10  A: 

XHTML is only useful if you want to autogenerate, manage, or validate the HTML code with the help of an XML-based tool, such as a component-based MVC framework (e.g. Sun JSF, Apache Struts, Microsoft ASP.NET, etc.) or XSLT. Parsing and formatting HTML programmatically is trickier than XML, because HTML allows some tags to be left unclosed, e.g. <br>. XML is much easier to parse and format programmatically because it is required to be well-formed.
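To make the parsing point concrete, here is a minimal sketch using only Python's standard library. The snippet strings and the helper names (`is_well_formed_xml`, `html_start_tags`) are made up for illustration; `html.parser` is a lenient tokenizer, not a browser engine or a validator, so treat this as a rough demonstration rather than a statement about any particular tool.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample markup: the same paragraph in HTML and XHTML syntax.
HTML_SNIPPET = "<p>first line<br>second line</p>"
XHTML_SNIPPET = "<p>first line<br />second line</p>"


def is_well_formed_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup as-is."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Collects the names of start tags seen by the lenient HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Return the start tags an HTML tokenizer reports for the markup."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    # The XML parser rejects the unclosed <br>, but accepts <br />.
    print(is_well_formed_xml(HTML_SNIPPET))
    print(is_well_formed_xml(XHTML_SNIPPET))
    # The HTML tokenizer copes with both forms.
    print(html_start_tags(HTML_SNIPPET))
    print(html_start_tags(XHTML_SNIPPET))
```

An off-the-shelf XML parser gives up on the unclosed <br>, whereas an HTML-aware parser has to know which elements may legally stay open.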

If you're just starting and/or hand-writing "plain vanilla" HTML, I would recommend using HTML 4.01 elements with an HTML5 doctype. There's really no need to massage the HTML code into an XML format.

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Page title</title>
    </head>
    <body>
        <h1>Heading</h1>
        <p>Paragraph</p>
    </body>
</html>

The HTML 5 elements aren't widely supported yet, hence the recommendation to stick with HTML 4.01 elements. The HTML 5 doctype triggers standards mode in most browsers, including IE6. The other benefit of HTML5 is that it allows self-closing tags like those in XHTML (e.g. <br />). Also see HTML5 spec chapter 3.2.2:

Authors may optionally choose to use this same syntax for void elements in the HTML syntax as well. Some authors also choose to include whitespace before the slash, however this is not necessary. (Using whitespace in that fashion is a convention inherited from the compatibility guidelines in XHTML 1.0, Appendix C.)

Basically, even if you write pure XHTML, using <!DOCTYPE html> would still make it valid (and put web browsers into the correct standards mode).

BalusC
Here's a reason why you shouldn't really use HTML 5: http://stackoverflow.com/questions/413114/html-vs-xhtml-does-it-still-matter
drlouie - louierd
D'oh. I didn't say "HTML 5 elements", I just said "HTML 5 doctype".
BalusC
Nice, your answer complements my answer. "XML based tool" was the part I was missing.
darkporter
Using an HTML5 doctype excludes the use of a mature validator that tests against a stable specification.
David Dorward
D'oh, I didn't say "HTML 5 elements", I just said "HTML 4.01 elements".
BalusC
+13  A: 

Conventional wisdom has come sort of full circle on this point. Back in like 2002 everyone was gung-ho for XHTML, but many people (including myself) didn't have good reasons why. It was just the cool new thing, and everyone jumped on the bandwagon and started putting XHTML in their resume skills instead of just HTML, which looked so plain and unimpressive.

What's happening now is, with HTML5 finished, people are starting to realize that there's nothing wrong with good old-fashioned HTML. It's the language of the web. Here are the pros and cons of XHTML as I see them:

Pro

  • Allows you to embed non-xhtml XML into your web page, such as an SVG element. This isn't possible with plain HTML.
  • Allows you to easily parse your documents with an XML parser, which could obviate the need for hpricot or BeautifulSoup if say, you wanted to replace all H1 tags with H2 tags in your website templates.
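As a rough sketch of the second point, the H1-to-H2 replacement could look like this with Python's standard `xml.etree.ElementTree`. The `TEMPLATE` string and the `demote_headings` name are hypothetical, and the fragment deliberately has no XHTML namespace. A real XHTML document declares `xmlns="http://www.w3.org/1999/xhtml"`, in which case the tag names would need the namespace prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed template fragment (no namespace, for simplicity).
TEMPLATE = "<div><h1>Title</h1><p>Body text</p><h1>Another</h1></div>"


def demote_headings(xhtml_fragment: str) -> str:
    """Rename every h1 element to h2 and return the serialized markup."""
    root = ET.fromstring(xhtml_fragment)
    # Materialize the matches first so renaming cannot disturb the iteration.
    for element in list(root.iter("h1")):
        element.tag = "h2"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(demote_headings(TEMPLATE))
```

This only works because the input is well-formed XML. Feed it tag-soup HTML and `ET.fromstring` raises `ParseError`, which is where tools like hpricot or BeautifulSoup come in.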

Con

  • IE doesn't understand the 'application/xhtml+xml' mime type, so as far as it's concerned you're sending malformed HTML.
  • It's a little more verbose. <br> and <table cellspacing=0 cellpadding=0> are neater looking, in my opinion, than <br /> and <table cellspacing="0" cellpadding="0">.

There must be some advantages to XHTML that I'm missing, but I myself use HTML for everything these days.

darkporter
I don't know about that. SGML requires quotes, but I always thought HTML didn't. http://dev.w3.org/html5/markup/syntax.html#syntax-attributes
darkporter
Yes. cletus is wrong. In general, using attributes without quotes is valid HTML. As it happens, I think quoting attributes is neater, but it's very much a personal preference and other reasonable people may disagree.
Alohci
@Alohci: Doesn't strict HTML require quotes which means it is more a matter of future compatibility than personal preference?
Casebash
@Casebash - No. HTML 4 didn't require it, and neither does HTML 5 in the text/html syntax. Easy to check. Copy <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"><title>test</title><p class=smith>Test</p> into http://validator.w3.org/#validate_by_input and run. Even if a future HTML did make it a conformance requirement, HTML parsers (e.g. in browsers) would still need to support omitted quotes to be able to process huge swathes of the web. There is no future compatibility issue. XHTML served correctly as application/xhtml+xml does require them, of course.
Alohci
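The same difference can be seen at the parser level with a small Python sketch (standard library only). Note the assumption: `html.parser` is a lenient tokenizer, so it shows that an HTML parser reads the unquoted attribute without trouble, not that the markup is valid; the validator check above is what settles validity. The helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The paragraph from the validator test above, with an unquoted attribute.
SNIPPET = "<p class=smith>Test</p>"


class AttrCollector(HTMLParser):
    """Collects (name, value) attribute pairs from every start tag."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)


def html_attributes(markup: str) -> list:
    """Return the attributes an HTML tokenizer reports for the markup."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.attrs


def parses_as_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(html_attributes(SNIPPET))                   # HTML side: attribute is read
    print(parses_as_xml(SNIPPET))                     # XML side: rejected
    print(parses_as_xml('<p class="smith">Test</p>')) # XML side: accepted once quoted
```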
+2  A: 

When it comes to learning one or the other, there's really rather little between them. XHTML is essentially a subset of HTML that encourages (or rather requires) stricter standards -- specifically, it's an application of the XML standard to HTML. As such, any valid XHTML is also valid HTML (for the most part at least).

In my opinion, the distinction between XHTML and HTML isn't really that important. What is important, however, is to write consistent and efficient markup, and this is what the XHTML standard was designed to encourage. It doesn't matter whether you label your code as XHTML or HTML, just as long as it's well-written.

The main feature of XHTML is simply that it requires a high standard of quality in your code, but this is something you should be doing anyway in HTML.

Will Vousden
"As such, any valid XHTML is also valid HTML". No it isn't. Prior to HTML5, it is impossible to create a single document that is both valid XHTML and valid HTML. XHTML must include a namespace declaration attribute, and that would be an invalid attribute in HTML.
Alohci
Hence "for the most part at least". For practical purposes (and certainly for a beginner), it's still reasonable to say that XHTML is a subset of HTML.
Will Vousden
Sorry, I still don't agree. A construct like <div /> will do one thing in XHTML and something different when treated as HTML. To describe XHTML as a subset of HTML is to deny that, and thereby cause confusion in beginners. I prefer to describe HTML and XHTML as sibling languages, with a common vocabulary but different syntaxes.
Alohci
+3  A: 

XHTML is for a-type people who think XML looks more "neat" than plain-ol HTML.

But really, it doesn't matter that much. You can switch from one to another faster than it would take you to get some lunch.

Brian Ortiz
For small sites yes, for large, it isn't so easy
Casebash
A: 

HTML and XHTML are the same language, with slightly different syntaxes. Once you know one, you know the other.

It really doesn’t matter.

Paul D. Waite
It isn't super important to get this right for a beginner, but even so going with HTML will be easier for a beginner
Casebash
“HTML will be easier for a beginner” — how?
Paul D. Waite
@Paul: Dual serving based on the accept header is messy. XHTML served as HTML requires you to follow [compatibility guidelines](http://www.w3.org/TR/xhtml-media-types/#compatGuidelines). Either option is quite complex
Casebash
@Casebash: I guess the compatibility guidelines would be confusing if you’re coming from an XML background. If not, I don’t think you’d even notice them.
Paul D. Waite
@Paul: Are you saying that we should teach XHTML without teaching XML?
Casebash
@Casebash: Depends really. If you’re actually going to be using XHTML as XML at any point, then learning XML would help. But if you’re just writing XHTML for the web, then you’re not actually writing XML, so I reckon you could quite happily skip XML knowledge. And really, what is there to XML itself (i.e. ignoring any specific XML languages)? It’s just a set of syntax rules, right?
Paul D. Waite
A: 

Start with HTML, but use a validator. In HTML5, everyone seems to be focusing on the HTML, rather than the XHTML serialisation.

  • As I explain in my answer here, the designers of XML wanted to enforce higher coding standards and make parsing easier, but that only works if almost everybody switches. Instead of relying on your browser to enforce code quality, rely on validation.
  • Due to limited XHTML support in Internet Explorer <=8, pretty much everybody serves XHTML as text/html. This effectively restricts you to a subset of HTML and XHTML and requires you to follow compatibility guidelines. You could choose what format to serve based on user agent instead, but this is messy.
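To give a feel for why dual serving gets messy, here is a minimal sketch of the selection step in Python. It keys off the request's Accept header (a common alternative to sniffing the user agent). The function name `choose_content_type` and the simplified header parsing are assumptions for illustration, not any framework's API; a real server would also need to set a `Vary` header and keep the markup valid under both parsers.

```python
XHTML_TYPE = "application/xhtml+xml"
HTML_TYPE = "text/html"


def choose_content_type(accept_header: str) -> str:
    """Pick XHTML only when the client lists it explicitly with q > 0.

    Simplified on purpose: wildcards such as */* are ignored, so clients
    that never mention application/xhtml+xml fall back to text/html.
    """
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if parts[0].lower() != XHTML_TYPE:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not accepted"
        if quality > 0:
            return XHTML_TYPE
    return HTML_TYPE


if __name__ == "__main__":
    # A header that names XHTML explicitly, then one that only uses a wildcard.
    print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
    print(choose_content_type("image/gif, image/jpeg, */*"))
```

Even this toy version has edge cases (weights, wildcards, caching), and it does nothing about the differences in how the two parsers treat the same document.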

Given the limited advantages, I strongly recommend HTML, especially if you are a beginner.

Casebash