So which one should I start with, HTML or XHTML? I am a beginner and want a solid foundation in markup languages, but as I started learning I found that some people use HTML and some use XHTML.
XHTML is pretty much like HTML but non-sloppy. I really can't think of a reason besides laziness not to use it.
HTML 4.01 would be your best bet, since learning in stages gives you a clearer picture of what's really happening behind the scenes and deep within the markup. Once you have a clear and thorough understanding of HTML 4.01, you can then move on to XHTML 1.0.
XHTML is only useful if you want to autogenerate, manage, or validate the HTML code with the help of an XML-based tool, such as a component-based MVC framework (e.g. Sun JSF, Apache Struts, Microsoft ASP.NET) or XSLT. Parsing and formatting HTML programmatically is trickier than XML, because HTML allows unclosed tags here and there, e.g. <br>. XML is much easier to parse and format programmatically because it is required to be well-formed.
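To make the well-formedness point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The helper name `is_well_formed` and the two sample fragments are made up for illustration; the idea is only that a strict XML parser rejects an unclosed `<br>` but accepts `<br />`.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for unclosed or mismatched tags, among other XML errors.
        return False


# HTML-style fragment: <br> is never closed, which is fine in HTML but not in XML.
html_style = "<p>First line<br>Second line</p>"
# XHTML-style fragment: the void element is explicitly closed.
xhtml_style = "<p>First line<br />Second line</p>"

print(is_well_formed(html_style))
print(is_well_formed(xhtml_style))
```

This is why XML-based tooling tends to want XHTML: it can refuse bad input outright instead of guessing what the author meant.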
If you're just starting and/or hand-writing "plain vanilla" HTML, I would recommend using HTML 4.01 elements with an HTML5 doctype. There's really no need to massage the HTML code into an XML format.
<!DOCTYPE html>
<html lang="en">
<head>
<title>Page title</title>
</head>
<body>
<h1>Heading</h1>
<p>Paragraph</p>
</body>
</html>
The HTML5 elements aren't widely supported yet, hence the recommendation to stick with HTML 4.01 elements. The HTML5 doctype triggers standards mode in most browsers, including IE6. The other benefit of HTML5 is that it allows the self-closing syntax for void elements, as in XHTML. Also see HTML5 spec chapter 3.2.2:
Authors may optionally choose to use this same syntax for void elements in the HTML syntax as well. Some authors also choose to include whitespace before the slash, however this is not necessary. (Using whitespace in that fashion is a convention inherited from the compatibility guidelines in XHTML 1.0, Appendix C.)
Basically, even if you write pure XHTML, using <!DOCTYPE html> would still make it valid (and trigger the correct standards mode in web browsers).
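As a rough illustration of the two void-element spellings being interchangeable in the HTML syntax, the sketch below feeds both to Python's standard `html.parser.HTMLParser`. That parser is a lenient tokenizer, not a spec-compliant HTML5 parser or a browser, so treat this as an approximation; the `TagCollector` class and `start_tags` helper are names invented for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every start tag the parser reports (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # Also reached for <br />: the default handle_startendtag
        # forwards to handle_starttag and then handle_endtag.
        self.start_tags.append(tag)


def start_tags(markup: str) -> list:
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.start_tags


# Both spellings are reported as the same sequence of elements.
print(start_tags("<p>one<br>two</p>"))
print(start_tags("<p>one<br />two</p>"))
```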
Conventional wisdom has come sort of full circle on this point. Back in like 2002 everyone was gung-ho for XHTML, but many people (including myself) didn't have good reasons why. It was just the cool new thing, and everyone jumped on the bandwagon and started putting XHTML in their resume skills instead of just HTML, which looked so plain and unimpressive.
What's happening now is, with HTML5 finished, people are starting to realize that there's nothing wrong with good old-fashioned HTML. It's the language of the web. Here are the pros and cons of XHTML as I see them:
Pro
- Allows you to embed non-XHTML XML into your web page, such as an SVG element. This isn't possible with plain HTML 4.01 (HTML5 does allow inline SVG and MathML, but not arbitrary XML vocabularies).
- Allows you to easily parse your documents with an XML parser, which could obviate the need for hpricot or BeautifulSoup if, say, you wanted to replace all H1 tags with H2 tags in your website templates.
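The H1-to-H2 idea from the second point can be sketched with nothing but Python's standard XML parser. The `demote_headings` function and the sample template are hypothetical, and the fragment deliberately has no `xmlns` declaration; a real XHTML document declares the XHTML namespace, in which case the tag names to match would carry that namespace prefix.

```python
import xml.etree.ElementTree as ET


def demote_headings(xhtml: str) -> str:
    """Rename every <h1> element to <h2> in a well-formed fragment (hypothetical helper)."""
    root = ET.fromstring(xhtml)
    # iter() also visits the root itself, so a top-level <h1> is handled too.
    for element in root.iter("h1"):
        element.tag = "h2"
    return ET.tostring(root, encoding="unicode")


# Assumed sample template; note that it must be well-formed for this to work.
template = "<div><h1>Title</h1><p>Intro<br /></p><h1>Another</h1></div>"
print(demote_headings(template))
```

The same function raises `xml.etree.ElementTree.ParseError` on sloppy HTML input, which is exactly the case where a forgiving parser such as BeautifulSoup earns its keep.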
Con
- IE (version 8 and earlier) doesn't understand the 'application/xhtml+xml' MIME type, so as far as it's concerned you're sending malformed HTML.
- It's a little more verbose. <br> and <table cellspacing=0 cellpadding=0> are neater looking, in my opinion, than <br /> and <table cellspacing="0" cellpadding="0">.
There must be some advantages to XHTML that I'm missing, but I myself use HTML for everything these days.
When it comes to learning one or the other, there's really rather little between them. XHTML is essentially a subset of HTML that encourages (or rather requires) stricter standards; specifically, it's an application of the XML standard to HTML. As such, any valid XHTML is also valid HTML (for the most part at least).
In my opinion, the distinction between XHTML and HTML isn't really that important. What is important, however, is to write consistent and efficient markup, and this is what the XHTML standard was designed to encourage. It doesn't matter whether you label your code as XHTML or HTML, just as long as it's well-written.
The main feature of XHTML is simply that it requires a high standard of quality in your code, but this is something you should be doing anyway in HTML.
XHTML is for type-A people who think XML looks more "neat" than plain ol' HTML.
But really, it doesn't matter that much. You can switch from one to another faster than it would take you to get some lunch.
HTML and XHTML are the same language, with slightly different syntaxes. Once you know one, you know the other.
It really doesn’t matter.
Start with HTML, but use a validator. In HTML5, everyone seems to be focusing on the HTML, rather than the XHTML serialisation.
- As I explain in my answer here, the designers of XML wanted to enforce higher coding standards and make parsing easier, but that only works if almost everybody switches. Instead of relying on your browser to enforce code quality, rely on validation.
- Due to limited XHTML support in Internet Explorer <=8, pretty much everybody serves XHTML as text/html. This effectively restricts you to a common subset of HTML and XHTML and requires you to follow compatibility guidelines. You could choose what format to serve based on user agent instead, but this is messy.
Given the limited advantages, I strongly recommend HTML, especially if you are a beginner.
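For anyone curious what "choosing the format to serve" might look like, here is a small sketch. It is an assumption on my part to inspect the request's `Accept` header rather than sniff the `User-Agent` string as mentioned above; the function name `choose_content_type` is made up, and the logic ignores `q` values for brevity, so a real server would need something more careful.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick a MIME type for an XHTML page from an Accept header (simplified sketch)."""
    # Drop parameters such as ";q=0.9" and normalise case and whitespace.
    offered = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    # Fall back to text/html for clients that never mention XHTML.
    return "text/html"


# Assumed example headers: one that lists XHTML explicitly and one that does not.
print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
print(choose_content_type("image/gif, image/jpeg, */*"))
```

Even in this toy form you can see the messiness: the same document now has two delivery modes that browsers treat differently, and both need testing.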