When deciding what is content and what is presentation, treat your HTML document as a data container. Then ask yourself the following about each element and attribute:
Does the attribute/element represent a meaningful entity in my data?
For example, are the words between <b> tags in bold simply for display purposes, or did I want to add emphasis to that data?
Am I using the proper attribute/element to properly represent the type of data I want to represent?
Since I want to add emphasis to that particular section, I should use <em> (it doesn't mean italic; it means emphasis, and it can be styled as bold) or <strong>, depending on the level of emphasis wanted.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Sometimes a presentational tag can simply be replaced by CSS rules on the parent element, in which case the presentational tag should be removed.
After asking yourself these three simple questions, you are usually able to make a pretty informed decision. An example:
Original Code:
<label for="name"><b>Name:</b></label>
Checking the <b> tag...
Does the attribute/element represent a meaningful entity in my data?
No, the tag doesn't represent a data node. It is there purely for presentation.
Am I using the proper attribute/element to properly represent the type of data I want to represent?
<b> is used for the presentation of bold text.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Since <b> is presentational and I am using it for presentation, yes. And since the <b> element affects the whole of the <label>, it can be removed and the style applied to the <label> instead.
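The outcome of that reasoning might look like the minimal sketch below. The <input> element, the page title and the exact CSS rule are assumptions added for illustration; any equivalent styling on the <label> would do.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Label example</title>
  <style>
    /* Presentation lives in CSS, applied to the parent element. */
    label { font-weight: bold; }
  </style>
</head>
<body>
  <form>
    <!-- The <b> is gone; the label now carries only the data. -->
    <label for="name">Name:</label>
    <input type="text" id="name" name="name">
  </form>
</body>
</html>
```

The rendered result is the same as before, but the markup no longer contains an element that exists purely for display.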
Semantic HTML's goal is not to simplify design and redesign or to avoid inline styling, but to help a parser understand what a particular tag represents in your document. That way, applications (e.g. search engines) can be built to intelligently decide what your content signifies and to classify it accordingly.
Therefore, it makes sense to use the CSS content property to add quotes around text located in a <q> tag (the quotes have no value to the data contained in your document other than presentation), but it makes no sense to use the same CSS property to add a © symbol in your footer, as that symbol does have a value in your data.
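A minimal sketch of that distinction follows. The sample sentence, the company name and the choice of curly quotation marks are placeholders invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Generated content example</title>
  <style>
    /* Quotation marks are presentation, so CSS may generate them. */
    q { quotes: "\201C" "\201D"; }
    q::before { content: open-quote; }
    q::after  { content: close-quote; }
  </style>
</head>
<body>
  <p>She said <q>semantics matter</q> and left.</p>
  <footer>
    <!-- The copyright sign is data, so it stays in the markup. -->
    <p>&copy; Example Company</p>
  </footer>
</body>
</html>
```

If the stylesheet is dropped, the quotation loses only its decoration, whereas a © generated from CSS would vanish from the data entirely.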
The same applies to attributes. Using the width and height attributes on an <img> tag representing an icon at size 16x16 makes semantic sense, as the size is important to understanding the meaning of the <img> tag (an icon can have different representations depending on the size it is displayed at). Using the same attributes on an <img> tag representing a thumbnail of a larger image does not.
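Read that way, the two cases could be marked up roughly as follows. The file names, alt texts, class name and pixel values are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Image size example</title>
  <style>
    /* The thumbnail's display size is a layout decision, so it goes in CSS. */
    .thumbnail { width: 120px; height: auto; }
  </style>
</head>
<body>
  <!-- An icon: the 16x16 size is part of what this image is. -->
  <img src="icon-16.png" width="16" height="16" alt="Settings">

  <!-- A thumbnail: the size says nothing about the image's meaning. -->
  <img src="photo-thumb.jpg" class="thumbnail" alt="Harbour at sunset">
</body>
</html>
```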
Sometimes you will need to add non-semantic elements to achieve the presentation you want, but usually those are avoidable.
There are no wrong elements, only wrong uses of particular elements. <b> should not be used to add emphasis. <small> should be used for legal sub-text, not to make text smaller (see HTML5 - Section 4.6.4 for why), etc. All elements have a particular usage scenario and they all represent data (minus presentational elements, but they do have a use in some cases). No element should be set aside.
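A short sketch of those uses side by side is shown below. The sentences, the caption class and its font size are assumptions made up for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Element usage example</title>
  <style>
    /* Text that should merely look smaller is sized with CSS. */
    .caption { font-size: 0.85em; }
  </style>
</head>
<body>
  <!-- Emphasis is data: use <em> or <strong>, not <b>. -->
  <p>You <em>must</em> save your work before closing the window.</p>

  <!-- <small> marks legal sub-text, not "smaller text". -->
  <p><small>Prices exclude tax. Terms and conditions apply.</small></p>

  <!-- Visually smaller text with no such meaning gets a class instead. -->
  <p class="caption">Figure 2: monthly totals</p>
</body>
</html>
```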
Attributes are a different matter. Most attributes are presentational in nature. Attributes such as <img border> and <body bgcolor> rarely have any significance in the data you are representing, so you should not use them (except in those rare cases).
Search engines are a good example of why semantic documents are so important. Microformats are a predefined set of elements and classes that you can use to represent data in a way search engines will understand. The product price information in Google search results is an example of semantics at work.
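As a rough illustration of the idea, here is a fragment using the class names of the classic hCard microformat (vcard, fn, org, tel, url). The person and contact details are invented placeholders, and which formats a given search engine actually reads is not something this sketch claims to settle.

```html
<!-- Each class name labels a piece of data for any parser that knows hCard. -->
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Company</span>,
  <span class="tel">+1-555-0100</span>,
  <a class="url" href="https://example.com/">example.com</a>
</div>
```

A human reader sees an ordinary line of contact details, while a parser can pick out the name, organisation and phone number without guessing.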
Using the predefined rules of established standards to store information in your document allows third-party programs to understand what would otherwise seem to be a wall of text, without relying on heuristic algorithms that are prone to failure. It also helps screen readers and other accessibility applications understand the context in which the information is presented, and it greatly helps the maintainability of your markup, as everything is tied to a set definition.