views:

131

answers:

4
+3  Q: 

HTML nomenclature

Are there terms for HTML tags that differentiate between ones which should have closing tags, and ones which shouldn't?

For example, <em> and <a> should have accompanying </em> and </a> tags.

On the other hand, <br /> and <img ... /> shouldn't.

What is the first group called, and what is the second?

+5  A: 

In XHTML the document has to be well-formed. That means every element that has been opened with a start tag must have a corresponding end tag, unless it has no content (neither child elements nor text). Such elements are referred to as empty elements, and they may be written with a single self-closing tag such as <br /> instead.

So the second group in your question (<br />, <img ... />) are empty elements, as they don't have any content.
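To make the well-formedness rule concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` as a stand-in for an XML parser. The helper names `is_well_formed` and `is_empty` are my own, and `is_empty` is a simplification of the XML definition (it only looks at text and child elements):

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def is_empty(element: ET.Element) -> bool:
    """Simplified 'empty' check: no child elements and no text between the tags."""
    return len(element) == 0 and not element.text


# A self-closed <br /> is fine in XML; a bare <br> leaves an element unclosed.
print(is_well_formed("<p>one<br />two</p>"))  # True
print(is_well_formed("<p>one<br>two</p>"))    # False

# <br /> and <br></br> are two spellings of the same empty element.
print(is_empty(ET.fromstring("<br />")))         # True
print(is_empty(ET.fromstring("<br></br>")))      # True
print(is_empty(ET.fromstring("<foo>bar</foo>"))) # False
```

Note that this checks XML rules only; it says nothing about what an HTML 4.01 parser accepts.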

Gumbo
I'd clarify this to say that they don't define any child content.
Daniel Schaffer
See the definition of empty (http://www.w3.org/TR/xml/#dt-empty) and content (http://www.w3.org/TR/xml/#NT-content) in the XML 1.0 specification.
Gumbo
I know how they define it, but just because w3 says it that doesn't mean it's any less ambiguous or confusing ;) - what they mean by "empty" is that it doesn't have any child elements, not that they don't have content.
Daniel Schaffer
So by your definition of empty this `<foo>bar</foo>` would be empty as it does not have any child elements.
Gumbo
No, "bar" is a text node. So perhaps I was too specific by saying "element" - but my point remains that "empty" means it doesn't have any children - I'd argue that an image specifies content in the src attribute. Also, I don't think the w3 spec actually has any language about not having content - all it says is "empty elements" and an example.
Daniel Schaffer
“Definition: The text between the start-tag and end-tag is called the element's content” (http://www.w3.org/TR/xml/#dt-content); “Definition: An element with no content is said to be empty.” (http://www.w3.org/TR/xml/#dt-empty)
Gumbo
+7  A: 

I believe that <foo /> is an "empty element", as opposed to... not ;-p Btw, <br /> isn't an HTML element; it is an XHTML element. IIRC it is supposed to be <br> in true legacy HTML.

Marc Gravell
good call on the distinction
David Berger
Oh snap, old-school HTML 1, Chris 0. :)
Chris
In fact, HTML doesn't have the <whatever /> style of tags at all. Some tags are always empty and thus don't need closing (like <img> and <br>), some tags can close automatically (like <li> and <p>) and other tags must always be explicitly closed. You can't do <div /> in HTML, only XHTML.
Chuck
"Empty element" only describes lack of content. Elements that cannot have content, like <br>, aren't empty. They are whitespace. Indeed, the W3C validator for HTML 4.01 gives a warning for <br />.
Cyberherbalist
Hey, check that out! The answer textbox requires you to escape your tags in order to see them, and the comment textbox does not. So I hosed my comment above because I didn't know that. Sorry!
Cyberherbalist
Actually, the answer textbox uses markdown which prefers it if you use the `backtick notation` for inline code, and a 4-space indent for code blocks.
Marc Gravell
@Cyberherbalist: The easiest way to escape HTML/XML content in an answer is by surrounding it with ` characters. Then you don't have to do &lt; (and so on).
Eddie
+1  A: 

It should be noted that in HTML 4.01 some tags (for example, <p> and <li>) can have content, but the closing tag is optional.

The following is valid markup:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Title</title>
</head>
<body>
<p>Test
</body>
</html>
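
If you want to see which elements in a snippet like the one above rely on an omitted closing tag, a rough sketch with Python's standard `html.parser` could look like this. The class `TagCounter`, the function `unclosed_tags` and the constant `EXAMPLE` are names I made up for illustration, and the matching is by tag name only, so treat it as a quick check rather than a validator:

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Records every start tag and end tag that literally appears in the markup."""

    def __init__(self):
        super().__init__()
        self.starts = []
        self.ends = []

    def handle_starttag(self, tag, attrs):
        self.starts.append(tag)

    def handle_endtag(self, tag):
        self.ends.append(tag)


def unclosed_tags(markup: str) -> list:
    """Return the start tags that have no explicit end tag of the same name."""
    counter = TagCounter()
    counter.feed(markup)
    counter.close()
    remaining = list(counter.ends)
    unclosed = []
    for tag in counter.starts:
        if tag in remaining:
            remaining.remove(tag)  # pair this start tag with one end tag
        else:
            unclosed.append(tag)
    return unclosed


# The body of the example document, minus the DOCTYPE line.
EXAMPLE = "<html><head><title>Title</title></head><body><p>Test\n</body></html>"
print(unclosed_tags(EXAMPLE))  # ['p']
```

The parser here only tokenizes; it does not decide whether omitting the tag is allowed. That part comes from the HTML 4.01 DTD.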
Grant Wagner
+1  A: 

According to HTML 4.01, there are three different groups when it comes to elements and tags.

  1. Elements that must have a closing tag. E.g. <h1></h1>
  2. Elements whose closing tag is optional. E.g. <li>
  3. Elements that must not have a closing tag. E.g. <br> - the W3C validator for HTML 4.01 warns on <br />
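
The three groups above can be sketched as a small lookup in Python. The element lists are my reading of the "End Tag" column in the HTML 4.01 index of elements, and the names `END_TAG_FORBIDDEN`, `END_TAG_OPTIONAL` and `end_tag_rule` are hypothetical. Anything not in either set, including names that aren't HTML at all, falls through to "required":

```python
# Elements whose end tag is forbidden in HTML 4.01 (group 3).
END_TAG_FORBIDDEN = frozenset({
    "area", "base", "basefont", "br", "col", "frame", "hr",
    "img", "input", "isindex", "link", "meta", "param",
})

# Elements whose end tag may be omitted in HTML 4.01 (group 2).
END_TAG_OPTIONAL = frozenset({
    "body", "colgroup", "dd", "dt", "head", "html", "li", "option",
    "p", "tbody", "td", "tfoot", "th", "thead", "tr",
})


def end_tag_rule(tag_name: str) -> str:
    """Classify an element name as 'forbidden', 'optional' or 'required'."""
    name = tag_name.lower()  # HTML element names are case-insensitive
    if name in END_TAG_FORBIDDEN:
        return "forbidden"
    if name in END_TAG_OPTIONAL:
        return "optional"
    return "required"  # assumption: everything else needs an explicit end tag


for name in ("h1", "li", "BR"):
    print(name, end_tag_rule(name))
```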

I looked high and low in the specification and could find no term used to describe tags in the third grouping.

The term "empty" only says that there currently isn't any content between the tags. This applies to 1 & 2 above.

My proposal: although the W3C doesn't say anything about it, as far as I could tell, it might be possible to refer to elements like <br> as "white space elements", since they are considered to be white space, and they are elements. "White space characters", such as &nbsp;, are not elements, so there should be no confusion. Anyone see any problems with this? If not, maybe we should make a proposal to the W3C.

Cyberherbalist
Kinda beat you to it... :-)
Cyberherbalist