Tags and elements

One of the more common problems when people discuss markup is that the terminology used by participants is somewhat confused.
The most common mistake is confusing elements and tags. The confusion is so common that the HTML 4.01 Specification contains a “note on elements and tags:”:http://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.1
bq. *Elements are not tags.* Some people refer to elements as tags (e.g., “the P tag”). Remember that the element is one thing, and the tag (be it start or end tag) is another. For instance, the HEAD element is always present, even though both start and end HEAD tags may be missing in the markup.


h3. Terminology deconstructed
The HTML 4.01 Specification is saying that people tend to refer to:
bc. <a href="http://www.example.com/">A fine example</a>
as an _a tag_. This is, of course, entirely wrong. Let’s re-examine:
h3. Tags
h4. The start tag
==

<a href="http://www.example.com/">A fine example</a>

==
The first part of the example above, from the opening bracket through the first closing bracket, constitutes the _start tag_, which makes the start tag:
bc. <a href="http://www.example.com/">
This means that any tag includes the brackets. The _name_ of the tag is simply a, without the brackets.
h4. SGML and start tags
Please note that in SGML, start tags may be optional, which means that the tag may be omitted from the literal markup. One such optional start tag is the HEAD start tag in HTML 4.01. This means that the following code is valid:
==

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<!-- There is a head start tag here, it
is just not present in the document -->
<title>Example</title>
</head>
<body>
<p>An example</p>
</body>
</html>

==
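For comparison, here is a sketch of the same document with the HEAD start tag written out. Assuming a conforming parser, both versions should describe exactly the same HEAD element, which is the whole point of keeping tags and elements apart:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<!-- The head start tag is now written out explicitly -->
<head>
<title>Example</title>
</head>
<body>
<p>An example</p>
</body>
</html>
```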
Further, the closing bracket of a tag may, in certain cases, also be omitted. In the following example, the closing bracket of the HTML start tag is omitted:
==

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html
<title>Example</title>
</head>
<body>
<p>An example</p>
</body>
</html>

==
My “experiment with EvilML”:http://virtuelvis.com/archives/2004/02/evilml uses this.
h4. The end tag
The end tag consists of an opening bracket, the character /, the tag name, and a closing bracket. In the example below, the end tag is the final four characters:
==

<a href="http://www.example.com/">A fine example</a>

==
h4. SGML and end tags
As with start tags, the end tag is optional in certain cases. This means that the following is valid SGML. Notice the missing end tag for the paragraph:
==

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Example</title>
</head>
<body>
<p>An example paragraph
</body>
</html>

==
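A related sketch, assuming a second paragraph is added to the body (the second paragraph is invented here for illustration): in HTML 4.01 a P element cannot contain another P, so the start tag of the second paragraph also implies the end of the first. Both fragments below should therefore describe the same two P elements:

```html
<!-- End tags omitted -->
<p>An example paragraph
<p>Another paragraph

<!-- End tags written out: the same two P elements -->
<p>An example paragraph</p>
<p>Another paragraph</p>
```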
To further complicate matters, the closing bracket of an end tag may also be omitted in certain cases. “David Andersson, a.k.a. Liorean”:http://liorean.web-graphics.com/ has created a “minimal, valid HTML 4.01 document”:http://liorean.web-graphics.com/minimal.html that shows this in practice:
==

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<title/</><p/

==
h3. Attributes
An attribute always appears inside the start tag, after the tag name and before the terminating bracket, and may be used to associate name-value pairs with an element:
==

<a href="http://example.com/">A fine example</a>

==
This means that the attribute named href, with the value http://example.com/, is associated with the a element.
SGML allows for minimized attributes, where no explicit attribute value is visible in the document. In XML, the attribute has to be written out in full, with both its name and a quoted value.
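To make the two forms concrete, here is a small sketch. It assumes the boolean selected attribute on an OPTION element, chosen only for illustration; any other boolean attribute in HTML 4.01 would behave the same way:

```html
<!-- HTML 4.01 (SGML): minimized form, no visible attribute value -->
<option selected>Example</option>

<!-- XML (for instance XHTML): name and quoted value are both required -->
<option selected="selected">Example</option>
```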
h3. Elements
In plain terms, an element consists of a start tag, the element content, and an end tag:
bc. <p>This paragraph and
the start and end tags make
a whole element</p>

h4. Empty elements
When an element has no content, it is said to be _empty_. In SGML, you normally omit the end tag for such an element.
In XML, an empty element normally uses an “empty-element tag”:http://www.w3.org/TR/REC-xml/#dt-eetag which is a tag that takes a special form.
There is the opening bracket for the start tag, followed by the tag name, and the tag is terminated with a slash and then the closing bracket. While the whitespace character before /> is optional, its use is encouraged in the “HTML Compatibility guideline on empty elements.”:http://www.w3.org/TR/xhtml1/guidelines.html#C_2 However, be aware that in an HTML, XHTML, SGML, and XML context, “empty elements are insanely complicated”:http://www.cs.tut.fi/~jkorpela/html/empty.html.
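As an illustration of the SGML and XML forms described above, here is a sketch that assumes the BR element, picked only because HTML 4.01 declares it EMPTY. The surrounding paragraph text is made up for the example:

```html
<!-- HTML 4.01 (SGML): BR has no content and no end tag -->
<p>First line<br>second line</p>

<!-- XHTML (XML): the same element written as an empty-element tag,
     with the optional whitespace before the slash -->
<p>First line<br />second line</p>
```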
Please note that while writing an empty element in expanded form, with a start tag immediately followed by an end tag, is perfectly legal in XML, it is discouraged. Use the empty-element tag instead.
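A short sketch of the two spellings, again assuming the BR element purely for illustration:

```html
<!-- Expanded form: legal XML, but discouraged -->
<br></br>

<!-- Empty-element tag: the preferred form -->
<br />
```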

h3. Conclusion
In brief: _Tags are not elements._ An element may actually be present in a document, even if there are no visible start or end tags for the element in the document.
Where I have mentioned SGML specifically, and failed to mention XML, it is because XML is simple: Don’t omit angle brackets, don’t omit start tags and don’t omit end tags (optionally: don’t omit empty-element tags).
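As a sketch of what those rules look like in practice, here is the paragraph example from earlier rewritten as XML. The choice of XHTML 1.0 Strict and the added line break are assumptions made for illustration, nothing more:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- Every start tag and end tag is present, every bracket is closed -->
<head>
<title>Example</title>
</head>
<body>
<p>An example paragraph<br />
with a line break</p>
</body>
</html>
```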
*Disclaimer:* This is meant to be a friendly guide, not spec text. Don’t treat it as such.

2 Comments

  1. I’d just like to note that the ALT attribute on the IMG element is *not* a tag. Hence, «the alt tag» is utterly meaningless, since there is no ALT tag in any HTML specification. There isn’t an ALT element either, but there is an ALT attribute that belongs to the IMG element.

  2. Tags and elements

    Confused about HTML tags and elements? Surely you will be after reading this comprehensive explanation……