Extensible HyperText Markup Language

Serving XHTML As HTML

Writing backward-compatible XHTML

HTML is a good markup language that has served the Web well. At a certain stage, however, innovations planned for the Web required more of HTML than it could give. Thus, in 2000, HTML was reformulated under the stricter rules of XML. The result, XHTML 1.0, retains almost all of HTML 4.01: from a markup perspective, all the elements and attributes remain the same. This continuity is key, because it means that XHTML, when written correctly, is backward-compatible and can be served as HTML to Web browsers.

Below are rules for authoring XHTML that will be compatible with current and future Web browsers.

Element And Attribute Names Must Be Written In Lowercase

XHTML is an XML language, so it follows the XML rule in being case-sensitive. Element and attribute names must be written in lowercase, as seen in the following example of valid XHTML:

  <p><img src="smith.jpg" alt="Headshot of James Smith" /></p>

This is in contrast to the same content, written in valid HTML:

  <P><IMG Src="smith.jpg" Alt="Headshot of James Smith"></p>
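The lowercase rule can be checked mechanically. The following is a minimal sketch, assuming the markup is a well-formed fragment without namespaces; the function name non_lowercase_names is hypothetical and not part of any library. It parses the fragment with Python's standard XML parser and collects every element or attribute name that is not lowercase.

```python
import xml.etree.ElementTree as ET

def non_lowercase_names(markup):
    """Return the set of element and attribute names that are not lowercase."""
    root = ET.fromstring(markup)  # raises ET.ParseError if not well-formed
    found = set()
    for element in root.iter():
        names = [element.tag] + list(element.attrib)
        found.update(name for name in names if name != name.lower())
    return found

good = '<p><img src="smith.jpg" alt="Headshot of James Smith" /></p>'
bad = '<P><IMG Src="smith.jpg" Alt="Headshot of James Smith" /></P>'
```

Because XML is case-sensitive, a fragment that opens with <P> and closes with </p> is rejected by the parser before any name check can run.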

Predefined Attribute Values Must Be Written In Lowercase

Some attributes have predefined values. For example, the input element has a type attribute that expects a predefined value such as text, password, checkbox, radio, submit, image, reset, button, hidden or file. HTML treats these values as case-insensitive, but XHTML defines them in lowercase, so they must be written in lowercase, as in this example:

  <p><input type="text" name="city" /></p>

Closing Tags Are Required For Non-empty Elements

Non-empty elements are elements that can contain text or other elements. HTML allowed the closing tag of some non-empty elements to be omitted. For example:

  <p>This is paragraph one.
  <p>This is paragraph two.

In XHTML, all non-empty elements must have closing tags. The previous example, rewritten as XHTML, would look like this:

  <p>This is paragraph one.</p>
  <p>This is paragraph two.</p>

Correct Nesting Of Elements

Elements that contain other elements must be nested correctly. In the following example, the parent element (p) is incorrectly closed before the child element (em) is closed:

  <p>The quick brown fox jumps over the <em>lazy dog.</p></em>

This is the correct way to write the same content:

  <p>The quick brown fox jumps over the <em>lazy dog</em>.</p>
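Missing closing tags and incorrect nesting are both well-formedness errors, so any XML parser will report them. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name first_error is an assumption made for illustration. It returns the position of the first error, or None when the fragment is well-formed.

```python
import xml.etree.ElementTree as ET

def first_error(markup):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return err.position
    return None

bad_nesting = "<p>The quick brown fox jumps over the <em>lazy dog.</p></em>"
good_nesting = "<p>The quick brown fox jumps over the <em>lazy dog</em>.</p>"
unclosed = "<div><p>This is paragraph one.<p>This is paragraph two.</div>"
```

A check like this only covers well-formedness. It does not validate the document against the XHTML DTD.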

Attribute Values Must Be In Quotes

In HTML, it was permitted to omit the quotes around attribute values. For example:

  <table width=100%>

In XHTML, quotes around attribute values are required. For example:

  <table width="100%">

Markup Characters Used In Text Or Attribute Values Must Be Escaped

Markup characters are characters that are used to delimit elements, attributes and special character references. The four markup characters are: <, >, & and ". When these characters are used in text or attribute values, they must be escaped as follows:

  • < becomes &lt;
  • > becomes &gt;
  • & becomes &amp;
  • " becomes &quot;

For example, the following URL contains a markup character &:

  http://xhtml.com/en/?css=no&layout=yes

When writing this URL in XHTML, the & must be escaped:

  <p><a href="http://xhtml.com/en/?css=no&amp;layout=yes">Turn off CSS</a></p>
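When pages are generated by a program, the escaping can be applied automatically. This is a minimal sketch using the escape function from Python's standard xml.sax.saxutils module; the wrapper name escape_markup is hypothetical. It assumes the input is raw, unescaped text, since text that is already escaped would be escaped a second time.

```python
from xml.sax.saxutils import escape

def escape_markup(text):
    """Escape the four markup characters <, >, & and "."""
    # escape() handles &, < and >; the extra mapping adds the double quote.
    return escape(text, {'"': "&quot;"})

url = "http://xhtml.com/en/?css=no&layout=yes"
link = '<p><a href="%s">Turn off CSS</a></p>' % escape_markup(url)
```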

Use Numeric Character References Instead Of Entities

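Numeric character references (NCRs) are a more portable way to escape characters than named entities. Apart from the entities predefined by XML, such as the four listed above, named entities are defined in the XHTML DTD, and an XML parser that does not read the DTD may fail to recognize them. An NCR needs no definition. For example, the entity &euro;, representing the € symbol, is equivalent to the NCR &#8364;.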
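Existing content can be converted with a short script. The sketch below relies on the name2codepoint table in Python's standard html.entities module; the function name entities_to_ncr and the choice to leave &lt;, &gt;, &amp; and &quot; untouched are assumptions made for this illustration.

```python
import re
from html.entities import name2codepoint

# The escapes for the four markup characters are kept as named entities.
KEEP = {"lt", "gt", "amp", "quot"}

def entities_to_ncr(text):
    """Replace named entities such as &euro; with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or kept entities unchanged
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)
```

For instance, entities_to_ncr("&euro;10") gives "&#8364;10".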

Add Space Before Closing Empty Elements

Empty elements cannot contain text or other elements. They include area, base, br, col, hr, img, input, link, meta and param. When using empty elements, always include a space before the trailing / and >, which helps older HTML browsers handle the element correctly. For example, instead of writing <br/>, write <br />.

Minimize Syntax For Empty Elements

Use the minimized syntax for empty elements. For example, instead of writing <br></br>, write an empty element as a single self-closing tag such as <br />.
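Both rules for empty elements can be applied to existing markup with a regular expression. This is a rough sketch rather than a full parser: the function name normalize_empty is hypothetical, and it assumes lowercase element names and attribute values that contain no unescaped < or > characters.

```python
import re

# The empty elements listed in the text above.
EMPTY = "area|base|br|col|hr|img|input|link|meta|param"

# Matches <br>, <br/>, <br /> and <br></br>, with or without attributes.
_EMPTY_TAG = re.compile(r"<(%s)\b([^<>]*?)\s*/?>(?:</\1>)?" % EMPTY)

def normalize_empty(markup):
    """Rewrite every empty element in the form <name attributes />."""
    return _EMPTY_TAG.sub(r"<\1\2 />", markup)
```

Tags that are already written as <br /> pass through unchanged, so the function can safely be run more than once.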

Write Boolean Attributes In Full

Some attributes represent boolean values. The presence of such an attribute implies a value of "true", and its absence implies a value of "false". For example, in HTML, an option element carrying the selected attribute is the selected option:

  <select name="country">
  <option value="1" selected>Italy</option>
  <option value="2">France</option>
  </select>

In HTML, boolean attributes could be written in a minimized form, with just the value of the attribute, or in full form, with both the name and the value. In XHTML, the full form must be used, and the value of the attribute is the same as its name. For example:

  <select name="country">
  <option value="1" selected="selected">Italy</option>
  <option value="2">France</option>
  </select>
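Minimized attributes in existing HTML can be expanded by re-serializing the markup. The sketch below builds on Python's standard html.parser module, which reports a minimized attribute with a value of None. The class and function names are hypothetical, and the sketch handles only elements, attributes and text: comments, doctypes and the contents of script or style elements are not preserved faithfully.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

class BooleanExpander(HTMLParser):
    """Re-serialize markup, writing minimized attributes in full."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _emit_start(self, tag, attrs, close):
        out = [tag]
        for name, value in attrs:
            if value is None:  # minimized form, e.g. selected
                value = name
            out.append('%s="%s"' % (name, escape(value, {'"': "&quot;"})))
        self.parts.append("<" + " ".join(out) + close)

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(escape(data))

def expand_booleans(markup):
    parser = BooleanExpander()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return "".join(parser.parts)
```

As a side effect, the parser lowercases element and attribute names and the serializer quotes every attribute value.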

Embedded JavaScript And CSS

If script or style elements contain the markup characters <, >, & or ", the contents of these elements need to be placed inside a CDATA section. In addition, for compatibility reasons, comment out the opening and closing CDATA markers so that browsers parsing the page as HTML ignore them: use // comments in scripts and /* */ comments in style sheets. For example:

  <script type="text/javascript">
  //<![CDATA[
  alert('Tom & I want to say "10 > 9"');
  //]]>
  </script>
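A page generator can add the CDATA wrapper automatically. This is a minimal sketch with a hypothetical helper name, wrap_script. It rejects script text containing ]]>, because that sequence would end the CDATA section early.

```python
import xml.etree.ElementTree as ET

def wrap_script(js):
    """Wrap JavaScript source in a script element with a commented CDATA section."""
    if "]]>" in js:
        raise ValueError("script text must not contain ]]>")
    return ('<script type="text/javascript">\n'
            '//<![CDATA[\n'
            '%s\n'
            '//]]>\n'
            '</script>') % (js,)

js = 'alert(\'Tom & I want to say "10 > 9"\');'
element = ET.fromstring(wrap_script(js))  # parses cleanly as XML
```

Parsing the result with an XML parser, as in the last line, confirms that the markup characters inside the script no longer break well-formedness.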

The alternative is to use external script and CSS documents.

White Space Characters In Attributes

When XHTML is processed as XML, white space in attribute values is normalized: leading and trailing white space is removed, and line breaks are collapsed into single spaces. HTML browsers do not apply the same rules consistently. Therefore, to get consistent behavior regardless of how XHTML is processed, avoid leading or trailing white space and line breaks in attribute values.
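One way to avoid the problem is to normalize attribute values before they are written into the page. The helper below is a sketch under the assumption that every run of XML white space (space, tab, carriage return, line feed) should become a single space; the name normalize_attr is hypothetical.

```python
import re

def normalize_attr(value):
    """Collapse runs of XML white space and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")
```

For instance, normalize_attr("  Headshot of\n  James Smith ") gives "Headshot of James Smith".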

Serving backward-compatible XHTML

Once XHTML is written correctly, it must be served to Web browsers as HTML in order to be backward-compatible. To instruct browsers to treat XHTML as HTML (versus XML), use media types.

If the media type for an XHTML Web page is given as text/html, the Web browser will parse the Web page as though it were HTML. If the media type is given as application/xhtml+xml, the browser will parse the page as XML.

Since most Web servers and server-side scripting environments (PHP, ASP, etc.) use the text/html media type for Web content by default, you do not need to do anything special in order to serve XHTML as HTML. In other words, by default, your XHTML Web pages will be served as HTML.
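Where the media type has to be set explicitly, it is sent in the Content-Type HTTP header. The following sketch is a minimal Python WSGI application that serves an XHTML page as HTML. The page content, the XHTML 1.0 Strict doctype and the name app are assumptions made for illustration and do not come from the text above.

```python
PAGE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">\n'
    '<head><title>Example</title></head>\n'
    '<body><p>Served as HTML.<br /></p></body>\n'
    '</html>\n'
)

def app(environ, start_response):
    """WSGI application that sends the XHTML page with the text/html media type."""
    body = PAGE.encode("utf-8")
    headers = [
        # text/html makes browsers parse the page as HTML rather than XML.
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the application can be passed to make_server from the standard wsgiref.simple_server module. Changing the header value to application/xhtml+xml would make browsers parse the same page as XML.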