Special Characters

There are several classes of characters which cannot be used "as is" in an HTML document. This page deals briefly with which characters these are, and how they can be used safely.

Reserved Characters

The less-than sign (<) and the greater-than sign (>) are used by HTML tags and should not be used directly in the content of an HTML document. That is, a less-than sign indicates the beginning of a tag, and a greater-than sign indicates the end of a tag. By using such characters "as is", you tell the browser to expect a tag where none exists. The result is often a page that does not display "properly".

For that reason we have escape sequences for these and other characters. All escape sequences begin with the & character and end with a semicolon. In fact, any character can be written as a numerical value instead of the character itself; for example, &#60; stands for the less-than sign. For our purposes, there is no need to use the numerical representation for the ordinary characters used in English. For special characters such as < and > we have the easy-to-remember escape sequences &lt; (lt = less-than) and &gt; (gt = greater-than).
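As a small illustration of the sequences just described, the fragment below writes the same comparison twice, once with the named sequences and once with their numerical equivalents. The sentence itself is made up for the example; only the escape sequences matter.

```html
<!-- Named escape sequences: the browser shows "If x < y and y > z" -->
<p>If x &lt; y and y &gt; z, then y is the largest of the three.</p>

<!-- Numerical form of the same characters: 60 is "<", 62 is ">" -->
<p>If x &#60; y and y &#62; z, then y is the largest of the three.</p>
```

Both paragraphs should display identically; the named form is simply easier to read in the source.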

Because it is used in all escape sequences, the & character must be escaped as well; it is represented by the sequence &amp;.
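A short sketch of the ampersand in use follows. The company name and sentence are invented for illustration. The second paragraph shows the case where we want the reader to see an escape sequence itself rather than the character it stands for: the leading & is escaped, so the rest is displayed literally.

```html
<!-- Displays: Smith & Jones -->
<p>Smith &amp; Jones</p>

<!-- Displays: Write &lt; to get a less-than sign. -->
<p>Write &amp;lt; to get a less-than sign.</p>
```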

International and Special Characters

Many characters are not present on the standard English keyboard. These include many characters found in other European languages, such as German and French: specifically, the umlauted letters of German, the German ß, and letters with acute and grave accents. Other special characters are treated similarly and will not be dealt with here.

To produce these characters in an HTML document we use escape sequences. For these letters, the escape sequence usually consists of & + letter + marking + ;. For example, to place an umlaut over a capital "A", the escape sequence is &Auml;. For a grave accent over an "e", it is &egrave;. The case of the letter matters: &auml; produces the lowercase letter, &Auml; the capital. It is not possible to use this method to produce accents over characters which do not usually take them. For example, an umlaut cannot be placed over a "p" with the sequence &puml;. The German "ß" is not the same as a Greek "beta", despite visual similarities. It is considered to be an "s" and "z" written close together; that is, it is an s-z-ligature, and its escape sequence is &szlig;. Similarly, the sequence &AElig; produces the ligature Æ.
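The fragment below sketches these sequences in running text. The words are chosen only as examples; the comment beside each line shows what a browser should display.

```html
<p>&Auml;rger</p>     <!-- Ärger:  A with umlaut -->
<p>sch&ouml;n</p>     <!-- schön:  o with umlaut -->
<p>tr&egrave;s</p>    <!-- très:   e with grave accent -->
<p>caf&eacute;</p>    <!-- café:   e with acute accent -->
<p>Stra&szlig;e</p>   <!-- Straße: s-z-ligature -->
<p>&AElig;sop</p>     <!-- Æsop:   A-E ligature -->
```

A sequence that is not defined, such as &puml;, is not turned into anything; browsers will generally just show the text as typed.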





This site © copyright 1999, Steve Krause, all rights reserved.