Character Entities

End Tag: NA
Support Key: 2 | 3 | 3.2 | IE1 | M1 | N1
= Index DOT Html by Brian Wilson [bloo@blooberry.com] =

What is it?

A text character is usually stored as an octet: a single byte, or 8 bits of data. Eight bits allow 256 distinct character codes (0-255). While HTTP can transport the full 256-character range of ISO 8859-1, not all operating systems or applications natively support this range. To make this character set more portable, HTML offers an alternative representation of every ISO 8859-1 character using coded Character Entities. These case-sensitive representations are written using only characters from ASCII, a proper subset of the ISO 8859-1 (ISO Latin-1) character set. Character Entities therefore give a portable way to display these characters in any browser. A complete list of Character Entities can be found using the table below.

The ISO-8859-1 Character Set
	000-031 | 032-064 | 065-096 | 097-126
	127-159 | 160-191 | 192-223 | 224-255
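
The portability idea described above can be sketched in Python: text containing ISO 8859-1 characters is rewritten so that only ASCII remains, with each non-ASCII character replaced by its numbered entity. The helper name to_ascii_entities and the sample string are illustrative assumptions, not part of the original reference.

```python
def to_ascii_entities(text: str) -> str:
    """Replace every non-ASCII character with a numbered entity (&#N;)."""
    # "xmlcharrefreplace" is a standard codec error handler that emits a
    # decimal character reference for each character ASCII cannot encode.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


sample = "À la carte: café"          # hypothetical sample text
encoded = to_ascii_entities(sample)
print(encoded)                        # &#192; la carte: caf&#233;
```

The result contains only ASCII characters, so it should survive systems that cannot handle the upper half of the 0-255 range.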

Included in the Character Entity domain are both numbered and named entities:

Numbered Entity Syntax: &#charnumber; where charnumber is a distinct integer from 0-255.
Named Entity Syntax: &charname; where charname is a unique mnemonic shorthand for the character to be represented.
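
As a rough illustration of the two syntaxes, the sketch below uses Python's standard html module to decode a numbered and a named entity for the same character, and looks up two names that differ only in case. Python's decoder is an assumption for demonstration only; it is not the mechanism any particular browser uses.

```python
import html
from html.entities import name2codepoint

# The numbered and named forms decode to the same character.
numbered = html.unescape("&#192;")
named = html.unescape("&Agrave;")
print(numbered == named)        # True

# Entity names are case-sensitive: Agrave and agrave are different characters.
upper_code = name2codepoint["Agrave"]
lower_code = name2codepoint["agrave"]
print(upper_code, lower_code)   # 192 224
```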

Note: The trailing semicolon (';') is strictly necessary only when the character following the entity would otherwise be recognized as part of the entity. Even so, it is wise to always include this terminating character.
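
The ambiguity the note warns about can be shown with a small sketch. It assumes Python's html.unescape as the decoder, which follows HTML5 parsing rules rather than those of the browsers discussed here. When a digit follows a numbered entity that lacks its semicolon, the digit is read as part of the character number.

```python
import html

# With the semicolon, the entity ends where the author intended.
with_semicolon = html.unescape("&#192;3")
print(with_semicolon == "À3")                # True

# Without it, the following digit is absorbed into the character number,
# so the text is read as character 1923 instead of "À" followed by "3".
without_semicolon = html.unescape("&#1923")
print(without_semicolon == chr(1923))        # True
```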

Attributes: Character Entities accept no attributes

Example: &Agrave; = À

Parent Model: Pending
Content Model: Character Entities accept no content.

Tips & Tricks

Character entities can be used anywhere regular characters will be displayed on screen.
In cases like IMG or INPUT, they can be used only for final display purposes (ALT text for Images or VALUE for Input tags.)
Entities are not to be used in path names for URLs.
DTD Note: The &quot; named character entity was retracted from the HTML 3.2 DTD. There is still some confusion as to why this was done, as this entity is in wide use and exists in the HTML 2.0, 3.0 and Cougar DTDs. There are two differing stories as to why it was deleted from the 3.2 DTD:
1. Dan Connolly (co-author of HTML 2.0) has said the omission was a mistake.
2. Dave Raggett (author of HTML 3.0, 3.2 and Cougar) has said that the omission was intentional due to a disagreement in the HTML ERB over which entities should be in HTML 3.2. Only the basic set of entities was agreed upon. (Many thanks to a reader who sent me some mail clarifying this.)
Any documents using &quot; will generate validation errors under the HTML 3.2 DTD, but it should be quite safe to leave these entities in legacy documents due to wide future and legacy DTD/browser support. The alternate numbered form of this entity WILL validate and should be considered when authoring new documents: &#34;
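
For authors who want existing documents to validate against the HTML 3.2 DTD, one possible approach is a plain text substitution of the named quotation-mark entity with its numbered form. The helper name use_numeric_quot and the sample markup are hypothetical, and the sketch assumes the entity is always written in lowercase with its trailing semicolon.

```python
import html


def use_numeric_quot(markup: str) -> str:
    """Swap the named quotation-mark entity for its numbered form."""
    # Assumption: the entity appears exactly as "&quot;" (entity names
    # are case-sensitive, so other spellings are left untouched).
    return markup.replace("&quot;", "&#34;")


legacy = "He said &quot;hello&quot;."     # hypothetical legacy markup
updated = use_numeric_quot(legacy)
print(updated)                            # He said &#34;hello&#34;.

# Both forms decode to the same displayed text.
print(html.unescape(legacy) == html.unescape(updated))   # True
```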

Browser Peculiarities

Nothing to report.