- What is it?
- A text character is usually stored as an octet: a single byte, or 8 bits,
of data. Eight bits allow 256 distinct character codes (0-255). While
HTTP allows the full 256-character range of ISO 8859-1 to be transported,
not all operating systems or applications natively support this range. To
make this character set more portable and viewable, HTML offers alternative
representations of all the ISO 8859-1 characters using coded Character
Entities. These case-sensitive representations are written using only
ASCII characters, a proper subset of the ISO 8859-1 (ISO Latin-1)
character set. Character Entities are therefore a portable way to have
these characters displayed on any browser. A complete list of Character
Entities can be found using the table below.
Included in the Character Entity domain are both numbered and named
entities:
- Numbered Entity Syntax:
&#charnumber;
- Where charnumber is a distinct integer
from 0-255.
- Named Entity Syntax:
&charname;
- Where charname is a unique mnemonic shorthand of
the character to be represented.
Note: Even though the trailing semicolon
character (';') is only necessary when the character following the entity
would otherwise be recognized as part of the entity, it is wise to
always use this terminating character.
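The two syntaxes above can be sketched in Python. This is only an illustration, assuming the standard-library `html.entities.codepoint2name` table as the source of mnemonic names (that table follows later HTML versions, so it holds more names than the ISO 8859-1 set described here). The helper names `numbered_entity`, `named_entity`, and `to_ascii` are hypothetical and not part of any specification.

```python
from html.entities import codepoint2name
from typing import Optional


def numbered_entity(ch: str) -> str:
    """Return the numbered form, e.g. '&#192;', for one ISO 8859-1 character."""
    code = ord(ch)
    if code > 255:
        raise ValueError("character is outside the 0-255 range covered here")
    return f"&#{code};"


def named_entity(ch: str) -> Optional[str]:
    """Return the named form, e.g. '&Agrave;', or None if no mnemonic exists."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else None


def to_ascii(text: str) -> str:
    """Replace each non-ASCII character with an entity (assumed policy:
    prefer the named form, fall back to the numbered form).

    ASCII characters, including '&', are passed through untouched, and
    every entity is written with its trailing semicolon.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(named_entity(ch) or numbered_entity(ch))
    return "".join(out)
```

For instance, `to_ascii("\xc0 la carte")` yields `&Agrave; la carte`, which contains only ASCII characters and can be carried by systems that do not support the upper half of ISO 8859-1.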
- Attributes
- Character Entities accept no
attributes
- Example
- &Agrave; = À
- Parent Model
- Pending
- Content Model
- Character Entities accept no content.
Tips & Tricks
- Character entities can be used anywhere regular characters will be
displayed on screen.
- In elements like IMG or INPUT, they can be used only for final display
purposes (ALT text for images or VALUE for input tags).
- Entities are not to be used in path names for URLs.
- DTD Note: The &quot; named character
entity was omitted from the HTML 3.2 DTD. There is still some confusion
as to WHY this was done, as this entity is in wide use and exists in the
HTML 2.0, 3.0 and Cougar DTDs. There are two differing stories as to why
it was deleted from the 3.2 DTD:
- Dan Connolly (co-author of HTML 2.0) has said the omission was a mistake.
- Dave Raggett (author of HTML 3.0, 3.2 and Cougar) has said that the
omission was intentional due to a disagreement in the HTML ERB over
which entities should be in HTML 3.2. Only the basic set of entities was
agreed upon. (Many thanks to a reader who sent me some mail clarifying this.)
Any documents using &quot; will generate validation errors under the HTML
3.2 DTD, but it should be quite safe to leave these entities in legacy documents,
since both earlier and later DTDs and browsers widely support them. The numbered
form of this entity WILL validate and should be considered when authoring new
documents: &#34;
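Switching a document from the named quote entity to its numbered form is a plain text substitution. A minimal sketch in Python, where the function name `use_numbered_quote` is hypothetical and the standard-library `html.unescape` is used only to check that both forms decode to the same character:

```python
import html


def use_numbered_quote(markup: str) -> str:
    """Swap the named quote entity for its numbered equivalent.

    Entity names are case-sensitive, so only the exact lowercase
    spelling is replaced.
    """
    return markup.replace("&quot;", "&#34;")


# Both spellings stand for the same double-quote character.
assert html.unescape("&quot;") == html.unescape("&#34;") == '"'
```

Because the replacement matches the full `&quot;` string including its semicolon, unterminated uses of the entity are left as they are and would need separate handling.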
Browser Peculiarities