Entity References in XML
Certain characters have a special meaning and function within XML. For example, the less-than sign < and the greater-than sign > mark the start and end of a tag. XML processors, the programs that read and interpret XML documents, treat these characters as markup rather than as data.
Sometimes we want to use these special characters as part of the data that is displayed in the final document. For example, we may want to use the less-than sign < in a sentence such as:
<paragraph1> Some institutions offer only career and technical programs of < 2 years’ duration. </paragraph1>
In this case, the XML processor reading the file would report an error: it would read the less-than sign < in the character data as markup that opens a tag, but the text that follows is not a valid tag name and the closing greater-than sign > is missing. The apostrophe in years' does not cause an error in character data, but apostrophes can delimit attribute values, so an apostrophe inside an attribute value that is itself enclosed in apostrophes must be written differently.
To overcome this problem, these special characters are written differently when they are to be included as character data for output in the final document. These specially defined replacements are called entity references.
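The error can be reproduced with a short sketch. It assumes Python's standard-library parser `xml.etree.ElementTree` stands in for "the XML processor"; the helper name `parses` is hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# The sentence from the example, with a raw less-than sign in the character data.
raw = ("<paragraph1> Some institutions offer only career and technical "
       "programs of < 2 years' duration. </paragraph1>")

# The same sentence with the less-than sign written as an entity reference.
fixed = raw.replace("of < 2", "of &lt; 2")


def parses(document: str) -> bool:
    """Return True if the string is well-formed XML, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(parses(raw))    # False: the raw "<" is read as the start of a tag
print(parses(fixed))  # True: "&lt;" is read as character data
```

Other XML processors report the problem with different messages, but any conforming parser should reject the first string.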
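In XML there are 5 predefined entity references, as follows:
|Character name||Character text||Entity reference|
|Less than||<||&lt;|
|Greater than||>||&gt;|
|Ampersand||&||&amp;|
|Apostrophe||'||&apos;|
|Quotation mark||"||&quot;|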
The ampersand starts all general entity references and the semicolon ends them. Anywhere in the document that an entity reference appears, it is replaced by the character text.
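When XML is generated by a program, the substitution is usually left to a library rather than done by hand. The sketch below assumes Python and uses the standard-library function `xml.sax.saxutils.escape`, which replaces &, < and > by default; the extra mapping for the apostrophe and the quotation mark is passed in explicitly. The variable names are illustrative only.

```python
from xml.sax.saxutils import escape

text = "programs of < 2 years' duration"

# escape() handles &, < and > on its own; the dictionary adds the
# remaining two predefined entity references.
extra = {"'": "&apos;", '"': "&quot;"}
escaped = escape(text, extra)

print(escaped)  # programs of &lt; 2 years&apos; duration
```

The ampersand is replaced first, so the ampersands that begin the newly written entity references are not escaped a second time.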
In our example, the character data would be rewritten as follows:
<paragraph1> Some institutions offer only career and technical programs of &lt; 2 years&apos; duration.</paragraph1>
When the document is processed, the &lt; is replaced with < and the &apos; is replaced with ', so the text appears in the final document as:
Some institutions offer only career and technical programs of < 2 years' duration.
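The replacement step can be checked with a small sketch, again assuming Python's `xml.etree.ElementTree` as the XML processor. The element name comes from the example; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The example element with both entity references written out.
escaped_doc = ("<paragraph1> Some institutions offer only career and technical "
               "programs of &lt; 2 years&apos; duration.</paragraph1>")

element = ET.fromstring(escaped_doc)

# The parser hands back the character text, not the entity references.
print(element.text.strip())
```

The text returned by the parser contains the literal < and ' characters, which is what an application would then display.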