<html>
<head>
<title>Processing Instructions</title>
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div id="Description">
<table cellpadding="0" cellspacing="0" border="0" width="100%" class="main">
<tr>
<td valign="top" class="NAME">Processing Instructions</td>
<td valign="top" class="COMPATIBILITY">&nbsp;</td>
</tr>
<tr>
<td colspan="2" class="divider"><img src="dwres:18084" width="100%" height="1"></td>
</tr>
<tr>
<td valign="top" colspan="2" class="description">
<span class="PROGRAMLISTING"><pre><strong class="userinput">&lt;?</strong><var class="replaceable">target</var> <var class="replaceable">processing-instruction data</var><strong class="userinput">?&gt;</strong></pre></span>
<p>Processing instructions provide an escape
mechanism that allows an XML application to include unvalidated
instructions for an XML processor. The processing instruction
target can be any legal XML name except <span class="LITERAL">xml</span> in
any combination of upper- and lowercase (see <link linkend="xmlnut3-CHP-2">Chapter 2</link>). A common use of this mechanism is linking
a document to a stylesheet that supplies its formatting
instructions. According to the principles of XML, formatting
instructions should remain separate from the actual content of a
document, but some mechanism must associate the two. Processing
instructions are significant only to applications that recognize
them.</p>
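
<p>As a minimal sketch of the stylesheet case, a document might begin
as shown here. The <span class="LITERAL">xml-stylesheet</span> target is the
one the W3C defines for associating stylesheets with XML documents. The
file name <span class="LITERAL">article.css</span> and the element names
are assumptions made purely for illustration.</p>
<span class="PROGRAMLISTING"><pre>&lt;?xml version="1.0"?&gt;
&lt;!-- Hypothetical stylesheet name. An application that does not
     recognize the xml-stylesheet target can ignore this instruction. --&gt;
&lt;?xml-stylesheet type="text/css" href="article.css"?&gt;
&lt;article&gt;
  &lt;title&gt;Processing Instructions&lt;/title&gt;
&lt;/article&gt;</pre></span>
<p>The association lives in the prolog, outside the root element, so
the document's content carries no formatting information.</p>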

<p>The notation facility can indicate exactly what type of processing
instruction is included, and each individual XML application must
decide what to do with the additional data. No action is required by
an XML parser when it recognizes that a particular processing
instruction matches a declared notation. When this facility is used,
applications that do not recognize the public or system identifiers
of a given processing instruction target should realize that they
cannot properly interpret its data portion.</p>
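
<p>The pairing of a notation with a processing instruction target
could look like the following sketch. The notation name
<span class="LITERAL">formatter</span>, its system identifier, and the
instruction data are all hypothetical and stand in for whatever a
particular application defines.</p>
<span class="PROGRAMLISTING"><pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE doc [
  &lt;!-- Hypothetical notation; the system identifier tells an
       application what kind of instruction this target carries --&gt;
  &lt;!NOTATION formatter SYSTEM "http://example.com/formatter"&gt;
]&gt;
&lt;doc&gt;
  &lt;!-- The target "formatter" matches the declared notation name --&gt;
  &lt;?formatter page-break="before"?&gt;
  &lt;para&gt;Text that the hypothetical formatter starts on a new page.&lt;/para&gt;
&lt;/doc&gt;</pre></span>
<p>An application that does not recognize
<span class="LITERAL">http://example.com/formatter</span> can tell from the
declaration that the instruction is not meant for it.</p>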

<sidebar id="xmlnut3-CHP-21-SIDEBAR-1">
<p class="TITLE">Character Encoding Autodetection</p>

<p>The XML declaration (possibly preceded by a Unicode
byte-order mark) must be the very first item in a document so that
the XML parser can determine which character encoding was used to
store the document. A chicken-and-egg problem exists, involving the
XML declaration's <span class="LITERAL">encoding="..."</span>
clause: the parser can't parse the clause if it
doesn't know what character encoding the document
uses. However, since the first five characters of the document must
be the string <span class="LITERAL">&lt;?xml</span> (if it includes an XML
declaration), the parser can read the first few bytes of a document
and, in most cases, determine the character encoding before it has
read the <span class="LITERAL">encoding</span> declaration.</p>
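
<p>The sketch below illustrates the idea with a document stored in
ISO-8859-1. The byte patterns in the comment are a simplified subset of
those listed in the autodetection appendix of the XML 1.0 specification.
Four-byte encodings such as UCS-4 are left out, and the element name
<span class="LITERAL">doc</span> is an assumption for illustration.</p>
<span class="PROGRAMLISTING"><pre>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;!-- Leading bytes (hexadecimal) a parser might see, and what they suggest:
     EF BB BF      UTF-8 with a byte-order mark
     FE FF         UTF-16, big-endian, with a byte-order mark
     FF FE         UTF-16, little-endian, with a byte-order mark
     00 3C 00 3F   UTF-16, big-endian, no byte-order mark
     3C 00 3F 00   UTF-16, little-endian, no byte-order mark
     3C 3F 78 6D   an ASCII-compatible encoding ("&lt;?xm" read as ASCII)
     4C 6F A7 94   EBCDIC
--&gt;
&lt;doc/&gt;</pre></span>
<p>Here the leading bytes <span class="LITERAL">3C 3F 78 6D</span> tell the
parser only that the encoding is ASCII-compatible. It must still read the
<span class="LITERAL">encoding</span> declaration to learn that the
document is stored as ISO-8859-1.</p>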
</sidebar>
</td></tr>
</table>
</div>
</body>
</html>