8.1 XML syntax

The first line of an XML document should be a declaration that this is an XML document, including the version of XML being used.

<?xml version="1.0"?>

An XML document consists entirely of XML elements. An element usually consists of a start tag and an end tag, with plain text content or other XML elements in between.

A start tag is of the form <elementName> and an end tag has the form </elementName>.

The start tag may include attributes of the form attrName="attrValue". The attribute value must be enclosed within quotes; double-quotes are the usual choice, although single-quotes are also allowed.

The names of XML elements and XML attributes are case-sensitive.
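
As a minimal sketch of what case sensitivity means in practice, consider the two fragments below (the element name station and its content are hypothetical, chosen only for illustration). The first is a valid element. The second is an error, because the name in the end tag differs in case from the name in the start tag.

```xml
<station>Auckland</station>

<Station>Auckland</station>
```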

It is also possible to have an empty element, which consists of a single tag, optionally with attributes. In this case, the tag has the form <elementName />.

The following code shows examples of XML elements. The second example is an empty element with two attributes.

<filename>ISCCPMonthly_avg.nc</filename>

<case date="16-JAN-1994" temperature="278.9" />

XML elements may have other XML elements as their content. An XML document must have a single root element, which contains all other XML elements in the document.
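
The following sketch shows how these pieces might fit together in a complete document. The root element name temperatures is an assumption made for illustration; the filename and case elements are the examples from this section. Everything after the declaration is contained within the single root element.

```xml
<?xml version="1.0"?>
<temperatures>
    <filename>ISCCPMonthly_avg.nc</filename>
    <case date="16-JAN-1994" temperature="278.9" />
</temperatures>
```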

A comment in XML is anything between the delimiters <!-- and -->.
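
For example, a comment might be used to describe the element that follows it. This is only a sketch; the wording of the comment is invented for illustration.

```xml
<!-- The temperature recorded for a single date -->
<case date="16-JAN-1994" temperature="278.9" />
```

A comment may span several lines, but the text of a comment must not itself contain two consecutive hyphens.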

For the benefit of human readers, the contents of an XML element are usually indented. However, white space within plain text content is preserved, so indenting is not always possible when an element contains plain text, because the added white space would become part of the content.
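
The following fragments sketch why indenting can matter for plain text content. The first element contains exactly the file name. The second, laid out for readability, contains the file name surrounded by line breaks and spaces, and it depends on the software reading the XML whether that extra white space is ignored.

```xml
<filename>ISCCPMonthly_avg.nc</filename>

<filename>
    ISCCPMonthly_avg.nc
</filename>
```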

In XML code, certain characters, such as the less-than sign, the greater-than sign, and the ampersand, have special meanings. Escape sequences, such as &lt;, &gt;, and &amp;, must be used to obtain the corresponding literal character within plain text content. A special syntax is provided for escaping an entire section of plain text content, for the case where many such special characters are included: any text between the delimiters <![CDATA[ and ]]> is treated as literal.
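
As a rough illustration, the two fragments below are intended to carry the same plain text content, first using escape sequences and then using a CDATA section. The element name condition and its content are hypothetical. The ampersand is also a special character in XML, so it is written as &amp; in the first version.

```xml
<condition>temperature &gt; 270 &amp; temperature &lt; 300</condition>

<condition><![CDATA[temperature > 270 & temperature < 300]]></condition>
```

The CDATA form tends to be easier to read when the content contains many special characters, with the one restriction that the content cannot itself include the closing delimiter ]]>.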

Paul Murrell

This document is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.