An XML document that obeys the rules of the previous section
is described as well-formed.
It is also possible to
specify additional rules for the structure and content of an XML
document, via a schema for the document. If the document
is well-formed and also obeys the rules given in a schema, then
the document is described as valid.
The Document Type Definition language (DTD) is a language for describing
the schema for an XML document.
DTD code consists of element declarations and
attribute declarations.
An element declaration should be included for every different type
of element that will occur in an XML document.
Each declaration describes what content is allowed inside a
particular element. An element declaration is of the form:
<!ELEMENT elementName elementContents>
The elementContents can be one of the following:
- EMPTY
-
The element is empty.
- (#PCDATA)
-
The element may contain plain text.
- ANY
-
The element may contain anything (other elements, plain text,
or both).
- (childName)
-
The element must contain exactly one childName element.
- (childName*)
-
The element may contain zero or more childName elements.
- (childName+)
-
The element must contain one or more childName elements.
- (childName?)
-
The element may contain at most one childName element (i.e., zero or one).
- (childA,childB)
-
The element must contain exactly one childA
element followed by exactly one childB element, in that order.
- (childA|childB)
-
The element must contain either exactly one childA
element or exactly one childB element.
- (#PCDATA|childA|childB)*
-
The element may contain zero or more occurrences of plain text,
childA elements, and childB elements, in any order.
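As a sketch of how these forms fit together, the following declarations describe a small hypothetical document. The element names (report, title, author, summary, section, emph, link) are invented for illustration and do not come from any particular schema. The example also assumes, as the DTD language allows, that the sequence form and the occurrence indicators (?, *, +) can be combined within a single content model.

```xml
<!-- A report must contain, in this order: one title, one or more
     authors, an optional summary, and zero or more sections. -->
<!ELEMENT report (title, author+, summary?, section*)>

<!-- These elements contain plain text only. -->
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT summary (#PCDATA)>

<!-- Mixed content: plain text interleaved with emph and link
     elements, in any order and any number of times. -->
<!ELEMENT section (#PCDATA|emph|link)*>
<!ELEMENT emph (#PCDATA)>

<!-- An empty element, e.g. <link/>. -->
<!ELEMENT link EMPTY>
```

Under these declarations, a report with two author elements and no summary would be valid, but a report whose author elements appear before its title would not be.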
An attribute declaration should be included for every different
type of element that can have attributes. The declaration
describes which attributes an element may have, what sort
of values the attribute may take, and whether the attribute is optional.
An attribute declaration is of the form:
<!ATTLIST elementName
attrName attrType attrDefault
...
>
The attrType controls what value the attribute can have.
It can have one of the following forms:
- CDATA
-
The attribute can take any value.
- ID
-
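The value of this attribute must be unique among all ID values in
the document, not just among elements of this type (i.e., it is a
unique identifier). The value must also be a valid XML name, so it
cannot begin with a digit. This is similar to
a primary key in a database table.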
- IDREF
-
The value of this attribute must match the value of an ID attribute
on some element in the document. This is similar to a foreign
key in a database table.
- (option1|option2)
-
This provides a list of the possible values for the attribute.
This is a good way to limit an attribute to only valid values
(e.g., only "male" or "female" for a gender
attribute).
The attrDefault either provides a default value for the attribute
or states whether the attribute is optional or required (i.e., must
be specified). It can have one of the following forms:
- value
-
This is the default value for the attribute, written in quotes. It is
used when the attribute is not specified on an element.
- #IMPLIED
-
The attribute is optional. It is valid for elements of this
type to contain this attribute, but it is not required.
- #REQUIRED
-
The attribute is required, so it must appear in all elements
of this type.
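The following sketch shows one attribute declaration that uses each attribute type and each kind of default described above. The element name person and its attribute names are hypothetical, chosen only for illustration. Note that a literal default value is written as a quoted string.

```xml
<!ELEMENT person EMPTY>
<!ATTLIST person
    id       ID                 #REQUIRED
    name     CDATA              #REQUIRED
    gender   (male|female)      #IMPLIED
    status   (active|inactive)  "active"
    manager  IDREF              #IMPLIED
>
```

With this declaration, the two elements below would both be valid, provided they appear in the same document. The second element omits gender, which is optional, and status, which then takes the default value "active". Its manager attribute refers to the id of the first element.

```xml
<person id="p1" name="Ada" gender="female"/>
<person id="p2" name="Alan" manager="p1"/>
```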
A DTD can be included directly within an XML document or the DTD
can be located within a separate file and just referred to from
the XML document.
The DTD information is included within a DOCTYPE declaration,
which follows the XML declaration and comes before the root element.
An inline DTD has the form:
<!DOCTYPE rootElementName [
... DTDcode ...
]>
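As a minimal sketch, a complete document with an inline DTD might look like the following. The element and attribute names (temperatures, case, date, temperature) and the data values are hypothetical.

```xml
<?xml version="1.0"?>
<!DOCTYPE temperatures [
  <!ELEMENT temperatures (case+)>
  <!ELEMENT case EMPTY>
  <!ATTLIST case
      date         CDATA  #REQUIRED
      temperature  CDATA  #REQUIRED
  >
]>
<temperatures>
  <case date="16-JAN-1994" temperature="278.9"/>
  <case date="16-FEB-1994" temperature="280"/>
</temperatures>
```

The name in the DOCTYPE declaration matches the root element of the document. If one of the case elements were missing its date attribute, the document would still be well-formed, but it would no longer be valid. Assuming the xmllint tool from libxml2 is installed and the document is saved in a file called example.xml (a hypothetical name), the command `xmllint --valid --noout example.xml` is one way to check validity.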
An external DTD stored in a file called file.dtd
would be referred to as follows:
<!DOCTYPE rootElementName SYSTEM "file.dtd">
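As a sketch of the external form, the file file.dtd would contain only the element and attribute declarations, with no DOCTYPE wrapper around them. The element and attribute names below are hypothetical.

```xml
<!ELEMENT temperatures (case+)>
<!ELEMENT case EMPTY>
<!ATTLIST case
    date         CDATA  #REQUIRED
    temperature  CDATA  #REQUIRED
>
```

An XML document that uses those declarations would then refer to the file by name:

```xml
<?xml version="1.0"?>
<!DOCTYPE temperatures SYSTEM "file.dtd">
<temperatures>
  <case date="16-JAN-1994" temperature="278.9"/>
</temperatures>
```

A relative name such as "file.dtd" is normally resolved relative to the location of the XML document, so in this sketch the two files are assumed to be in the same directory. One advantage of the external form is that many documents can share a single DTD.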
Paul Murrell

This document is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.