The final piece of the DTD puzzle involves attributes. You know attributes: they are the name/value pairs included with tags in your documents that control the behavior and appearance of those tags. To define attributes and their allowed values within an XML DTD, use the <!ATTLIST> directive:
<!ATTLIST element attributes>
The element is the name of the element to which the attributes apply. The attributes are a list of attribute declarations for the element. Each attribute declaration in this list consists of an attribute name, its type, and its default value, if any.
Attribute values can be of several types, each denoted in an attribute definition with one of the following keywords.
CDATA indicates that the attribute value is a character or string of characters. This is the attribute type you would use to specify URLs or other arbitrary user data. For example, the src attribute of the <img> tag in HTML is of type CDATA.
ID indicates the attribute value is a unique identifier within the scope of the document. This attribute type is used with an attribute, such as the HTML id attribute, whose value defines an ID within the document, as discussed in Appendix B, "HTML/XHTML Tag Quick Reference", Section B.1, "Core Attributes".
IDREF or IDREFS indicates that the attribute accepts an ID defined elsewhere in the document via an attribute of type ID. The ID type is used when defining IDs; the IDREF and IDREFS are used when referencing a single ID and a list of IDs, respectively.
ENTITY or ENTITIES indicates that the attribute accepts the name or list of names of unparsed general entities defined elsewhere in the DTD. The definition and use of unparsed general entities is covered in Section 15.3.2, "Entities".
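NMTOKEN or NMTOKENS indicates that the attribute accepts a valid XML name token or a whitespace-separated list of name tokens. A name token is built from the same characters as an XML name but, unlike a name, may begin with any of them, including a digit or hyphen. These tokens are given to the processing application as the value of the attribute. How they are used is determined by the application.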
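To see the keyword-based types side by side, here is a minimal sketch of a complete document with an internal DTD subset. The catalog and product elements, every attribute name, the widget-photo entity, and the example.com URL are all invented for illustration; none of them comes from the HTML DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE catalog [
  <!-- Hypothetical elements and attributes, for illustration only -->
  <!ELEMENT catalog (product*)>
  <!ELEMENT product (#PCDATA)>

  <!-- An unparsed general entity needs a notation and the NDATA keyword -->
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!ENTITY widget-photo SYSTEM "widget.jpg" NDATA jpeg>

  <!ATTLIST product
    sku       ID        #REQUIRED
    homepage  CDATA     #IMPLIED
    replaces  IDREF     #IMPLIED
    related   IDREFS    #IMPLIED
    photo     ENTITY    #IMPLIED
    category  NMTOKEN   #IMPLIED
    tags      NMTOKENS  #IMPLIED>
]>
<catalog>
  <product sku="w100" homepage="http://www.example.com/w100">Old widget</product>
  <product sku="w200" replaces="w100" related="w100 w300"
           photo="widget-photo" category="hardware"
           tags="new 2002-model">New widget</product>
  <product sku="w300">Widget stand</product>
</catalog>
```

Each sku value is unique within the document, and every value used in replaces and related matches some sku, which is what a validating parser checks for ID, IDREF, and IDREFS attributes. The token 2002-model is acceptable in tags because a name token, unlike an ID value, may begin with a digit.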
In addition to these keyword-based types, you can create an enumerated type by listing the specific values allowed with this attribute. To create an enumerated type, list the allowed values, separated by vertical bars and enclosed in parentheses, as the type of the attribute. For example, here is how the method attribute for the <form> tag is defined in the HTML DTD:
method (get|post) "get"
The method attribute accepts one of two values, either get or post; get is the default value if nothing is specified in the document tag.
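The same pattern works for any fixed set of choices. As a small sketch, using a made-up shipment element and speed attribute that do not appear in any standard DTD:

```xml
<!-- Hypothetical element and attribute, for illustration only -->
<!ELEMENT shipment EMPTY>
<!ATTLIST shipment
  speed (ground|air|overnight) "ground">

<!-- In a document validated against this DTD:
       <shipment speed="air"/>    valid: air is one of the listed values
       <shipment/>                valid: speed defaults to ground
       <shipment speed="boat"/>   invalid: boat is not in the list      -->
```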
After you define the name and type of an attribute, you must specify how the XML processor should handle default or required values for the attribute. You do this by supplying one of four values after the attribute type.
If you use the #REQUIRED keyword, the associated attribute must always be provided when the element is used in a document. Within the XHTML DTD, the src attribute of the <img> tag is required, since an image tag makes no sense without an image to display.
The #IMPLIED keyword means that the attribute may be used but is not required and that no default value is associated with the attribute. If it is not supplied by the document author, the attribute will have no value when the XML processor handles the element. For the <img> tag, the width and height attributes are implied, since the browser will derive sizing information from the image itself if these attributes are not specified.
If you specify a value, it becomes the default value for that attribute. If the document author does not supply a value for the attribute, the XML processor inserts the default value (the value specified in the DTD).
If you precede the default value with the keyword #FIXED, the value is not only the default value for the attribute, it is the only value that can be used with that attribute, if it is specified.
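The four choices can be compared in a single declaration. This is only a sketch: the note element and its attribute names are invented for illustration.

```xml
<!-- Hypothetical element and attributes, for illustration only -->
<!ELEMENT note (#PCDATA)>
<!ATTLIST note
  author    CDATA          #REQUIRED
  reviewer  CDATA          #IMPLIED
  status    (draft|final)  "draft"
  version   CDATA          #FIXED "1.0">

<!-- author:   must appear on every note element
     reviewer: optional, and has no value when omitted
     status:   optional, and treated as draft when omitted
     version:  optional, but if present its value must be 1.0;
               when omitted it is also treated as 1.0           -->
```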
For example, examine the attribute list for the form element, taken (and abridged) from the HTML DTD:
<!ATTLIST form
  action          CDATA       #REQUIRED
  method          (get|post)  "get"
  enctype         CDATA       "application/x-www-form-urlencoded"
  onsubmit        CDATA       #IMPLIED
  onreset         CDATA       #IMPLIED
  accept          CDATA       #IMPLIED
  accept-charset  CDATA       #IMPLIED
>
This example associates seven attributes with the form element. The action attribute is required and accepts a character string value. The method attribute has one of two values, either get or post. The default is get, so if the document author doesn't include the method attribute in the form tag, method="get" will be assumed automatically by the XML parser.
The enctype attribute for the form element accepts a character string value and, if not specified, defaults to a value of application/x-www-form-urlencoded. The remaining attributes all accept character strings, are not required, and have no default value if they are not specified.
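In practice, these rules mean that a short form tag and a fully spelled-out one are handled identically. The fragments below are a sketch; the action URL is a made-up example and the form contents are omitted.

```xml
<!-- What the document author writes -->
<form action="/cgi-bin/search">
  <!-- form contents omitted -->
</form>

<!-- What the application sees after the defaults from the DTD are applied -->
<form action="/cgi-bin/search"
      method="get"
      enctype="application/x-www-form-urlencoded">
  <!-- form contents omitted -->
</form>

<!-- Invalid against the declaration above: the required action attribute
     is missing, and put is not one of the enumerated method values:
       <form method="put"> ... </form>                                   -->
```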
If you look at the attribute list for the <form> element in the HTML DTD, you'll see that it does not exactly match our example. That's because we've modified our example to show the types of the attributes after any parameter entities have been expanded. In the actual HTML DTD, the attribute types are provided as parameter entities whose names give a hint of the kind of values expected by the attribute. For example, the type of the action attribute is %URI;, which elsewhere in the DTD is defined to be CDATA. By using this style, the DTD author lets you know that the string value for this attribute should be a URL, not just any old string. Similarly, the type of the onsubmit and onreset attributes is given as %Script;. This is a hint that the character string value should name a script to be executed when the form is submitted or reset.
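A simplified sketch of that style follows. It is a reduced illustration, not a copy of the real DTD, which declares more attributes and adds explanatory comments. Note that in XML, parameter entity references inside a declaration like this are allowed only in an external DTD file, not in a document's internal subset.

```xml
<!-- Parameter entities that name the intended kind of value -->
<!ENTITY % URI    "CDATA">
<!ENTITY % Script "CDATA">

<!-- Simplified attribute list using those entities as types -->
<!ATTLIST form
  action    %URI;     #REQUIRED
  onsubmit  %Script;  #IMPLIED
  onreset   %Script;  #IMPLIED>
```

Once the parser expands %URI; and %Script;, all three attributes are plain CDATA. The entity names are documentation for human readers; the parser does not check that the value is actually a URL or a script.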
Copyright © 2002 O'Reilly & Associates. All rights reserved.