As I explained at the end of the last article, you create a Document Type Definition (DTD) to define the rules that must be followed in your XML markup. A DTD defines which elements are required, which attributes can be set, and what their potential values can be.
Every XML document that uses a DTD should start with the line <?xml version="1.0" encoding="UTF-8" standalone="no"?>. This tells the processor that this is an XML version 1.0 file, that the character encoding used is UTF-8, and, with standalone="no", that the document relies on declarations held in an external file (the DTD).
Also make sure that when you are creating DTDs, keywords such as !ELEMENT, !ATTLIST, #REQUIRED, #PCDATA, CDATA, etc. are all capitalized. Note that CDATA is written without a leading #.
There, you are all up to speed. Now on to DTD Prologs…
DTD Prolog
In the DTD Prolog you describe your DTD in the following ways:
Note: If you don’t require any outside files, standalone would have the value of “yes”. Also, a bare file name like “customerData.dtd” only works if the DTD resides in the same folder as the XML document. Otherwise, replace the file name with a relative path or a URL.
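To make the note above concrete, here is a rough sketch of the three usual ways a prolog can point at its DTD. The root element name customer is borrowed from the examples later in this article, and the URL is a made-up placeholder, not a real address. A real document would contain only one of these DOCTYPE lines.

```xml
<!-- Option 1: the DTD is a file in the same folder as the document -->
<!DOCTYPE customer SYSTEM "customerData.dtd">

<!-- Option 2: the same DTD fetched from a URL (hypothetical address) -->
<!DOCTYPE customer SYSTEM "http://www.example.com/dtds/customerData.dtd">

<!-- Option 3: the rules are written inline, so no outside file is needed -->
<!DOCTYPE customer [
  <!ELEMENT customer (#PCDATA)>
]>
```

With option 3 and nothing else external, standalone="yes" would be the appropriate value in the XML declaration.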
Elements
Everything that lies between a start tag and its matching end tag in an XML document, tags included, is referred to as an element. So, <p>Paragraph of text…</p> is an element. You can use a DTD to define a great many rules that must be followed by all the elements in the document.
!ELEMENT is used to define what data can be placed in an element. It also defines how many times that data can be placed and in what order.
Here is a sample !ELEMENT definition:
<!ELEMENT customer (customerID, customerName, suffix?, products+, visits*)>
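Here is a small sketch of a document that would satisfy that definition. The child declarations and the sample values (1001, Sally Smith, Widget, Gadget) are my own assumptions for illustration. In the content model, the commas fix the order, ? means zero or one, + means one or more, and * means zero or more.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customer [
  <!-- children must appear in exactly this order -->
  <!ELEMENT customer (customerID, customerName, suffix?, products+, visits*)>
  <!ELEMENT customerID   (#PCDATA)>
  <!ELEMENT customerName (#PCDATA)>
  <!ELEMENT suffix       (#PCDATA)>
  <!ELEMENT products     (#PCDATA)>
  <!ELEMENT visits       (#PCDATA)>
]>
<customer>
  <customerID>1001</customerID>
  <customerName>Sally Smith</customerName>
  <!-- suffix? is optional, so it is left out here -->
  <products>Widget</products>
  <products>Gadget</products>
  <!-- visits* allows zero occurrences, so none are listed -->
</customer>
```

A validating parser should reject this document if products were missing or if customerName came before customerID.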
Here are the ways you can define what types of data can be placed in an element:
Special Note:
#PCDATA is known as parsed character data. This is character data that the XML parser scans for markup and entity references. Parsed character data would therefore contain the codes &amp;, &lt; and &gt; instead of the literal characters &, <, or >.
CDATA, on the other hand, is plain character data. In a DTD it is written without the # and is used as an attribute type in !ATTLIST rather than as element content.
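As a quick sketch of what parsed character data looks like in practice, the element below holds text with the special characters written as codes. The sample text is invented for illustration.

```xml
<!DOCTYPE customerName [
  <!ELEMENT customerName (#PCDATA)>
]>
<!-- the parser decodes the three codes back into &, < and > -->
<customerName>Smith &amp; Sons &lt;Irwin&gt;</customerName>
```

After parsing, an application reading this element would see the text Smith & Sons <Irwin>.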
Attributes
You define what attributes can be assigned to your elements with !ATTLIST. You also use it to define each attribute's data type and any default value. The standard format is:
<!ATTLIST elementName attributeName dataType defaultValue>
A real-world example might look like this:
<!ATTLIST customer firstName CDATA #REQUIRED>
Attribute Data Types
The first two parts of the above !ATTLIST definition (the element name and the attribute name) are self-explanatory, so I’ll define the data types available:
You also can use an enumerated list of values instead of a data type. An Enumerated List is an all inclusive list of every possible value. Here is an example:
<!ATTLIST customer suffix (BA | MA | Ph.D.) #IMPLIED>
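To see the two !ATTLIST examples at work, here is a minimal sketch that combines them in one document. The element content and the attribute values are assumptions made up for the example.

```xml
<!DOCTYPE customer [
  <!ELEMENT customer (#PCDATA)>
  <!-- firstName must always be present; any text is allowed -->
  <!ATTLIST customer firstName CDATA #REQUIRED>
  <!-- suffix may be omitted, but if present must be one of the listed values -->
  <!ATTLIST customer suffix (BA | MA | Ph.D.) #IMPLIED>
]>
<customer firstName="Sally" suffix="Ph.D.">Sally Smith</customer>
```

A validating parser should flag the document if firstName were missing or if suffix were set to a value outside the list, such as MD.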
Finally, you define the default value for the attribute. All of the possible values for default value are:
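As a rough sketch, XML 1.0 allows four forms in the default-value slot, shown together below. The country and version attributes are hypothetical names I added so that each form has an example.

```xml
<!ATTLIST customer
  firstName CDATA             #REQUIRED
  suffix    (BA | MA | Ph.D.) #IMPLIED
  country   CDATA             "USA"
  version   CDATA             #FIXED "1.0">
<!-- #REQUIRED    : the attribute must be written on every customer element -->
<!-- #IMPLIED     : the attribute is optional and has no default -->
<!-- "USA"        : used as the value when the attribute is left out -->
<!-- #FIXED "1.0" : the value is always 1.0 and cannot be changed -->
```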
Entities
With an entity you can declare a type of variable name that would represent a block of text. Here is an example of an entity:
<!ENTITY bizAddress "123 Main St, Irwin, PA 15147">
With this defined I can now place the address anywhere by just using the code &bizAddress;. This is known as an Internal Entity, because its replacement text is defined right there in the DTD.
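Here is a small sketch of the internal entity being declared and then used. The surrounding customer element and the "Ship to:" text are assumptions added for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customer [
  <!ELEMENT customer (#PCDATA)>
  <!-- declare the entity once -->
  <!ENTITY bizAddress "123 Main St, Irwin, PA 15147">
]>
<!-- the reference is replaced with the entity's text when parsed -->
<customer>Ship to: &bizAddress;</customer>
```

After parsing, the element's text reads Ship to: 123 Main St, Irwin, PA 15147.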
An External Entity is defined in an included file. You would define it with this sample format:
<!ENTITY entityName SYSTEM "urlOfData">
One of the great things about External Entities is that they can reference images and other non-XML data.
Just so I’m completely clear, you reference the Entity by typing a &, followed by the Entity name and then a semi-colon (;). Also you cannot make a call to an entity until you have already defined it in your code.
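For comparison, here is a sketch of an external entity whose text lives in its own file. The file name bizAddress.txt is a hypothetical placeholder, and I am assuming it contains nothing but the address text.

```xml
<!DOCTYPE customer [
  <!ELEMENT customer (#PCDATA)>
  <!-- the replacement text is read from a separate file (hypothetical name) -->
  <!ENTITY bizAddress SYSTEM "bizAddress.txt">
]>
<customer>&bizAddress;</customer>
```

Keep in mind that some parsers are configured not to fetch external entities at all, so whether the reference gets expanded can depend on the parser and its settings.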
External Entities are either Parsed or Unparsed:
Please note that Unparsed External Entities are defined differently, so that they are passed on to the proper helper. You could define an image with this code:
<!ENTITY imageName SYSTEM "urlOfImage" NDATA jpeg>
Notation
A Notation declaration describes the format of non-XML data referenced from an XML document. The basic format of a Notation follows:
<!NOTATION notationName SYSTEM "typeContent">
or for the image I was talking about above
<!NOTATION jpeg SYSTEM "image/jpeg">
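Putting the notation and the unparsed entity together, a minimal sketch might look like the following. The entity name customerPhoto, the file photo.jpg, and the photo attribute are all assumptions invented for this example. Note that an unparsed entity is not referenced with & and ; in the text. Instead, its name goes into an attribute declared with the ENTITY type.

```xml
<!DOCTYPE customer [
  <!-- name the format so the application knows what kind of data it is -->
  <!NOTATION jpeg SYSTEM "image/jpeg">
  <!-- unparsed entity: the parser does not read the file, it only reports it -->
  <!ENTITY customerPhoto SYSTEM "photo.jpg" NDATA jpeg>
  <!ELEMENT customer (#PCDATA)>
  <!-- an ENTITY-typed attribute holds the name of an unparsed entity -->
  <!ATTLIST customer photo ENTITY #IMPLIED>
]>
<customer photo="customerPhoto">Sally Smith</customer>
```

The parser hands the entity's location and notation to the application, which then decides how to display or process the image.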
That’s All Folks
That is pretty much all there is to know about Document Type Definitions (DTDs). If you have any questions, leave them below. Next up I’ll talk about Schemas.
Till Next Time
– Think Tank