In this article I’ll describe the most complicated aspect of XML. An XML Schema is another way to define which data, elements and attributes an XML document can contain.
How Schemas Differ from DTDs
Unlike DTDs, which are great when you are mainly working with text elements, Schemas work best when you’re mainly working with data elements.
Schemas also work better when you want to be certain that the proper data is entered. Schemas define what type of data each element and attribute can hold. For example, with an XML Schema you can specify that a price element must consist of a certain number of digits followed by a period and then two additional digits.
Also, unlike DTDs, Schemas must be located in a completely separate file outside of the XML document. That separate file defines how the elements and attributes work together to define the content.
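Because the schema lives in its own file, the XML document has to point at it somehow. Here is a minimal sketch, assuming a schema saved as customer.xsd next to the document and no target namespace (the file name and the customer element are only for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- customer.xsd is a hypothetical file name; the attribute is only a hint to the validator -->
<customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="customer.xsd"
          lastName="Smith"/>
```

Most validators also let you name the schema explicitly instead of relying on the hint, for example xmllint --noout --schema customer.xsd customer.xml.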
The 47 Schema Data Types
Schemas use their own defined data types, which you can use as a base to create even more complicated data types. There are currently 47 data types available. I’ll list them here, but will only cover the most commonly used in detail:
Most Commonly Used Data Types
Here is a description of the most commonly used XML Schema data types:
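A handful of built-in types that turn up in most schemas, sketched as element declarations (the element names are made up for illustration):

```xml
<xsd:element name="title"     type="xsd:string"/>   <!-- any text -->
<xsd:element name="quantity"  type="xsd:integer"/>  <!-- whole number, e.g. 42 or -7 -->
<xsd:element name="price"     type="xsd:decimal"/>  <!-- e.g. 19.99 -->
<xsd:element name="inStock"   type="xsd:boolean"/>  <!-- true, false, 1 or 0 -->
<xsd:element name="orderDate" type="xsd:date"/>     <!-- YYYY-MM-DD, e.g. 2024-03-15 -->
<xsd:element name="shippedAt" type="xsd:dateTime"/> <!-- e.g. 2024-03-15T09:30:00 -->
```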
The Parts of a Schema
Here are the different parts of a Schema:
The XML Declaration
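These two lines of code say that this document uses XML Schema:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
The first line is the standard XML declaration. The second opens the schema and binds the xsd prefix to the XML Schema namespace; its matching </xsd:schema> closing tag goes at the very end of the file. XSD stands for XML Schema Definition.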
The Rest of the Schema
Now you define the elements and attributes. You specify how they work together: which elements contain other elements, and which attributes each element has.
Elements
You must define all of your elements with an Element Declaration. The Element Declaration defines the element name and, optionally, its data type.
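There are two types of element declarations: simple and complex. A simple element holds only a text value of a given data type; a complex element can contain child elements, attributes, or both.
The following declaration defines a simple element that contains date information: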
<xsd:element name="date" type="xsd:date"/>
Here is an example of a Complex Element (customer) that contains an attribute (lastName):
<xsd:element name="customer">
  <xsd:complexType>
    <xsd:attribute name="lastName" type="xsd:string" use="required"/>
  </xsd:complexType>
</xsd:element>
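For reference, here is a sketch of what matching XML looks like for the two declarations above. These are two separate fragments, and the values are invented:

```xml
<!-- simple element: the text must be a valid xsd:date -->
<date>2024-03-15</date>

<!-- complex element: no content, but the lastName attribute is required -->
<customer lastName="Smith"/>
```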
Four Content Models
The content model defines the type of content that can be contained in an element. Here they are:
Text: Defines that the element can contain only text. Ex. <xsd:element name="customer" type="xsd:string"/>
Empty: Defines an element that can’t contain text or child elements. It can still carry attributes.
Mixed Content Model: The element can contain child elements and text.
Element: Defines an element that contains other child elements.
Ex.
<xsd:element name="customer">
  <xsd:complexType>
    <xsd:sequence>
      <!-- ref points at elements declared globally elsewhere in the schema -->
      <xsd:element ref="firstName"/>
      <xsd:element ref="lastName"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
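The Empty and Mixed models in the list have no example of their own, so here is a minimal sketch of each. The element and attribute names (lineBreak, count, note, customerName, orderDate) are invented for illustration:

```xml
<!-- Empty: the complexType declares no child elements, so only attributes are allowed -->
<xsd:element name="lineBreak">
  <xsd:complexType>
    <xsd:attribute name="count" type="xsd:positiveInteger"/>
  </xsd:complexType>
</xsd:element>

<!-- Mixed: mixed="true" lets text appear around and between the child elements -->
<xsd:element name="note">
  <xsd:complexType mixed="true">
    <xsd:sequence>
      <xsd:element name="customerName" type="xsd:string"/>
      <xsd:element name="orderDate" type="xsd:date"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>
```

A document matching the mixed declaration could read: <note>Dear <customerName>Ann Lee</customerName>, your order shipped on <orderDate>2024-03-15</orderDate>.</note>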
In the customer example above, the xsd:sequence element surrounds the child elements. It specifies the order in which they must appear: firstName, then lastName. An element like this is referred to as a compositor. There are three compositors available to you: xsd:sequence, xsd:choice and xsd:all.
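A minimal sketch of the other two compositors, xsd:choice and xsd:all, with invented element names:

```xml
<!-- xsd:choice: exactly one of the listed children may appear -->
<xsd:element name="contact">
  <xsd:complexType>
    <xsd:choice>
      <xsd:element name="email" type="xsd:string"/>
      <xsd:element name="phone" type="xsd:string"/>
    </xsd:choice>
  </xsd:complexType>
</xsd:element>

<!-- xsd:all: each child appears once (minOccurs defaults to 1), in any order -->
<xsd:element name="address">
  <xsd:complexType>
    <xsd:all>
      <xsd:element name="street" type="xsd:string"/>
      <xsd:element name="city" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>
</xsd:element>
```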
Attributes
An attribute of an element is declared with just a name and a type, like this:
<xsd:attribute name="custID" type="xsd:positiveInteger"/>
You can assign a set of attributes to more than one element by creating an Attribute Group, which lets you reuse the whole group easily. You have to declare the group globally, as a direct child of the xsd:schema element. Here is how you would define an Attribute Group:
<xsd:attributeGroup name="suffix">
  <xsd:attribute name="BA" type="xsd:string"/>
  <xsd:attribute name="MA" type="xsd:string"/>
  <xsd:attribute name="PhD" type="xsd:string"/>
</xsd:attributeGroup>
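To actually use the group, an element's complex type refers to it by name. A minimal sketch, assuming a hypothetical graduate element in the same schema:

```xml
<xsd:element name="graduate">
  <xsd:complexType>
    <!-- pulls in the BA, MA and PhD attributes declared in the suffix group -->
    <xsd:attributeGroup ref="suffix"/>
  </xsd:complexType>
</xsd:element>
```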
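Whitespace in your XML file is normalized based on the value you declare with the xsd:whiteSpace facet, which goes inside the xsd:restriction of a simple type. Here is an example of how you would declare how to handle whitespace:
<xsd:whiteSpace value="preserve"/>
If you assign the value preserve, you are stating that you want all whitespace to remain untouched. The other possible values are replace, which turns every tab, line feed and carriage return into a space, and collapse, which does the same and then trims leading and trailing spaces and squeezes runs of spaces down to one.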
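As a sketch of where the whiteSpace facet sits in practice, here it is inside a named simple type. The names tidyString and city are invented for illustration:

```xml
<xsd:simpleType name="tidyString">
  <xsd:restriction base="xsd:string">
    <!-- collapse: trim both ends and squeeze runs of whitespace to a single space -->
    <xsd:whiteSpace value="collapse"/>
  </xsd:restriction>
</xsd:simpleType>

<xsd:element name="city" type="tidyString"/>
```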
Define Your Own Data Types
You can create your own data types quite easily. Here is an example of how you could define a price data type. With two decimal places, the largest value it allows is 999.99, because I capped the total digits at 5 and the decimal places at 2. Keep in mind that fractionDigits is only an upper limit, so a value with fewer decimals, such as 99999, also passes. Ex.
<xsd:element name="price">
  <xsd:simpleType>
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="5"/>
      <xsd:fractionDigits value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>
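If you want the stricter "digits, a period, then exactly two digits" rule mentioned at the start of the article, one option is a pattern facet. This is a sketch of an alternative to the declaration above rather than a replacement for it; the type name priceType is made up, and declaring the type by name lets several elements share it:

```xml
<xsd:simpleType name="priceType">
  <xsd:restriction base="xsd:decimal">
    <!-- one to three digits, a period, then exactly two digits -->
    <xsd:pattern value="\d{1,3}\.\d{2}"/>
  </xsd:restriction>
</xsd:simpleType>

<xsd:element name="price" type="priceType"/>
```

With this pattern 999.99 and 5.00 are accepted, while 99999, 5.5 and 1000.00 are rejected.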
Annotations
If you want to leave notes that provide additional information about your file, annotations provide that capability. Here is an example:
<xsd:element name="customers">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      This is a list of customers.
    </xsd:documentation>
  </xsd:annotation>
</xsd:element>
Just think of this as a comment. Here we surround that comment with the documentation tags, which are surrounded by the annotation tag.
That’s All Folks
That is pretty much all that goes into creating XML Schemas. Of course, practice will ingrain this information. If you have any questions, leave them below in the comments section.
Till Next Time
-Think Tank