Learn XML Schema

In this article I’ll describe one of the most complicated aspects of XML. An XML Schema is another way to describe which data, elements and attributes an XML document may contain.

How Schemas Differ from DTDs

Unlike DTDs, which are great when you are mainly working with text elements, Schemas work best when you’re mainly working with data elements.

Schemas also work better when you want to be certain that the proper data is entered. Schemas define what type of data each element and attribute can hold. For example, with an XML Schema you can specify that a price element must consist of one or more digits followed by a period and then two additional digits.
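As a rough illustration of that price rule, here is a minimal Python sketch. The type name priceType and the pattern \d+\.\d{2} are my own assumptions for the example. XSD patterns always have to match the whole value, so I use re.fullmatch to mimic that. The script only reads the pattern out of the schema text; it is not a schema validator.

```python
import re
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: a price is digits, a period, then exactly two digits.
SCHEMA = r"""
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="priceType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d+\.\d{2}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="price" type="priceType"/>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA.strip())

# Pull the pattern facet out of the parsed schema.
pattern = root.find(f".//{{{XSD_NS}}}pattern").get("value")


def matches_price(text):
    """Return True if the whole text matches the pattern from the schema."""
    return re.fullmatch(pattern, text) is not None


for candidate in ["19.99", "19.9", "abc"]:
    print(candidate, matches_price(candidate))
```

A real validating parser would apply the pattern for you; the point here is only to show what the rule accepts and rejects.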

Schemas are also normally kept in a separate file from the XML document, whereas a DTD can be embedded in the document itself. That separate file defines how the elements and attributes work together to define the content.

The Built-in Schema Data Types

Schemas come with their own built-in data types, which you can use as a base to create even more complicated data types. XML Schema 1.0 defines 44 built-in simple types, plus anyType and anySimpleType. I’ll list most of them here, but will only cover the most commonly used in detail:

  • boolean
  • base64Binary
  • hexBinary
  • anyURI
  • language
  • normalizedString
  • string
  • token
  • byte
  • decimal
  • double
  • float
  • int
  • integer
  • long
  • negativeInteger
  • nonNegativeInteger
  • nonPositiveInteger
  • positiveInteger
  • short
  • unsignedByte
  • unsignedInt
  • unsignedLong
  • unsignedShort
  • date
  • dateTime
  • duration
  • gDay
  • gMonth
  • gMonthDay
  • gYear
  • gYearMonth
  • time
  • Name
  • NCName
  • QName
  • ID
  • anyType
  • anySimpleType

Most Commonly Used Data Types

Here is a description of the most commonly used XML Schema data types:

  • anyURI: Stores a URI, such as a URL.
  • dateTime: Stores a date and time in the form YYYY-MM-DDThh:mm:ss, optionally followed by a time zone.
  • decimal: Stores a number that can include a decimal point. You can limit the number of decimal places it allows.
  • integer: Stores a number without a decimal point.
  • string: Stores a sequence of characters.
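To show how these types are attached to elements, here is a small Python sketch. The element names (homepage, created, price, quantity, name) are made up for the example. The script just parses the schema text with the standard library and lists which type each element was given.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical element names, one for each commonly used type.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="homepage" type="xsd:anyURI"/>
  <xsd:element name="created" type="xsd:dateTime"/>
  <xsd:element name="price" type="xsd:decimal"/>
  <xsd:element name="quantity" type="xsd:integer"/>
  <xsd:element name="name" type="xsd:string"/>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA.strip())

# Map each declared element name to the data type it was given.
declared = {el.get("name"): el.get("type") for el in root.iter(XSD + "element")}

for name, type_name in declared.items():
    print(f"{name}: {type_name}")
```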

The Parts of a Schema

Here are the different parts of a Schema:

  • The XML Declaration: Tells the XML parser which version of XML and which character encoding are being used. Ex. <?xml version="1.0" encoding="UTF-8"?>
  • The Schema Element: Tells the parser that this document is an XML Schema. Ex. <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  • Element Declaration: Defines an element. Ex. <xsd:element name="customers">
  • Attribute Declaration: Defines an attribute. Ex. <xsd:attribute name="attName" type="xsd:string"/>

The XML Declaration

These two lines of code:

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

say that this document is an XML Schema. XSD stands for XML Schema Definition. The xsd:schema element is closed with </xsd:schema> at the very end of the file, after all of your declarations.
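If you want to check that the top of a schema file is written correctly, a quick sanity test is to parse it. This is a minimal sketch using Python’s standard library. The customers declaration inside is only a placeholder I added so the schema is not empty.

```python
import xml.etree.ElementTree as ET

# The XML declaration has to be the very first thing in the document.
SCHEMA = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="customers" type="xsd:string"/>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA)

# ElementTree writes a qualified name as {namespace}localname.
print(root.tag)
```

If the namespace URI is mistyped, the document still parses, but the root tag will no longer be in the XML Schema namespace, and schema tools will not recognize it.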

The Rest of the Schema

Now you define the elements and attributes. You specify how they work together: which elements contain other elements, and which attributes each element has.

You must define all of your elements with an element declaration. The element declaration defines the element’s name and, optionally, its data type.

There are two types of element declarations:

  • Simple type definitions: These are elements that cannot contain any other elements and cannot include any attributes.
  • Complex type definitions: These declare elements that can contain other elements and can also take attributes.

The following declaration defines an element that contains date information:

<xsd:element name="date" type="xsd:date"/>

Here is an example of a complex element (customer) that has an attribute (lastName):

<xsd:element name="customer">
  <xsd:complexType>
    <xsd:attribute name="lastName" type="xsd:string" use="required"/>
  </xsd:complexType>
</xsd:element>
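To make the difference between the two kinds of declaration concrete, here is a small Python sketch that parses both examples and labels each one. The classify helper is my own, and it is a simplification: it only looks for an inline xsd:complexType, so an element that points to a named complex type through its type attribute would be labeled simple.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="date" type="xsd:date"/>
  <xsd:element name="customer">
    <xsd:complexType>
      <xsd:attribute name="lastName" type="xsd:string" use="required"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def classify(element_decl):
    """Return 'complex' if the declaration has an inline complexType, else 'simple'."""
    has_complex = element_decl.find(XSD + "complexType") is not None
    return "complex" if has_complex else "simple"


root = ET.fromstring(SCHEMA.strip())
kinds = {el.get("name"): classify(el) for el in root.findall(XSD + "element")}

for name, kind in kinds.items():
    print(name, kind)
```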



Four Content Models

The content model defines the type of content that can be contained in an element. Here they are:

Text: The element can contain only text. Ex. <xsd:element name="customer" type="xsd:string"/>

Empty: The element can’t contain text or child elements, although it can still have attributes.

Mixed: The element can contain both child elements and text.

Element: The element contains only child elements. Ex.


<xsd:element name="customer">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element ref="firstName"/>
      <xsd:element ref="lastName"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>




Here I used the xsd:sequence element to surround the child elements. It specifies the order in which they must appear: firstName, then lastName. An element like this is referred to as a compositor. There are three compositors available to you:

  • Sequence: Use this to require that the child elements appear in a specific order.
  • Choice: This is kind of like a multiple choice answer. Use it when exactly one of the child elements should appear.
  • All: Allows the child elements to appear in any order, each one at most once.
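Here is a rough Python sketch of what the sequence compositor means in practice. It reads the required order out of a schema and compares it with the children of two sample documents. The helper names sequence_order and follows_sequence and the sample customers are my own, and this is a hand-rolled check for illustration only, not a real schema validator.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="firstName" type="xsd:string"/>
  <xsd:element name="lastName" type="xsd:string"/>
  <xsd:element name="customer">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="firstName"/>
        <xsd:element ref="lastName"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""


def sequence_order(schema_root, element_name):
    """List the child names, in order, from an element's xsd:sequence."""
    for decl in schema_root.findall(XSD + "element"):
        if decl.get("name") == element_name:
            seq = decl.find(f"{XSD}complexType/{XSD}sequence")
            return [child.get("ref") for child in seq.findall(XSD + "element")]
    raise KeyError(element_name)


def follows_sequence(instance_xml, expected):
    """True if the instance's children appear exactly in the expected order."""
    children = [child.tag for child in ET.fromstring(instance_xml)]
    return children == expected


schema_root = ET.fromstring(SCHEMA.strip())
order = sequence_order(schema_root, "customer")

good = "<customer><firstName>Ann</firstName><lastName>Lee</lastName></customer>"
bad = "<customer><lastName>Lee</lastName><firstName>Ann</firstName></customer>"

print(order, follows_sequence(good, order), follows_sequence(bad, order))
```

With xsd:all in place of xsd:sequence, the second document would be acceptable too, because the order would no longer matter.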


An attribute of an element is declared with just a name and type, like this:

<xsd:attribute name="custID" type="xsd:positiveInteger"/>

You can assign a set of attributes to more than one element by creating an attribute group, which lets you reuse the whole set easily. You have to declare the group globally, as a direct child of the xsd:schema element. Here is how you would define an attribute group:

<xsd:attributeGroup name="suffix">
  <xsd:attribute name="BA" type="xsd:string"/>
  <xsd:attribute name="MA" type="xsd:string"/>
  <xsd:attribute name="PhD" type="xsd:string"/>
</xsd:attributeGroup>
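Defining the group is only half of the story; an element picks it up with an xsd:attributeGroup that has a ref attribute. The Python sketch below shows one way that could look. The customer element that uses the group is my own addition, and the attributes_of helper is a simplified lookup, not how a real schema processor works.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:attributeGroup name="suffix">
    <xsd:attribute name="BA" type="xsd:string"/>
    <xsd:attribute name="MA" type="xsd:string"/>
    <xsd:attribute name="PhD" type="xsd:string"/>
  </xsd:attributeGroup>
  <xsd:element name="customer">
    <xsd:complexType>
      <xsd:attributeGroup ref="suffix"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA.strip())

# Global groups are direct children of xsd:schema and carry a name.
groups = {
    group.get("name"): [attr.get("name") for attr in group.findall(XSD + "attribute")]
    for group in root.findall(XSD + "attributeGroup")
}


def attributes_of(element_name):
    """Collect the attribute names an element gets through group references."""
    names = []
    for decl in root.findall(XSD + "element"):
        if decl.get("name") == element_name:
            for ref in decl.iter(XSD + "attributeGroup"):
                names.extend(groups[ref.get("ref")])
    return names


print(attributes_of("customer"))
```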


How whitespace in a value is normalized depends on the whiteSpace facet you declare for its data type. Here is an example of how you would declare how to handle whitespace:

<xsd:whiteSpace value="preserve"/>

If you assign the value preserve, you are stating that you want all whitespace to remain untouched. Here are the other possible values:

  • replace: Replaces every tab, line feed and carriage return with a space.
  • collapse: Does the same as replace, then collapses runs of spaces into a single space and removes leading and trailing spaces.
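The three whitespace settings are easy to mimic in a few lines of Python, which makes the difference between them easier to see. The function names below are my own; a schema processor does this for you when it reads a value.

```python
import re


def ws_preserve(text):
    """preserve: leave the value exactly as written."""
    return text


def ws_replace(text):
    """replace: turn each tab, line feed and carriage return into one space."""
    return re.sub(r"[\t\n\r]", " ", text)


def ws_collapse(text):
    """collapse: replace first, then squeeze runs of spaces and trim the ends."""
    replaced = ws_replace(text)
    return re.sub(r" +", " ", replaced).strip(" ")


sample = "  10\tMain\r\n Street  "
for normalize in (ws_preserve, ws_replace, ws_collapse):
    # repr() makes the remaining whitespace visible in the output.
    print(normalize.__name__, repr(normalize(sample)))
```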

Define your own Data Types

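You can create your own data types quite easily. Here is an example of how you could define a price data type. The totalDigits and fractionDigits facets allow at most 5 digits in total and at most 2 after the decimal point. On their own they would still accept a value such as 99999, so I also add maxInclusive to cap the price at 999.99. Ex.

<xsd:element name="price">
  <xsd:simpleType>
    <xsd:restriction base="xsd:decimal">
      <xsd:totalDigits value="5"/>
      <xsd:fractionDigits value="2"/>
      <xsd:maxInclusive value="999.99"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:element>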
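The digit facets are easy to misread, so here is a Python sketch of how I understand them: totalDigits and fractionDigits count digits of the value itself (trailing zeros do not count), and they do not set a ceiling by themselves. The function names are my own, and the 999.99 upper bound in is_valid_price stands in for an assumed maxInclusive facet. Treat it as an approximation, not a replacement for a real validator.

```python
import re
from decimal import Decimal

# Lexical form of xsd:decimal: optional sign, digits, optional fraction.
DECIMAL_RE = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def fits_digits(text, total_digits=5, fraction_digits=2):
    """Approximate the totalDigits and fractionDigits facets for one value."""
    if not DECIMAL_RE.fullmatch(text):
        return False
    value = Decimal(text).normalize()        # drops trailing zeros
    exponent = value.as_tuple().exponent
    places = max(0, -exponent)               # digits needed after the point
    scaled = int(abs(value).scaleb(places))  # the value written as an integer
    return (places <= fraction_digits
            and places <= total_digits
            and scaled < 10 ** total_digits)


def is_valid_price(text, maximum=Decimal("999.99")):
    """Digit facets plus an assumed maxInclusive bound of 999.99."""
    return fits_digits(text) and Decimal(text) <= maximum


for candidate in ["999.99", "12345", "1234.56", "12.345", "1000.00"]:
    print(candidate, fits_digits(candidate), is_valid_price(candidate))
```

Note that 12345 passes the digit facets on its own, which is why the upper bound is needed if you really want prices to stop at 999.99.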





If you want to leave notes that provide additional information about your file, annotations provide that capability. Here is an example:

<xsd:element name="customers">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      This is a list of customers.
    </xsd:documentation>
  </xsd:annotation>
</xsd:element>

Just think of this as a comment. Here we surround that comment with the documentation tags, which are in turn surrounded by the annotation tags.
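One advantage annotations have over ordinary XML comments is that they are regular elements, so a program can read them. As a small sketch, the Python below collects the documentation text for each element in a schema. The notes dictionary is just my own way of holding the result.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="customers">
    <xsd:annotation>
      <xsd:documentation xml:lang="en">
        This is a list of customers.
      </xsd:documentation>
    </xsd:annotation>
  </xsd:element>
</xsd:schema>
"""

root = ET.fromstring(SCHEMA.strip())

# Map each element name to its documentation text, with whitespace tidied up.
notes = {}
for decl in root.findall(XSD + "element"):
    doc = decl.find(f"{XSD}annotation/{XSD}documentation")
    if doc is not None and doc.text:
        notes[decl.get("name")] = " ".join(doc.text.split())

for name, text in notes.items():
    print(f"{name}: {text}")
```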

That’s All Folks

That is pretty much all that goes into creating XML Schemas. Of course, practice will ingrain this information. If you have any questions, leave them below in the comments section.

Till Next Time

-Think Tank
