Learn XML eXtensible Markup Language

In the next set of tutorials, I’ll completely cover XML and all of its capabilities. XML stands for eXtensible Markup Language. I’ll get back to the extensible part in a minute.

First, you have to understand that XML is no more a programming language than HTML is. Let me repeat that: XML is not a programming language. It is a markup language that is very similar to HTML. If you don’t understand HTML, stop reading this article and instead read or watch my HTML W3C Tutorial.

People really struggle to understand exactly what XML is, so I’m going to explain it in an unconventional way that has never failed in the past. I’ll explain XML by comparing it to HTML.

What do you do with HTML?

HTML is used to perform some very specific and limited tasks for you:

  • Provide web developers with an easy way to link to their web pages, as well as to the pages of others.
  • Describe the basic structure of the web page, concerning placement.
  • Tightly control how the document is displayed.
  • Allow the information on the page to display across a wide variety of web enabled devices.
  • Provide information in a wide variety of media formats.

HTML defines how elements of the web page are structured by surrounding them with HTML tags. There are a limited number of tags available for your use, however. Since CSS caught on, presentational tags such as <font> and <center> have been deprecated, so there are even fewer tags you should use in HTML.

I used two jargon terms that I’ll explain:

  • Tag – This is a paragraph tag: <p>. It is used to surround paragraphs of text. You mark the end of the paragraph by placing the closing paragraph tag after its last character. The closing paragraph tag looks like this: </p>
  • Element – The whole paragraph, tags and all, is known as an element. It would look like this: <p> Here is the whole paragraph…</p>

How are HTML & XML Similar and Different?

XML is also a markup language and uses tags to construct a very specific document structure. Both also use attributes in their tags.
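
To make the idea of attributes concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. The customer element and its id attribute are made-up names for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet: "customer" and its "id" attribute are invented for this example.
xml_text = '<customer id="1001"><firstName>John</firstName></customer>'

customer = ET.fromstring(xml_text)

# Attributes sit inside the opening tag, just as they do in HTML.
print(customer.tag)        # customer
print(customer.get("id"))  # 1001
```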

The real difference is that XML just facilitates the exchange of data. It is not limited to the web, and it can be opened in any of the many software packages that recognize it.

While HTML defines how text and other elements are displayed in a browser, XML tells the browser what the elements mean. XML doesn’t care how the elements are displayed. XML separates the content into smaller descriptive pieces, while HTML and CSS worry about the presentation.

XML also is not limited to any fixed set of tags, or types of elements. If you want to create a tag called customers, create <customers>. You can then define different tags that further organize data related to your customers. Then use CSS to define the styling of this new tag.
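
As a rough illustration of inventing your own tags, the sketch below builds a small document in Python with xml.etree.ElementTree. Only customers comes from the text above; the customer, firstName, and lastName names are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# XML predefines none of these element names; you choose whatever describes your data.
customers = ET.Element("customers")
customer = ET.SubElement(customers, "customer")  # hypothetical child element
ET.SubElement(customer, "firstName").text = "John"
ET.SubElement(customer, "lastName").text = "Smith"

# Serialize the tree to a string so the invented tags are visible.
xml_text = ET.tostring(customers, encoding="unicode")
print(xml_text)
```

The printed string contains exactly the tags you created, nested in the order you added them.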

XML does one thing really well. It organizes and classifies your data, so that it is useable by a wide range of devices and software packages! You can think of it as a database of sorts if that helps you understand it better.

Why Do People Use XML?

The following is true if you give your tags meaningful names and organize that information logically.

  • Allows you to easily transfer data between different devices and software platforms. It’s kind of like a universal translator. XML also supports Unicode, so it can carry information in many different languages and character sets.
  • You can make as many different types of tags with your specified attributes as you need.
  • The data is easy to extract from XML, because it is so well organized.
  • XML works harmoniously with powerful web technologies such as HTML, CSS, JavaScript, Databases, etc.
  • XML data is easy to understand, because of the focus on organization.
  • Search engines can provide better results when data has been described with detailed, meaningful XML markup.
  • Provides web developers with the ability to change data site wide, instead of on a page by page basis.
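
To show what "easy to extract" can look like in practice, here is a small sketch in Python using xml.etree.ElementTree. The data and the element names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; the element names are illustrative only.
xml_text = """<customers>
  <customer><firstName>John</firstName><lastName>Smith</lastName></customer>
  <customer><firstName>Jane</firstName><lastName>Doe</lastName></customer>
</customers>"""

root = ET.fromstring(xml_text)

# Because every value is labeled by its tag, pulling out one field is a short loop.
last_names = [c.findtext("lastName") for c in root.findall("customer")]
print(last_names)  # ['Smith', 'Doe']
```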

What Does XML Look Like?

Here is a sample of what a customer XML file might look like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<?xml-stylesheet type="text/css" href="customers.css"?>

<customers>

<customerId>

<firstName>John</firstName>

<lastName>Smith</lastName>

</customerId>

</customers>

The first line in this XML file is called the XML Declaration. The XML Declaration tells the software opening the file that it is XML compliant. If you include it, it must always be the first line.

The attribute standalone, when given the value of “yes”, states that this document does not depend on markup declarations in an external file, such as an external DTD, to be read correctly. This file could use an outside CSS file and still be considered standalone. The attribute is only concerned with whether everything needed to interpret the content is included in the XML file.

The encoding attribute specifies which character encoding was used to create this XML file. This is done so the characters are read and displayed properly.

With the second line of code, you define the location of the CSS file you wish to use to style your XML data. You style XML the same way you style HTML. Example:
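
Here is a minimal sketch, in Python with xml.etree.ElementTree, of why the encoding attribute matters. The firstName element and the name in it are made up; the same text is saved in two different encodings, and the parser relies on each declaration to decode the bytes.

```python
import xml.etree.ElementTree as ET

# The same made-up document saved two ways; \u00e9 is the letter "e" with an acute accent.
utf8_bytes = '<?xml version="1.0" encoding="UTF-8"?><firstName>Jos\u00e9</firstName>'.encode("utf-8")
latin1_bytes = '<?xml version="1.0" encoding="ISO-8859-1"?><firstName>Jos\u00e9</firstName>'.encode("iso-8859-1")

# The raw bytes differ, but the parser reads each declaration and decodes accordingly.
name_from_utf8 = ET.fromstring(utf8_bytes).text
name_from_latin1 = ET.fromstring(latin1_bytes).text

print(utf8_bytes == latin1_bytes)          # False
print(name_from_utf8 == name_from_latin1)  # True
```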

customers {

background-color: gray;

font-family: Verdana, Geneva, Arial, Helvetica, sans-serif;

}

Breaking Down your Content

You would then break the content down into categories and subcategories. This can sometimes be harder than you think. You must spend a lot of time thinking about what the key components of the content are and how you should structure it to make it useful.

Reading my article on making data atomic might help you down the right path.

Content in XML falls into two groups: data intensive and text intensive. When I talk about data, I’m referring to information that would normally be found in a database. As in a database, you would want to match this data up with a unique identification number.

When I talk about the text content, I’m referring to links, text, and collections of words.

After your opening XML declaration, you will define the root element. The root element contains all other elements in the XML file. All of your XML markup will lie between its opening and closing tags.
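
The single root element rule can be checked with a short sketch in Python using xml.etree.ElementTree. The is_well_formed helper and the element names are hypothetical, written only for this example.

```python
import xml.etree.ElementTree as ET

one_root = "<customers><customer/><customer/></customers>"
two_roots = "<customer/><customer/>"  # no single element wraps everything


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(one_root))   # True
print(is_well_formed(two_roots))  # False
```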

All other elements within the document are then labeled based on their contents, or on the elements that contain them. For example:

  • Parent Elements: Any element that surrounds other elements.
  • Child Elements: Any element that lies inside a parent.
  • Sibling Elements: Two or more elements that lie in the same parent.
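
These relationships can be sketched in Python with xml.etree.ElementTree, using the same element names as the sample customer file shown earlier in this article. The variable names are my own.

```python
import xml.etree.ElementTree as ET

# Same structure as the sample customer file, without the declaration lines.
xml_text = (
    "<customers>"
    "<customerId>"
    "<firstName>John</firstName>"
    "<lastName>Smith</lastName>"
    "</customerId>"
    "</customers>"
)

root = ET.fromstring(xml_text)  # <customers> is the root and a parent
customer = root[0]              # <customerId> is a child of <customers>

# <firstName> and <lastName> share the parent <customerId>, so they are siblings.
siblings = [child.tag for child in customer]

print(root.tag)      # customers
print(customer.tag)  # customerId
print(siblings)      # ['firstName', 'lastName']
```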

That’s All Folks

It’s been a blast, but I have to end the article for now. If you have any questions, leave them below. Come back tomorrow when I’ll continue teaching everything I know about XML.

Till Next Time…
