Moving on to XML
These tutorials are about XML, the Extensible Markup Language.
So then, what is XML?
XML is a Markup Language
markup: an encoding system embedded directly in a document to indicate how that document should be formatted
XML stands for Extensible Markup Language. In other words, it is a markup language whose advertised primary feature is its extensibility.
Even before computers, editors had markup languages that were pretty much standard across the publishing industry. They would mark whether to bold-face or italicize some text, how the text should be aligned on the page, or how the page was to be laid out.
With the advent of computers, this methodology was a natural way to adapt computers to the task of printing formatted documents. XML is a continuation of this tradition.
XML is Extensible
XML works a little differently than markup languages like its predecessor HTML. HTML is a static markup language. It cannot be changed except by changing the programs that process it.
extensible: an extensible language is one that comes with rules on how to modify the language without violating standards for that language
XML is extensible. This means that it is specifically designed to allow for modifications and additions to the language.
In fact, XML is not a true markup language in the strict sense. It is just a set of rules on how XML documents should work. The purpose of XML is not to mark up documents itself, but rather to specify the rules by which to use markup to mark up documents. In other words, it is a markup language whose purpose is to define other markup languages.
meta-language: a language that is used to describe or define other languages
XML is what is known as a meta-language. A meta-language is a language that is used to define other languages. Meta, in this context, means something that is self-referential or self-defining.
Meta-languages are easy to understand. If I say, "the cat is on the chair", then I am using language. If I say, "in the sentence 'the cat is on the chair', 'cat' and 'chair' are nouns", then I am using meta-language. I am using a set of terminology that describes the language I am using. In this case, "sentence" and "noun", are words in our language that describe something about the language itself. They are part of a meta-language we refer to as English grammar.
application: defining a subset of a range of possibilities for some tool to perform a specific task
What you use XML for is to create extensions to XML. When you use XML to create an XML-based markup language, that language is said to be an application of XML. Since the term application also refers to the tools used to process XML, another term for an application of XML is an XML vocabulary. A vocabulary is a set of XML elements and attributes that are defined as being part of a specific XML application.
One of the languages thus defined through XML is XHTML, or the Extensible Hypertext Markup Language. It is a redefinition of HTML in XML. XHTML is one application of XML. Another application of XML is MathML, which is a markup language for denoting mathematical formulae.
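As a rough illustration of two vocabularies living under the same XML rules, the sketch below feeds one small document that mixes XHTML and MathML to a generic XML parser. It assumes Python's standard xml.etree.ElementTree module and relies on XML namespaces, a mechanism not covered in this section, to tell the two vocabularies apart. The sample document and the function name count_by_vocabulary are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs published by the W3C for each vocabulary.
XHTML = "http://www.w3.org/1999/xhtml"
MATHML = "http://www.w3.org/1998/Math/MathML"

# A made-up document that mixes the two vocabularies.
DOCUMENT = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>The area of a circle:</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>A</mi><mo>=</mo><mi>π</mi><msup><mi>r</mi><mn>2</mn></msup>
    </math>
  </body>
</html>
"""


def count_by_vocabulary(xml_text):
    """Count how many elements belong to each namespace (vocabulary)."""
    counts = {}
    for element in ET.fromstring(xml_text).iter():
        # ElementTree stores a qualified tag as "{namespace-uri}local-name".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}")[0]
        else:
            namespace = ""
        counts[namespace] = counts.get(namespace, 0) + 1
    return counts


counts = count_by_vocabulary(DOCUMENT)
print("XHTML elements:", counts[XHTML])
print("MathML elements:", counts[MATHML])
```

The parser knows nothing about XHTML or MathML in particular. It only applies the general XML rules, which is the sense in which both are applications of XML.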
vocabulary: a collection of words, terms, or tokens that are defined as a set for some specific use
Any time you do something with XML, you are defining a vocabulary and creating an application of XML, or using an existing vocabulary defined by someone else.
The way you create an XML application vocabulary is by developing a set of labels that you want to use as your markup commands and then telling the computer how to process those commands. Put simply, in XML, you make up your own markup rules and then tell the computer how to work with what you have created.
Obviously, there are limits to how far afield you can go in making up your own rules. XML is a set of rules that defines the limits of the rules you create for your XML application. It is also the set of tools that you use to tell the program using your XML documents how to process them.
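A minimal sketch of that idea, assuming Python and its standard xml.etree.ElementTree module: the labels library, book, title, and author, the year attribute, and the sample data are all invented for this example and belong to no standard. The markup is the made-up vocabulary, and the function list_books is one way of telling the computer how to process it.

```python
import xml.etree.ElementTree as ET

# A made-up vocabulary: these element and attribute names are invented
# for illustration only, as are the book entries themselves.
LIBRARY = """\
<library>
  <book year="1998">
    <title>Learning Markup</title>
    <author>A. Writer</author>
  </book>
  <book year="2001">
    <title>Markup in Practice</title>
    <author>B. Editor</author>
  </book>
</library>
"""


def list_books(xml_text):
    """Process the vocabulary: build one summary line per <book>."""
    root = ET.fromstring(xml_text)
    lines = []
    for book in root.findall("book"):
        title = book.findtext("title")
        author = book.findtext("author")
        year = book.get("year")
        lines.append(f"{title} by {author} ({year})")
    return lines


books = list_books(LIBRARY)
for line in books:
    print(line)
```

The XML rules still set the limits here: tags must nest and close properly, or the parser refuses the document before list_books ever sees a book.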
XML is a Set of Standards
How XML is supposed to work is determined by a set of standards. In computing terms, a standard specifies how programs should be implemented or, in the case of markup, how the markup language in question should be used.
XML is an odd creature because it is a set of standards for how to create extensions.
Extensions are special features added to an application of some programming language for use specifically in that application. The intent of extensions is to make a particular application do more interesting things than the competition so that people will want to use the application. They are also intended to customize an otherwise generic application for a specific task in a specific setting.
extension: a vendor-specific addition to the standards. Extensions are usually proprietary and should be avoided if possible
Extensions have been great for the purpose of advancing the technology, but not so good in terms of promoting usability and compatibility. What makes for a good networked technology is broad platform compatibility. Good networked technologies are not about the toys that come with them, no matter how nice they are.
What XML does is solve the compatibility problem with extensions by developing an application with a specific set of rules on how extensions should be implemented. These rules promote cross-platform compatibility and application usability. People realized that if extensions are going to happen no matter what, the least we could do is standardize the way in which they are implemented. Thus XML is all about creating extensions, but is aimed at extensions that are non-proprietary and fully cross-platform compatible.
One of the big challenges facing XML as a young language is that many companies are creating extensions of XML that aren't entirely in line with XML standards or compatible with each other. Ironically, this is exactly the sort of thing XML was designed to avoid. Hopefully, this is a phase in the development of XML that will see the adherents to standards come out on top.
W3C: the World Wide Web Consortium is an organization formed of interested parties who review and set standards for protocols and programming languages for use on the World Wide Web
XML standards are set by the [World Wide Web Consortium], or W3C. W3C is a governing body composed of dues-paying members, most of whom represent large corporations with a vested interest in the development of XML and other languages and tools related to the World Wide Web.
The [W3C XML 1.0 Specification] is the current definitive set of standards for XML. It includes rules on how to specify all components of an XML document or application.
Since XML is a language with many components, the XML 1.0 Specification is only one of many XML standards, but it is the core standard to which all other XML standards must adhere. If you are serious about XML, then you should work to familiarize yourself with the documentation on XML and related materials that are maintained by the W3C.
XML is SGML
SGML: the Standard Generalized Markup Language is the definitive markup language for computing
The Standard Generalized Markup Language, or SGML, is a standard document markup language that defines nearly every known way of marking up text for the computer. It allows documents to be marked up for display in most any language and most any medium. In the 1980s, it was the way in which documents were written in most any large scale situation where documents needed to be shared across multiple platforms and multiple media.
Enquire: a program created by Tim Berners-Lee that allowed him to track relationships between documents. It was similar to [Hypercard], a language developed by [Apple] for creating interactive help menus.
In the late 1980s, a man by the name of Tim Berners-Lee came up with the idea of combining SGML with a program he had created called Enquire, adapting it for use as a means of marking up scientific documents for sharing over networks.
SGML is a very complicated language. It is difficult to learn. It also requires a fair amount of processing power and expensive programs meant for multi-user networks in order to use it.
Tim Berners-Lee realized that he only needed a small piece of the SGML standard to do what he wanted. In fact he wanted to create something as easy to use as possible so that he and his scientist buddies could focus on the content of the documents, not writing the code to make them display properly.
The result of his work was HTML, the now familiar Hypertext Markup Language.
HTML is a subset of SGML. It is a portion developed for creating electronic documents for online viewing. Because HTML uses only a small subset of SGML, there is less to learn, so it is easier for people to use. It also takes less time, memory, and disk space for a computer to process HTML.
HTML is a very useful language, for what it does, but people quickly began to see its limits. HTML was designed specifically for people to share text documents online. Although Web browsers go well beyond HTML in what they support, everything beyond HTML in a Web page is a proprietary extension of some sort or another.
This is great for Web pages, but people saw the benefits of something better. People began to want something with the power and utility of SGML that was easy to use and designed specifically for online documents and data exchange. The objective was to be able to develop content not only usable across platforms, but also comprehensible across multiple media, including programs that were processing the documents for something other than human consumption.
The answer to this need was XML.
XML is also a subset of SGML. It is the portion of SGML that is used to develop online documents and to handle online data exchange. Although much harder to learn than HTML, it is much easier than SGML. More importantly, XML consumes less overhead than SGML. It is a tight language, with, as they say, 80% of the functionality with only 20% of the complexity.
One benefit of this adaptation of SGML is that XML is fully backwards compatible with SGML. Since XML is a stripped down version of SGML, the compatibility does not, unfortunately, run so smoothly the other way.
XML is a Good Online Document Language
XML was not developed to perform a specific task, but to be a generic development tool for online content, either for the purpose of generating documents for viewing or for data exchange. Thus it was developed around what a good online content language is, not what works best for a specific application.
So what goes into a good online document language? An ideal online document development language would have the following features:
A good Web language is a language that requires strict adherence to standards and good coding practices. It should be easy to confirm that it is well-written, accurate, and complete. Web browsers should not have to second-guess mistakes, but rather should be allowed to assume there are none and to ignore or reject what they do not understand.
This means that the language needs to be standards driven, not market driven. In order to achieve this, standards need to be flexible, or account for the need for flexibility. There must be a built-in mechanism for modifying the language.
The language should be descriptive. Structural markup languages assume humans reading a document in a visual medium. This does not account for people using other media, or computers trying to parse the document for a purpose other than screen display. Descriptive markup overcomes this by actually stating what the nature of the content is in the code, not just delimiting it based on its mode of presentation.
The language should be powerful but simple. It should be easily comprehensible by both computers and people. SGML is very powerful, but very complex. HTML is very simple but not very powerful. Something in between is needed.
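The first of these features, strictness, is easy to see in practice. The sketch below assumes Python and its standard xml.etree.ElementTree module. The helper name is_well_formed and the two sample strings are made up for this example. It shows an XML parser rejecting a document with an unclosed tag instead of guessing what the author meant.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every tag that opens is closed, in the right order.
good = "<note><to>Reader</to><body>Hello</body></note>"

# The <to> element is never closed, so </note> arrives in the wrong place.
bad = "<note><to>Reader<body>Hello</body></note>"

results = (is_well_formed(good), is_well_formed(bad))
print(results)
```

An HTML browser would typically try to repair the second string and display something. An XML parser is expected to stop at the error, which is what makes it cheap to confirm that a document is correctly written.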
The solution to this set of goals was the development of XML, or Extensible Markup Language. It meets all of the above criteria except one. It is still a very complex language. On the other hand, since it is a meta-language, it can be used to define other, much easier to use, markup languages.
So, how can XML be so simple that it can be covered in a few pages (as I have said elsewhere), yet so complex that it can be difficult to learn? Well, it is really easy to learn the basics. It is only after that that you need to fear drowning in information.