eXtensible Markup Language (XML) is one of the current buzzwords in the Technical Communication industry. Many feel that XML will make as much of an impact on the Internet as HTML did. Articles have been published to provide overviews of XML and to pique interest in it. However, because XML's power also means complexity, it's difficult to explain everything you need to know about XML in one article. Instead of focusing only on the advantages of XML, this article gets you started on the road to actually learning XML by teaching you how to write a simple XML document, display it in a browser, and format it for viewing on the Internet.
The first thing you should know is that XML is not an object-oriented programming language like Java or C++, nor is it a scripting language like JavaScript or VBScript. In fact, XML contains no procedural or functional capabilities. It's a meta-markup language—a system for defining languages. In short, you create custom "tags" and then structure those tags according to certain principles.
XML's real power is that it separates content from presentation, which allows you to write one XML source document to display and use in a variety of ways. While HTML presents visual and auditory elements, XML describes content. For example, in HTML the heading 2 tag only determines how to display text. An XML tag, by contrast, identifies the specific information it contains. After XML delivers data to the desktop, the data is parsed and can be edited and/or presented in multiple views without return trips to the server. XML can also describe data in a variety of applications, allowing for a language- and platform-neutral way to exchange data.
Why should you learn XML? This question is similar to one that was asked about five years ago in our industry: "Why should I learn HTML?" As e-commerce continues to grow, so will XML. A wide variety of XML-based languages are already in use. These include Channel Definition Format (CDF), Resource Description Framework (RDF), and Chemical Markup Language (CML). As e-commerce companies begin using XML to exchange information from server to server, Technical Communicators will be needed to work with XML in some form.
<?xml version="1.0" standalone="yes"?>
<STC_CONFERENCE>
  <CONFERENCE_TITLE>
    Renaissance Communicators - A Vision of Our Future
  </CONFERENCE_TITLE>
  <LOCATION>
    <CITY>Orlando</CITY>
    <STATE>Florida</STATE>
  </LOCATION>
  <CONFERENCE_DATES>
    <MONTH_NAME>May</MONTH_NAME>
    <START_DATE>22</START_DATE>
    <END_DATE>25</END_DATE>
    <YEAR>2000</YEAR>
  </CONFERENCE_DATES>
</STC_CONFERENCE>
The first line is the XML declaration. It's a processing instruction that states the document is an XML file and that it's a standalone document (it doesn't need to read other documents to function). The <STC_CONFERENCE> tag is the root element. All of the subsequent tags are either its child or descendant elements. By looking at this data, you can see that it has a hierarchical structure.
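If you'd like to confirm the hierarchy described above without a browser, a short script can do it. The following is only a sketch, and it assumes Python and its standard xml.etree.ElementTree module, neither of which this article otherwise uses. The name SAMPLE is made up for illustration; the markup it holds is the conference document shown earlier.

```python
import xml.etree.ElementTree as ET

# The article's sample document (SAMPLE is an illustrative name).
SAMPLE = """<?xml version="1.0" standalone="yes"?>
<STC_CONFERENCE>
  <CONFERENCE_TITLE>Renaissance Communicators - A Vision of Our Future</CONFERENCE_TITLE>
  <LOCATION>
    <CITY>Orlando</CITY>
    <STATE>Florida</STATE>
  </LOCATION>
  <CONFERENCE_DATES>
    <MONTH_NAME>May</MONTH_NAME>
    <START_DATE>22</START_DATE>
    <END_DATE>25</END_DATE>
    <YEAR>2000</YEAR>
  </CONFERENCE_DATES>
</STC_CONFERENCE>"""

root = ET.fromstring(SAMPLE)

# Child elements: the tags nested directly inside the root element.
children = [child.tag for child in root]

# Descendant elements: every tag below the root, at any depth.
descendants = [el.tag for el in root.iter() if el is not root]

print("Root element:", root.tag)
print("Children:", children)
print("Number of descendants:", len(descendants))
```

Here <CITY> is a descendant of the root but not one of its children, because it is nested inside <LOCATION>.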
After you incorporate the markup, the file must be saved as an .xml file. You can then open the document in a browser that supports XML documents (such as IE5), as seen in Figure 2:
Figure 2 - XML Source in Browser
Clicking a parent element expands or collapses the element to show or hide its child element(s). If the browser does not display the document, it means the document cannot be parsed because it isn't well-formed (correctly marked up according to XML standards).
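When the browser refuses to display a document, it helps to see the parser's complaint directly. Here is a minimal sketch of a well-formedness check, assuming Python's standard xml.etree.ElementTree module (not a tool this article relies on). The function name is_well_formed and the two test strings are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Properly nested: each element closes before its parent does.
good = "<LOCATION><CITY>Orlando</CITY><STATE>Florida</STATE></LOCATION>"

# Overlapping tags: </LOCATION> appears before </STATE> is closed.
bad = "<LOCATION><CITY>Orlando</CITY><STATE>Florida</LOCATION></STATE>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

This only tests whether the markup is well-formed. It says nothing about whether the tags make sense for your data.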
Although the document is properly structured, it's not aesthetically pleasing. So how do you format it so that it looks nice? With style sheets, which contain the rules for displaying an XML document (or instance) in a browser. What's beneficial about this is that you can present an XML document in several different ways just by changing the style sheet. Or, you can create many different style sheets to display a single XML document for specific purposes (print, screen display, PDAs, etc.).
You can do this by creating a Cascading Style Sheet (CSS), or by using eXtensible Style Language (XSL).
Figure 3 - CSS Source Code
When a rule is assigned to the <LOCATION> element, all of its child elements use the associated format. Rules can also be assigned to a single child element, such as the <CITY> element. After the syntax is incorporated into the text editor, save the file as a .css file. However, the browser must know which style sheet to apply to the document. To tell it, you must attach the style sheet to the XML document by opening the XML instance and inserting the following directly under the XML declaration:
<?xml-stylesheet type="text/css" href="filename.css"?>
This is a processing instruction that instructs the browser to apply the style sheet. The type attribute is the MIME type of the style sheet. In this case, its value is text/css. The href attribute's value declares where the style sheet is located. After saving the changes to the .xml file (and refreshing your browser if necessary), the document is displayed as seen in Figure 4:
Figure 4 - XML+CSS
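If the formatting doesn't appear, one thing to check is whether the style sheet processing instruction is where the parser expects it. The sketch below assumes Python's standard xml.dom.minidom module, which is not part of this article's toolset. The names doc_text and pseudo_attrs are made up, and the document is shortened to a single child element to keep the example brief.

```python
import re
from xml.dom import minidom

# A shortened copy of the sample document with the style sheet attached.
doc_text = """<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/css" href="filename.css"?>
<STC_CONFERENCE>
  <CONFERENCE_TITLE>Renaissance Communicators - A Vision of Our Future</CONFERENCE_TITLE>
</STC_CONFERENCE>"""

doc = minidom.parseString(doc_text)

# Processing instructions placed before the root element are
# children of the document node itself.
stylesheet_pis = [
    node
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    and node.target == "xml-stylesheet"
]

pi = stylesheet_pis[0]

# type and href are not true XML attributes, so the parser returns them
# as one string; a simple pattern splits them into name/value pairs.
pseudo_attrs = dict(re.findall(r'([\w-]+)="([^"]*)"', pi.data))

print(pi.target)
print(pseudo_attrs)
```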
One advantage of CSS is that it's fairly easy to learn and can format content for hardcopy and the Web as nicely as any layout application on the market. However, because CSS is not an XML language, you'll have to locate other resources to learn its syntax. You'll need to understand things like classes, pseudo-elements, display properties, etc., to really make your XML documents sing with CSS. Another advantage to using CSS is that major browsers support it. Even though CSS works as well with XML as it does with HTML, it doesn't allow you to really use XML to its full capacity. There is a style language for XML documents that's superior to CSS. It's called XSL.
For example, Figure 5 is basic XSL code to display the XML instance in a browser.
Figure 5 - XML+XSL-T
There are myriad ways that this could have been presented. However, as with anything new, it's best to begin with something simple. When using XSL, you'll need to think of your input data as individual nodes. When scripting against your source document, if you don't have a clear visual representation of your data, you might find it difficult (if not impossible) to transform your data. Therefore, you should consider creating a tree diagram of your XML instances to use as a reference when marking up your XSL style sheet.
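A tree diagram doesn't have to be drawn by hand. As one possible shortcut, the sketch below prints an indented outline of the sample document's nodes. It assumes Python's standard xml.etree.ElementTree module, which is outside the scope of this article, and the names SAMPLE and outline are hypothetical.

```python
import xml.etree.ElementTree as ET

# The article's sample document (SAMPLE is an illustrative name).
SAMPLE = """<?xml version="1.0" standalone="yes"?>
<STC_CONFERENCE>
  <CONFERENCE_TITLE>Renaissance Communicators - A Vision of Our Future</CONFERENCE_TITLE>
  <LOCATION>
    <CITY>Orlando</CITY>
    <STATE>Florida</STATE>
  </LOCATION>
  <CONFERENCE_DATES>
    <MONTH_NAME>May</MONTH_NAME>
    <START_DATE>22</START_DATE>
    <END_DATE>25</END_DATE>
    <YEAR>2000</YEAR>
  </CONFERENCE_DATES>
</STC_CONFERENCE>"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    text = (element.text or "").strip()
    label = f"{element.tag}: {text}" if text else element.tag
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(SAMPLE))
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one step down the hierarchy, so the outline doubles as a map of the paths you'll need when selecting nodes in a style sheet.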
After reading the preceding section, "Using CSS", you have a basic understanding of how style sheets can be used to define formatting rules for a complete set of tags. XSL-FO also allows you to define rules for your tags and do more with the layout than you can with XML+CSS. This includes cross-references, page numbers, etc.
XSL formatting objects are an XML vocabulary that you use to place your XML elements in a document. As with XSL-T, an XSL-FO document must be well-formed. There are over 50 XSL-FO elements and close to 200 formatting properties that can be used for these elements. The formatting properties for the objects are (by no accident) similar to the formatting properties of CSS. This formatting model is based on areas (similar to the box model used for CSS) that can contain other formatting objects, space, or text.
The formatting objects for XSL use the http://www.w3.org/XSL/Format/1.0 namespace. To begin your XSL-FO style sheet, use the following declaration:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl" xmlns:fo="http://www.w3.org/XSL/Format/1.0" result-ns="fo">
Although XSL-T and XSL-FO can function independently, you would be losing the separation of content and presentation by using only XSL-FO. Consequently, to take full advantage of XSL-FO, you would write your XSL style sheet to use XSL-T to transform your XML instances into XSL-FO vocabulary.
Figure 6 is an XSL style sheet that does just that.
Figure 6 - XML+XSL-FO/T
Look at the source code for this document and you'll notice it was written with XHTML+CSS. If you are up to the challenge, convert it to XML+CSS and then XML+XSL.
If you need help, or have general questions and/or comments, please feel free to contact me at: jprince@e-talkcorp.com.
Copyright © 2000 John Richard Prince