The X-Factor: Implications for Internet Programming Today and Tomorrow

Roy A. Boggs
Computer Information Systems, Florida Gulf Coast University
Ft. Myers, Florida, 33965, USA

Abstract

Internet programming is leaving the simple and sometimes loose programming path of the HTML era. Newer structures are more disciplined, browser and platform independent, and demand a new set of skills and programming knowledge. Using a simple HTML table, the following pages present an overview of the transformations that will soon be used to display this table and its contents. Structures with X- names, from (X)HTML through XML, XSL(T) and XSL(FO), are presented in programming examples. The result is a demonstration, based upon examples, of where the X-Factor is taking Internet programming. Implications for practice and pedagogy are summarized at the end of the paper.

Keywords: Internet Programming, Internet Development, XML, XHTML, XSL

WHERE ARE WE GOING?

There is hardly a discipline that is changing and advancing as fast as Internet design, development and management. A multidirectional process moving back and forth between disciplines and Internet structures has brought this about. Everyone is affected or influenced in some form by material available on the Internet. At the same time, Internet development finds itself faced with an ever-increasing myriad of new structures, newer versions of browsers with increased capabilities, more complex platforms, faster delivery systems, and demands that everything respond equally quickly and equally well. HTML alone won't do it anymore. Even if current versions of some browsers respond to weak code in a favorable manner, this situation is not likely to continue much longer. This is especially true if a set of web pages is to enjoy widespread use.
The Internet world is moving, or being pulled, into assimilating a set of recommendations designed to free Internet web pages from constrictions imposed by browsers, scripts, and platforms. These recommendations are developed and shepherded by the World Wide Web Consortium: W3C (http://www.w3.org/). All of the various pieces of the current and the new can be found there. The information is often overly technical. However, it is possible to cull tendencies and directions. It is also useful at times to take one example and follow it through the recent developments. This is what is done in the following review - as a sort of answer to the question: where are we going?

The various programs and examples are intended for those with an interest in Internet programming and who have some experience working in the Internet environment. They present points of reference and nothing more. Those with only a passing interest in Internet development may read the text, leaving the examples for those who prefer reading code. For this version, complete examples are given. Examples assume current browsers, or the necessary plug-ins for older browsers.

FROM HERE TO THERE

The ultimate goal of the review below is to show the direction in which the X-Factor is taking Internet programming. The X-Factor includes those programming and data structures beginning with the letter X: (X)HTML, XML, XSL(T), XSL(FO), etc. However, it will be seen in the examples that HTML code is not being eliminated. Quite the opposite: it is being given a disciplined hierarchical structure and then wrapped in constructs that make it language-, browser-, and platform-independent. The first key word in this process is 'deprecation'. Older tags, and some tags currently enjoying widespread usage, are being discouraged and will ultimately be discontinued. Three other important key words are 'well-formed', 'valid' and 'namespace'.
'Well-formed' indicates that a document's structure satisfies all of the rules that make it highly structured and predictable. A document that is not 'well-formed' will not be displayed. 'Valid' indicates that a document satisfies all of the rules for a particular type of document, which specify the elements and attributes that are allowed or required and in what format. Documents that are not 'valid' may not be displayed until corrections are made. 'Namespace' indicates a collection of related element names. These names are often distinctively marked (for example: xsl:name, fo:name) to prevent collisions between similar names in related documents.

The review begins with HTML, and then passes through (X)HTML to XML and XSL. There are other topics that might have been included, such as XPATH. However, the main topic is XML. It is the main path. First, from the beginning.

BASIC HTML

Employee Directory
Smith, James    James@OurFirm.com
Jones, Jill     Jill@OurFirm.com

The table above contains the caption 'Employee Directory' along with two data instances, 'name' and 'email', for each of two employees. The caption is set in a larger and bolder font. A table is used here as the example because tables represent basic units for creating and displaying documents, and for managing the display of objects. They may well displace frames, especially since frames are regarded as special structures in (X)HTML; and they may themselves eventually be replaced by Cascading Style Sheets (CSS) positioning elements and '*.inc' files.

HTML stands for Hypertext Markup Language. A basic HTML set of code for this table would be as follows. The various structures show how the data are to be displayed.
[Listing: HTML code for the Employee Directory table, in two columns. Only the cell text is preserved: Employee Directory; Smith, James, James@OurFirm.com; Jones, Jill, Jill@OurFirm.com.]
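As a point of reference, a basic HTML table of the kind described above might be sketched as follows. The markup, the `BASIC_HTML` name, and the `CellCollector` and `table_rows` helpers are assumptions made for illustration and are not the paper's own listing. Python's standard `html.parser` module is used only to show which rows and cells a lenient HTML parser finds in such markup.

```python
from html.parser import HTMLParser

# Hypothetical markup (an assumption, not the paper's listing): a plain HTML
# table whose caption is styled with the older <font> and <b> tags.
BASIC_HTML = """
<html>
<body>
<table border="1">
  <caption><font size="+1"><b>Employee Directory</b></font></caption>
  <tr><td>Smith, James</td><td>James@OurFirm.com</td></tr>
  <tr><td>Jones, Jill</td><td>Jill@OurFirm.com</td></tr>
</table>
</body>
</html>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:  # tolerate a cell that appears before any row
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current cell.
        if self._in_cell:
            self.rows[-1][-1] += data


def table_rows(markup):
    """Return the table body as a list of rows, each a list of cell strings."""
    collector = CellCollector()
    collector.feed(markup)
    collector.close()
    return collector.rows


if __name__ == "__main__":
    for name, email in table_rows(BASIC_HTML):
        print(name, email)
```

The `<font>` and `<b>` tags in the caption stand in for the kind of presentational markup that the later steps move into a style sheet.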
The code on the left can be executed on all current browsers. It is 'well-formed'. The hierarchical structure is maintained. For the sake of the examples below, a CSS
[Listing: (X)HTML code for the Employee Directory table. Only the cell text is preserved: Employee Directory; Smith, James, James@OurFirm.com; Jones, Jill, Jill@OurFirm.com.]
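A strict (X)HTML version of the same table might be sketched as follows. The markup is an assumption made for illustration (an XML declaration with 'iso-8859-1' encoding, the XHTML 1.0 Strict DOCTYPE, and a small style rule for the caption), and the names `STRICT_XHTML` and `is_well_formed` are hypothetical. The check uses Python's standard `xml.etree.ElementTree` parser, which tests only whether the document is 'well-formed'; it does not validate the document against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical strict (X)HTML document (an assumption, not the paper's listing).
# The XML declaration must be the very first thing in the document.
STRICT_XHTML = """<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
  <title>Employee Directory</title>
  <style type="text/css">
    caption { font-size: larger; font-weight: bold; }
  </style>
</head>
<body>
  <table>
    <caption>Employee Directory</caption>
    <tr><td>Smith, James</td><td>James@OurFirm.com</td></tr>
    <tr><td>Jones, Jill</td><td>Jill@OurFirm.com</td></tr>
  </table>
</body>
</html>
"""


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise.

    This checks well-formedness only (matching, properly nested tags and
    quoted attributes); it does not check validity against a DTD.
    """
    try:
        ET.fromstring(markup.encode("iso-8859-1"))
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(STRICT_XHTML))
    # Dropping one closing </td> leaves the tags improperly nested.
    broken = STRICT_XHTML.replace("</td></tr>", "</tr>", 1)
    print(is_well_formed(broken))
```

An unclosed cell of this kind is typically tolerated in loose HTML, while an XML parser rejects the whole document.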
This is an XML (eXtensible Markup Language) document and must by definition be 'well-formed'. The 'iso-8859-1' encoding is used here simply to ensure a larger character set. The DOCTYPE identifies the document as an (X)HTML document that is strictly 'well-formed'. The choices are 'strict', 'transitional' and 'frameset'. 'Strict' is preferable unless it is deemed necessary to use a tag even though it will eventually be deprecated or unless a [...]

Here again, while the XML document is language and platform independent, the scripting process is language dependent and it is also platform dependent. As stated above, it takes programming skill to work with such structures. The answer is to side-step formal code and to apply XSL(T) templates. The coding is straightforward and independence is built in.

XSL(T) AND XML FILES

Extensible Stylesheet Language Transformations (XSLT) transform and render XML documents into web pages. The possibilities for using XSLT are extensive and often complex. A possible XSLT file is as follows. It is named 'xmlXslEmployee.xsl'. The CSS [...]
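[Listing: XSLT code for xmlXslEmployee.xsl. Only the following text is preserved: the caption 'Employee Directory' and the column headings 'Name' and 'Email Address'.]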
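A stylesheet of this kind might be sketched as follows. The XSLT markup is an assumption made for illustration, as are the source element names `directory`, `employee`, `name` and `email` and the Python names `EMPLOYEE_XSLT` and `selected_paths`. Python's standard library does not include an XSLT processor, so the sketch does not run the transformation; it only confirms that the stylesheet is itself a 'well-formed' XML document and lists the paths that its `xsl:` elements select.

```python
import xml.etree.ElementTree as ET

# Namespace URI that marks elements as XSLT instructions (the xsl: prefix).
XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical stylesheet (an assumption, not the paper's listing). It emits
# one table row per employee element found in the source XML document.
EMPLOYEE_XSLT = """<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <table border="1">
          <caption>Employee Directory</caption>
          <tr><th>Name</th><th>Email Address</th></tr>
          <xsl:for-each select="directory/employee">
            <tr>
              <td><xsl:value-of select="name"/></td>
              <td><xsl:value-of select="email"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
"""


def selected_paths(stylesheet):
    """Return the 'select' paths of all xsl: elements, in document order.

    Parsing raises xml.etree.ElementTree.ParseError if the stylesheet is
    not well-formed XML.
    """
    root = ET.fromstring(stylesheet.encode("iso-8859-1"))
    prefix = "{" + XSL_NS + "}"
    return [
        element.get("select")
        for element in root.iter()
        if element.tag.startswith(prefix) and "select" in element.attrib
    ]


if __name__ == "__main__":
    for path in selected_paths(EMPLOYEE_XSLT):
        print(path)
```

The `xsl:` prefix is an instance of the namespace marking described earlier: it keeps the transformation instructions apart from the HTML elements they produce.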
An XML file is then used to process the file: xmlEmployee.xml:

James Smith     james@OurFirm.com
Smith, Jill     Jill@OurFirm.com

The addition of an XML processing instruction to the XML file above identifies the location of the XSLT file and is sufficient to produce the desired output with limited code, via a process that will in effect be independent of current browsers, scripts, and platforms. (This is not always the case, as some current browsers do not yet fully support XSLT without downloading and installing the relevant modules.)

XSL(FO) AND XML FILES

A final and important step is the ability to render XML documents with extensive formatting: to a display, to a printer, or to a *.pdf file. The formatting objects for these processes are presented as Extensible Stylesheet Language Formatting Objects (XSL-FO). The following example expands on the XSLT code above, and the XML remains the xmlEmployee.xml file.

[Listing: XSL-FO code. Only the text 'Employee Directory' is preserved.]

The expected namespace is 'fo' and it encapsulates an extensive and sophisticated page-markup language. Current browsers do not yet fully support XSL-FO. However, there exist several freeware sources that will convert these documents. The desired table below is the same as that for the more complex, dependent processes above.

Employee Directory
Smith, James    James@OurFirm.com
Jones, Jill     Jill@OurFirm.com

SOAP

The final step is to create a means of shipping XML documents freely across platforms. Simple Object Access Protocol (SOAP) is a 'protocol for exchanging information in a decentralized, distributed environment'. It is designed to permit XML formatted files to be exchanged independent of the sending or receiving platform. This step wraps the XML file in a SOAP envelope for shipping. The assumption, of course, is that the corresponding XSL(T/FO) file resides at the destination.

SOME IMPLICATIONS

What does one learn from this? The first lesson, and one of the most important, is that HTML is not going away quickly.
The initial step to Internet programming is still HTML. The second step is still a solid knowledge of the latest version of CSS. Basic Internet coding, such as the use of tables, continues with HTML structures, but all descriptive and positioning elements are reserved for Cascading Style Sheets. However, programming now comes in the form of XML structures.

Implications suggest that from the beginning one should learn to create 'strict' (X)HTML documents. This means that CSS should be a part of initial programming efforts, and that the resulting code must be 'well-formed' and the document 'valid'. All CSS and (X)HTML documents then need to be validated. This is a necessary step to ensure browser functionality.

The next steps lead from (X)HTML/CSS to XML and to XSL(T/FO). The learning progression is almost built in. XML by itself will render web pages and the schemata will display data in various formats. Soon, (X)HTML and CSS may not be needed in an applications environment. (However, one can assume that quick coding in (X)HTML and CSS will be around for a while.) XSL(T/FO) render web pages and *.rtf files respectively. They represent a necessary final step in the learning process.

What is missing is the ability to easily access databases, using XSL, XQuery or XQL, to pull and format the data with the proper tags into an XML file. Until this happens, scripting remains a necessary tool. So there is indeed more to come. The bottom line is still - as before - that users must first know and understand their data and their data structures. Otherwise, the X-Factor, however designed, may produce unintended consequences.