XSLT Best Practices

XSLT (Extensible Stylesheet Language Transformations) is a functional language for transforming XML documents into other formats such as plain text, HTML, or other XML.  XSLT is available in multiple versions, but version 1.0 is the most commonly used.  XSLT transforms are generally fast, and a stylesheet does not need to be compiled before you test a change.  It can be debugged with modern debuggers, and the output is easy to test by running a compare tool against a known-good result.  XSLT also makes it easier to keep a clear separation between business and display logic.

Uses

XSLT has numerous uses.  XML is easy to generate and can easily be transformed into the layout that other systems expect.  Many older EDI systems need to receive data in a fixed, flat file format.  One example of a fixed file format is the ABA file format used in the Australian banking industry.  XSLT can transform your data source into a flat file for another system to consume, and that same data source can then be transformed into HTML for display in a web browser.  In fact, it’s even possible to build an XSLT view engine for use with MVC to render content.

Another use for XSLT is creating dynamic documents in various formats such as Word, Excel, and PDF.  Starting with Office 2003, Microsoft began supporting the WordprocessingML (WordML) and SpreadsheetML (sometimes called ExcelML) data formats.  These formats are XML documents that represent a Word document or an Excel spreadsheet.  Data from a database can be easily transformed into either of these formats through the use of XSLT.  In addition, the same data source can also be transformed into XSL-FO to create PDF documents.

Besides the two uses above, you may want to consider using XSLT whenever you are working with templates, when you are working with XML data, or when you are working with static data that doesn’t need to live in a database.  An example of a template would be an email newsletter that gets sent out and is “mail-merged” with data from the database.

Of course, there are times when you could use XSLT to accomplish a programming task, but it might not be the right choice.  For instance, it might be easier to use LINQ to access data from an object hierarchy and then use a StringBuilder to build output rather than to use an XSLT to do the same thing.  An XSLT might also not be appropriate for generating output if you need to do a large amount of string manipulation.  String operations such as replace or split are not as easy to accomplish in XSLT 1.0 as they are in languages like C#.
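
To give a sense of the extra effort, here is a minimal sketch of a replace operation in XSLT 1.0.  The template name replace-all and its parameter names are hypothetical and chosen only for illustration.  XSLT 1.0 has no built-in replace function, so the template works through the string recursively with substring-before() and substring-after().

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: replaces every occurrence of $search in $text with $replacement. -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="search"/>
    <xsl:param name="replacement"/>
    <xsl:choose>
      <!-- contains() is true for an empty search string, so guard against endless recursion. -->
      <xsl:when test="string-length($search) &gt; 0 and contains($text, $search)">
        <xsl:value-of select="substring-before($text, $search)"/>
        <xsl:value-of select="$replacement"/>
        <!-- Recurse on the remainder of the string after the first match. -->
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $search)"/>
          <xsl:with-param name="search" select="$search"/>
          <xsl:with-param name="replacement" select="$replacement"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Base case: nothing left to replace. -->
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Usage: turns 2024-01-15 into 2024/01/15 regardless of the input document. -->
  <xsl:template match="/">
    <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="'2024-01-15'"/>
      <xsl:with-param name="search" select="'-'"/>
      <xsl:with-param name="replacement" select="'/'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

The same operation is a single method call in C#, which is the trade-off described above.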

Basics

Assuming that XSLT is the right solution for the task you are trying to accomplish, there are several basic things that a developer needs to be aware of. The first thing to remember is that XSLT is a functional language. Once a variable is set, it cannot be changed. To work with a changing value, you need to set up a template that calls itself recursively and passes the new value as a parameter. The following is an example of what that code might look like:

 <!-- Left-pads $value with $paddingChar until it is at least $totalWidth characters long. -->
 <xsl:template name="pad-left">
     <xsl:param name="totalWidth"/>
     <xsl:param name="paddingChar"/>
     <xsl:param name="value"/>

     <xsl:choose>
         <!-- Recurse with one more padding character while the value is too short. -->
         <xsl:when test="string-length($value) &lt; $totalWidth">
             <xsl:call-template name="pad-left">
                 <xsl:with-param name="totalWidth" select="$totalWidth"/>
                 <xsl:with-param name="paddingChar" select="$paddingChar"/>
                 <xsl:with-param name="value" select="concat($paddingChar, $value)"/>
             </xsl:call-template>
         </xsl:when>
         <!-- Base case: the value is long enough, so output it. -->
         <xsl:otherwise>
             <xsl:value-of select="$value"/>
         </xsl:otherwise>
     </xsl:choose>
 </xsl:template>
 

The template above performs the equivalent of the String.PadLeft method in .NET.  The pad-left template takes three parameters.  It checks whether the length of the value passed in is less than the total width specified.  If it is, the template calls itself again, passing in the same total width and padding character along with the value prefixed by the padding character.  This process repeats until the length of the value is greater than or equal to the total width passed into the template.
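
As a usage sketch, the stylesheet below calls pad-left to zero-pad a value to five characters, the kind of step a fixed-width file format requires.  It assumes the pad-left template has been saved in a separate stylesheet; the file name string-utils.xsl is hypothetical.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Hypothetical file name: a stylesheet that contains the pad-left template. -->
  <xsl:include href="string-utils.xsl"/>
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Capture the padded result in a variable, then write it out. -->
    <xsl:variable name="padded">
      <xsl:call-template name="pad-left">
        <xsl:with-param name="totalWidth" select="5"/>
        <xsl:with-param name="paddingChar" select="'0'"/>
        <xsl:with-param name="value" select="'42'"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Writes 00042 -->
    <xsl:value-of select="$padded"/>
  </xsl:template>
</xsl:stylesheet>
```

Note the inner single quotes in select="'0'" and select="'42'".  Without them, the values would be read as XPath expressions rather than string literals.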

Another important thing to know when working with XSLT is that namespaces affect how you select data from XML. For instance, let’s say you’re working with XML that starts with the following fragment:

 <FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
 

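In order to select data from this XML document, your XSLT needs to declare the namespace(s) used in the document and bind each one to a prefix.  In XPath 1.0 an unprefixed name matches only elements that are in no namespace, so a path such as FMPXMLRESULT/RESULTSET would select nothing here.  The prefix itself is arbitrary; only the namespace URI has to match.  For the example above you would do something like this: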

 <xsl:stylesheet version="1.0"
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                 xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                 xmlns:fm="http://www.filemaker.com/fmpxmlresult"
                 exclude-result-prefixes="msxsl fm">
 

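 <!-- The fm prefix is required on every element name from the source document. -->
 <xsl:template match="fm:FMPXMLRESULT">
     <xsl:apply-templates select="fm:RESULTSET" />
 </xsl:template>

 </xsl:stylesheet>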
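
To make the effect concrete, here is a minimal sketch that writes out the first column of every row.  It assumes the input follows the usual FMPXMLRESULT layout, with ROW, COL, and DATA elements under RESULTSET; that layout is an assumption, since the fragment above shows only the root element.  Every step in the path carries the fm prefix because child elements inherit the default namespace declared on the root.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fm="http://www.filemaker.com/fmpxmlresult">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The unprefixed path FMPXMLRESULT/RESULTSET/ROW would select nothing here. -->
    <xsl:for-each select="fm:FMPXMLRESULT/fm:RESULTSET/fm:ROW">
      <!-- Assumed layout: each ROW holds COL elements, each with a DATA child. -->
      <xsl:value-of select="fm:COL[1]/fm:DATA"/>
      <!-- One line of text output per row. -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```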
 

The last area I would like to focus on is the use of templates. XSLT provides two techniques for processing source data: push and pull. The push approach, as the name implies, pushes the source XML through the stylesheet, which has a template for each kind of node it needs to handle. The processor applies the appropriate template for a given node through the xsl:apply-templates instruction.  An example of this is as follows:

 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:template match="Orders">
     <html>
       <body>
         <xsl:apply-templates select="Invoice"/>
       </body>
     </html>
   </xsl:template>
   <xsl:template match="Invoice">
     <xsl:apply-templates select="CustomerName" />
     <p>
       <xsl:apply-templates select="Address" />
       <xsl:apply-templates select="City" />
       <xsl:apply-templates select="State" />
       <xsl:apply-templates select="Zip" />
     </p>
     <table>
       <tr>
         <th>Description</th>
         <th>Cost</th>
       </tr>
       <xsl:apply-templates select="Item" />
     </table>
     <p />
   </xsl:template>
   <xsl:template match="CustomerName">
     <h1><xsl:value-of select="." /></h1>
   </xsl:template>
   <xsl:template match="Address">
     <xsl:value-of select="." /><br />
   </xsl:template>
   <xsl:template match="City">
     <xsl:value-of select="." />
     <xsl:text>, </xsl:text>
   </xsl:template>
   <xsl:template match="State">
     <xsl:value-of select="." />
     <xsl:text> </xsl:text>
   </xsl:template>
   <xsl:template match="Zip">
     <xsl:value-of select="." />
   </xsl:template>
   <xsl:template match="Item">
     <tr>
       <xsl:apply-templates />
     </tr>
   </xsl:template>
   <xsl:template match="Description">
     <td><xsl:value-of select="." /></td>
   </xsl:template>
   <xsl:template match="TotalCost">
     <td><xsl:value-of select="." /></td>
   </xsl:template>
   <xsl:template match="*">
     <xsl:apply-templates />
   </xsl:template>
   <xsl:template match="text()" />
 </xsl:stylesheet>
 

The pull approach, on the other hand, makes minimal use of the xsl:apply-templates instruction and instead pulls the XML through the transform with the xsl:for-each and xsl:value-of instructions.  Using the pull technique, the above stylesheet would look something like this:

 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <xsl:template match="Orders">
     <html>
       <body>
         <xsl:for-each select="Invoice">
           <h1>
             <xsl:value-of select="CustomerName" />
           </h1>
           <p>
             <xsl:value-of select="Address" /><br />
             <xsl:value-of select="City" />
             <xsl:text>, </xsl:text>
             <xsl:value-of select="State" />
             <xsl:text> </xsl:text>
             <xsl:value-of select="Zip" />
           </p>
           <table>
             <tr>
               <th>Description</th>
               <th>Cost</th>
             </tr>
             <xsl:for-each select="Item">
               <tr>
                 <td><xsl:value-of select="Description" /></td>
                 <td><xsl:value-of select="TotalCost" /></td>
               </tr>
             </xsl:for-each>
           </table>
           <p />
         </xsl:for-each>
       </body>
     </html>
   </xsl:template>
 </xsl:stylesheet>
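
For reference, here is a small input document that either stylesheet could consume.  The element names come from the stylesheets above; the values are invented for illustration.

```xml
<!-- Illustrative input only: the values below are made up. -->
<Orders>
  <Invoice>
    <CustomerName>Jane Example</CustomerName>
    <Address>123 Main Street</Address>
    <City>Springfield</City>
    <State>IL</State>
    <Zip>62701</Zip>
    <Item>
      <Description>Widget</Description>
      <TotalCost>19.99</TotalCost>
    </Item>
    <Item>
      <Description>Gadget</Description>
      <TotalCost>5.00</TotalCost>
    </Item>
  </Invoice>
</Orders>
```

Both versions should yield the same HTML structure for this input.  One difference is worth noting: the push version's Item template calls xsl:apply-templates without a select, so the table cells come out in document order, while the pull version selects Description and TotalCost by name and does not depend on their order in the source.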
 

You can read more about these two approaches at http://www.xml.com/pub/a/2005/07/06/tr.html and http://www.ibm.com/developerworks/library/x-xdpshpul.html.

Best Practices

While XSLT is extremely fast and powerful, there are several rules to keep in mind in order to write quality code. They are as follows:

  1. Avoid the // operator near the root of the document, especially when transforming very large XML documents.  // is shorthand for the descendant-or-self axis, so it selects matching nodes at any depth below its starting point (the whole document when the path begins with //).  The processor has to scan far more of the document, which makes transforms slower.  It is best to avoid // altogether if possible and to use an explicit path instead.
  2. Avoid very long XPath expressions (i.e. more than a screen width long).  They make the XSLT logic difficult to read.
  3. Set the indent attribute in the output declaration to no when outputting XML or HTML.  Not only will this reduce the size of the file you generate, but it will also decrease the processing time.
 <xsl:output method="xml" indent="no"/>
 
  4. Try to use template matching (push method) instead of xsl:for-each loops (pull method) or named templates.  Named templates are fine to use for utility functions like the padding template listed above.  However, template matching tends to produce cleaner and more elegant code.
  5. Make use of built-in XSLT and XPath functions whenever possible.  A good example of this is when you are trying to concatenate strings.  One approach would be to use several xsl:value-of instructions.  However, it is much cleaner to use the XPath concat() function instead.
  6. If you are transforming a large amount of data through .NET code, you should use the XmlReader and XmlWriter classes, which stream the data.  If you try to use the XmlDocument class to read in your XML and the StringBuilder class to write out your output, you are likely to get an OutOfMemoryException, since both hold the entire document in memory and a large string needs one contiguous block of memory.
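
Several of these rules can be seen together in one short sketch.  It reuses the Orders and Invoice structure from the earlier examples; the Labels and Label output elements are hypothetical names chosen for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Indentation is switched off to keep the output small. -->
  <xsl:output method="xml" indent="no"/>

  <xsl:template match="/">
    <Labels>
      <!-- An explicit path is used instead of select="//Invoice". -->
      <xsl:apply-templates select="Orders/Invoice"/>
    </Labels>
  </xsl:template>

  <!-- A match template (push) handles each Invoice. -->
  <xsl:template match="Invoice">
    <Label>
      <!-- One concat() call replaces several xsl:value-of and xsl:text instructions. -->
      <xsl:value-of select="concat(City, ', ', State, ' ', Zip)"/>
    </Label>
  </xsl:template>
</xsl:stylesheet>
```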

Additional best practices can be found here:
http://www.xml.org//sites/www.xml.org/files/xslt_efficient_programming_techniques.pdf
http://www.onenaught.com/posts/23/xslt-tips-for-cleaner-code-and-better-performance

Conclusion

There are many situations in which XSLT is worth considering. The language tends to be verbose, and at times it can feel unnatural to program in if you are more accustomed to a procedural programming style. However, it is a flexible and powerful language that, with a little time, can be easy to pick up and learn. There are debugging and profiling tools available to make the development process easier. In addition, changes to an XSLT do not require compilation in order to test, and the output can easily be checked with a compare tool such as Araxis Merge.
