I have an existing XSLT stylesheet that takes XML and produces nicely formatted XHTML. I want to make an XSL-FO version of this stylesheet to produce PDF via Apache FOP. What I want to know is this:

Are there any convenient XSLT patterns I should learn for things like:

  • copying some nodes unaltered
  • copying most of a node, but adding extra attributes

I know I can create new nodes using

<xsl:element>

but are there any other useful things I will need? Note that while I haven't done a lot of copying from one XSLT format to another, I've done tons of XML-to-XHTML via XSLT, so I'm familiar with most of the core of the language.

+3  A: 

The biggest obstacle to transforming XSLT is that the output elements are in the same namespace (and usually carry the same prefix) as the actual XSL instructions in your transform. If you use “xsl:” both in your XSL instructions and your output, your XSLT engine cannot tell the XSL instructions it should execute from those it should output, so your stylesheet will not work as intended. That is, unless you use a namespace alias:

<xsl:namespace-alias stylesheet-prefix="x" result-prefix="xsl"/>

This instruction, which is placed directly inside <xsl:stylesheet>, allows you to write your result markup in your transform using a substitute namespace prefix (here "x", which must be bound to some other namespace URI in the stylesheet). Later, when the output document is created, the namespace that you actually want is substituted for the alias. So, for instance, here's a template that produces a template in your output document:

<xsl:template match="xsl:template[@match='title']">
   <x:template match="title">
      <x:apply-templates />
   </x:template>
</xsl:template>
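
For context, here is a minimal sketch of a complete stylesheet using the alias. The URI bound to the "x" prefix (urn:example:xslt-alias) is an arbitrary placeholder chosen for illustration; any URI other than the real XSLT namespace should do. The root template and its contents are likewise invented for the example.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:x="urn:example:xslt-alias">
  <!-- urn:example:xslt-alias is a placeholder URI (an assumption, not a standard value) -->

  <!-- Elements written with the x: prefix are emitted in the XSLT namespace -->
  <xsl:namespace-alias stylesheet-prefix="x" result-prefix="xsl"/>

  <!-- The x: elements below are literal result elements, so they are
       written to the output rather than executed -->
  <xsl:template match="/">
    <x:stylesheet version="1.0">
      <x:template match="title">
        <x:apply-templates/>
      </x:template>
    </x:stylesheet>
  </xsl:template>

</xsl:stylesheet>
```

What matters in the result is the namespace URI of the generated elements; which prefix the serializer prints for it can vary between processors.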

Here's a good article: http://www.xml.com/pub/a/2001/04/04/trxml/

James Sulak
Well, XSLT has "another XSLT namespace"; you can use it as well!
alamar
http://www.w3.org/1999/XSL/Transform is a normal namespace, and http://www.w3.org/1999/XSL/TransformAlias is another one :)
alamar
+5  A: 

The pattern you're looking for is the "modified identity transform". The basis of this approach is the identity transform rule, the first template rule in the stylesheet below. Each rule after that represents an exception to the copying behavior.

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- By default, copy all nodes unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- But strip out <foo> elements (including their content) -->
  <xsl:template match="foo"/>

  <!-- For <bar> elements, strip out start & end tags, but leave content --> 
  <xsl:template match="bar">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- For <bat> elements, insert an attribute and append a child --> 
  <xsl:template match="bat">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="id">123</xsl:attribute>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

What's least satisfying to me about the above is the duplication of logic found in the last template rule. That's a lot of code for just adding one attribute. And imagine if we need a bunch of these. Here's another approach that allows us to be more surgically precise in what we want to override:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- By default, copy all nodes unchanged -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates mode="add-atts" select="."/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

          <!-- By default, don't add any attributes -->
          <xsl:template mode="add-atts" match="*"/>

  <!-- For <bat> elements, insert an "id" attribute -->
  <xsl:template mode="add-atts" match="bat">
    <xsl:attribute name="id">123</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>

Finally, this can be carried much further, using a different mode for each kind of edit you might want to make:

<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- For <bat> elements, insert an "id" attribute -->
  <xsl:template mode="add-atts" match="bat">
    <xsl:attribute name="id">123</xsl:attribute>
  </xsl:template>

  <!-- Append <new-element/> to <bat> -->
  <xsl:template mode="append" match="bat">
    <new-element/>
  </xsl:template>

  <!-- Insert an element in <foo> content -->
  <xsl:template mode="insert" match="foo">
    <inserted/>
  </xsl:template>

  <!-- Add content before the <bar/> and <bat/> elements -->
  <xsl:template mode="before" match="bar | bat">
    <before-bat-and-bar/>
  </xsl:template>

  <!-- Add content only after <bat/> -->
  <xsl:template mode="after" match="bat">
    <after-bat/>
  </xsl:template>

  <!-- Here's the boilerplate code -->
  <!-- By default, copy all nodes unchanged -->
  <xsl:template match="@* | node()">
    <xsl:apply-templates mode="before" select="."/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates mode="add-atts" select="."/>
      <xsl:apply-templates mode="insert" select="."/>
      <xsl:apply-templates/>
      <xsl:apply-templates mode="append" select="."/>
    </xsl:copy>
    <xsl:apply-templates mode="after" select="."/>
  </xsl:template>

          <!-- By default, don't add anything -->
          <xsl:template mode="add-atts" match="*"/>
          <xsl:template mode="insert"   match="*"/>
          <xsl:template mode="append"   match="*"/>
          <xsl:template mode="before"   match="@* | node()"/>
          <xsl:template mode="after"    match="@* | node()"/>

</xsl:stylesheet>
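
To make the effect of the five modes concrete, here is a small worked example. The input document is hypothetical, invented only to exercise the rules above, and is written on one line so that no whitespace-only text nodes are involved. The output shown is what the rules should produce, ignoring any XML declaration the serializer adds.

```xml
<!-- Hypothetical input -->
<doc><foo>a</foo><bar/><bat/></doc>

<!-- Result of applying the stylesheet above -->
<doc><foo><inserted/>a</foo><before-bat-and-bar/><bar/><before-bat-and-bar/><bat id="123"><new-element/></bat><after-bat/></doc>
```

Note the ordering inside each copied element: "add-atts" runs before "insert" and the children, because attributes have to be added to an element before any child content.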

In XSLT 2.0, some of the boilerplate can be slightly simplified, thanks to multi-mode template rules:

          <!-- By default, don't add anything -->
          <xsl:template mode="add-atts
                              insert
                              append
                              before
                              after" match="@* | node()"/>

I sometimes use all of these custom modes in the same stylesheet, but more often than not I add them lazily--as needed.

Evan Lenz
Sorry it took a while. This is exactly what I wanted (I think). I need to test it a little bit more.
Jay Stevens
A: 

In the past I have developed XSL-FO stylesheets and then used the Render-X FO2HTML stylesheet to convert the XSL-FO into HTML. It converts <block> elements into <div>, <inline> into <span>, etc.

I haven't used them before, but you might consider trying HTML2FO stylesheets, or at least looking them over to borrow some ideas.

Since HTML lacks some of the pagination constructs that FO provides, it might not give you everything you need for your XSL-FO output, but it could probably handle most of the conversion logic from HTML to XSL-FO elements and attributes in the body of the document.
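
As a rough illustration of that kind of element-to-element mapping (this is not taken from the HTML2FO stylesheets; it is a minimal sketch, and the choice of elements, the master name "page", and the A4 page size are assumptions made for the example), an XHTML-to-FO stylesheet might start out like this:

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fo="http://www.w3.org/1999/XSL/Format"
  xmlns:h="http://www.w3.org/1999/xhtml"
  exclude-result-prefixes="h">

  <!-- Pagination scaffolding that HTML has no equivalent for.
       "page" and the A4 dimensions are illustrative choices. -->
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
                               page-height="29.7cm" page-width="21cm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <!-- Only the body is processed; the XHTML head is skipped -->
          <xsl:apply-templates select="//h:body"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Block-level XHTML elements become fo:block -->
  <xsl:template match="h:div | h:p">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- Inline XHTML elements become fo:inline -->
  <xsl:template match="h:span">
    <fo:inline>
      <xsl:apply-templates/>
    </fo:inline>
  </xsl:template>

  <!-- Mapping an element and adding a formatting attribute at the same time -->
  <xsl:template match="h:b | h:strong">
    <fo:inline font-weight="bold">
      <xsl:apply-templates/>
    </fo:inline>
  </xsl:template>

</xsl:stylesheet>
```

This sketch assumes the body contains only block-level children that are mapped here; unmapped elements fall through to the built-in rules, which keep their text and drop their tags, so a real stylesheet would need more rules before the result is valid FO.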

Mads Hansen