I've come across the following snippet in an XSL file that I'm working with. The XSL is basically converting HTML tags and international content (characters with accents mostly) into a format digestible by QuarkXPress.

I'm not familiar with XSL at all, but judging by the code, it looks like we're checking some content against a regular expression, converting it if it matches, and otherwise passing it along to see if the next template can match it.

The approach seems OK to my untrained eyes but the XSL file is full of duplication.

There must be a cleaner way of writing this. Can you help me out?

Edit: Explaining duplication.

The snippet below contains two nearly identical blocks, and there are about 50 more like them in the file. The only things that change between blocks are: the template name, the regex, the content of the matching-substring tag, and the template called in the non-matching-substring block.

<!-- convert HTML <br> tag to ASCII/Quark new line tag -->
    <xsl:template name="break-tag">
     <xsl:param name="string" select="string(.)"/>
     <xsl:analyze-string select="$string" regex="&lt;br&gt;" flags="i">
      <xsl:matching-substring>
       <xsl:text disable-output-escaping="yes">&lt;\n&gt;</xsl:text>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
       <xsl:call-template name="open-list-tag"/> 
      </xsl:non-matching-substring>
     </xsl:analyze-string>
    </xsl:template>

<!-- convert HTML <li> tag to Bull Text stylesheet with bullet tag and tab tag -->
    <xsl:template name="open-list-tag">
     <xsl:param name="string" select="string(.)"/>
     <xsl:analyze-string select="$string" regex="&lt;li&gt;" flags="i">
      <xsl:matching-substring>
       <xsl:text disable-output-escaping="yes">@F07/2 Bullet Points:</xsl:text>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
       <xsl:call-template name="euro-char-entity"/>
      </xsl:non-matching-substring>
     </xsl:analyze-string>
    </xsl:template>
A:

XSLT is a language for transforming XML, and it is itself written as XML. Every instruction is an XML element, so every opening tag needs a matching closing tag.

For XSL, this snippet is actually very concise.

Your summary of what the code does is correct. XSL is pretty easy to learn, but I'll clarify a few tags for you:

<xsl:template name="break-tag">
...
</xsl:template>

An xsl:template with a name attribute is roughly equivalent to a function.

<xsl:param name="string" select="string(.)"/>

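This is an input parameter to the template/function. The select attribute gives its default value: string(.) is the string value of whatever is in scope (the context item) when the template is called without passing the parameter explicitly.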
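To make the default-versus-explicit behaviour concrete, here is a minimal sketch. The template name `shout` and the `p` element are made up for illustration and are not part of the original stylesheet; it assumes an XSLT 2.0 processor, which `xsl:analyze-string` already requires.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical named template: upper-cases whatever it is given. -->
  <xsl:template name="shout">
    <!-- Default: the string value of the context item at call time. -->
    <xsl:param name="string" select="string(.)"/>
    <xsl:value-of select="upper-case($string)"/>
  </xsl:template>

  <xsl:template match="p">
    <!-- No xsl:with-param, so $string defaults to the text of this <p>. -->
    <xsl:call-template name="shout"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Explicit value, which overrides the default. -->
    <xsl:call-template name="shout">
      <xsl:with-param name="string" select="'explicit value'"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

The templates in the question rely on the first form: none of the `xsl:call-template` calls pass a parameter, so each template in the chain picks up the current context item.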

<xsl:analyze-string select="$string" regex="&lt;br&gt;" flags="i">
    <xsl:matching-substring>
     <xsl:text disable-output-escaping="yes">&lt;\n&gt;</xsl:text>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
     <xsl:call-template name="open-list-tag"/> 
    </xsl:non-matching-substring>
</xsl:analyze-string>

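This splits the string using a regular expression. Every substring that matches is replaced by "<\n>" in the output. Every substring that doesn't match is handed to another template (open-list-tag), which picks it up through its default string(.) parameter and checks it against the next regex.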
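A stripped-down sketch of that splitting behaviour, using a hard-coded sample string instead of real input. The sample text and the use of `method="text"` are my assumptions; here the non-matching pieces are simply copied through rather than passed to another template.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The sample string contains <br> twice, in different cases. -->
    <xsl:analyze-string select="'one&lt;br&gt;two&lt;BR&gt;three'"
                        regex="&lt;br&gt;" flags="i">
      <xsl:matching-substring>
        <!-- Runs once per match: emit the Quark new line tag. -->
        <xsl:text>&lt;\n&gt;</xsl:text>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Runs once per piece between matches: "." is that piece. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Run against any XML document, this should write `one<\n>two<\n>three`: the two matches are replaced and the three pieces in between are copied unchanged.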


Edit - On duplication

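XSL has a switch/case instruction (xsl:choose), and XPath 2.0 adds regex functions such as matches(). Note that matches() only tests whether the string contains the pattern; it does not replace substrings the way xsl:analyze-string does. You might be able to modify this to do what you need: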

<xsl:choose>
   <xsl:when test="matches(string(.),'&lt;br&gt;','i')">
     <xsl:text disable-output-escaping="yes">&lt;\n&gt;</xsl:text>
   </xsl:when>
   <xsl:when test="matches(string(.),'&lt;li&gt;','i')">
     <xsl:text disable-output-escaping="yes">@F07/2 Bullet Points:</xsl:text>
   </xsl:when>
   <xsl:otherwise>
     <xsl:text>Unknown Tag: </xsl:text>
     <xsl:value-of select="string(.)"/>
   </xsl:otherwise>
</xsl:choose>
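
If you want to keep the replace-inside-the-text behaviour of the original templates but drop the 50 near-identical blocks, another option is to move the regex/replacement pairs into a table and walk it with one recursive function. This is only a sketch under some assumptions: the `rule` elements, the `my:convert` function, and the `urn:example:quark` namespace are names I made up, it needs an XSLT 2.0 processor, and it uses `method="text"` instead of disable-output-escaping.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:quark"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical rule table: one <rule> per conversion, in priority order. -->
  <xsl:variable name="rules" as="element(rule)*">
    <rule regex="&lt;br&gt;" out="&lt;\n&gt;"/>
    <rule regex="&lt;li&gt;" out="@F07/2 Bullet Points:"/>
    <!-- ...add the remaining conversions here... -->
  </xsl:variable>

  <!-- Apply the first rule to $text; pass each unmatched piece to the rest. -->
  <xsl:function name="my:convert" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="todo" as="element(rule)*"/>
    <xsl:choose>
      <!-- No rules left: the text goes through unchanged. -->
      <xsl:when test="empty($todo)">
        <xsl:sequence select="$text"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="parts" as="xs:string*">
          <!-- regex is an attribute value template, so it can come from the rule. -->
          <xsl:analyze-string select="$text"
                              regex="{$todo[1]/@regex}" flags="i">
            <xsl:matching-substring>
              <xsl:sequence select="string($todo[1]/@out)"/>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
              <xsl:sequence select="my:convert(., subsequence($todo, 2))"/>
            </xsl:non-matching-substring>
          </xsl:analyze-string>
        </xsl:variable>
        <xsl:sequence select="string-join($parts, '')"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Run every text node through the whole rule table. -->
  <xsl:template match="text()">
    <xsl:value-of select="my:convert(., $rules)"/>
  </xsl:template>

</xsl:stylesheet>
```

This mirrors the chaining in the original file (each non-matching piece falls through to the next rule), so adding a conversion becomes one new `rule` line instead of a new template. One caveat: a regex that can match an empty string is an error in `xsl:analyze-string`, so each rule needs to match at least one character.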
Zack
Beautifully detailed and helpful answer!
Carl Smotricz
Thanks for the explanation Zack. Do you have any advice, now that I've edited the question, on how to cut down on the duplication?
StevenWilkins
See edits above
Zack
That's a beauty Zack, exactly what I'm looking for.
StevenWilkins