I'm tokenising a string with XSLT 1.0 and trying to prevent empty strings from being recognised as tokens. Here's the entire function, based on the one in the XSLT Cookbook:

<xsl:template name="tokenize">
    <xsl:param name="string" select="''" />
    <xsl:param name="delimiters" select="';#'" />
    <xsl:param name="tokensplitter" select="','" />
    <xsl:choose>
     <!-- Nothing to do if empty string -->
     <xsl:when test="not($string)" />

     <!-- No delimiters signals character level tokenization -->
     <xsl:when test="not($delimiters)">
      <xsl:call-template name="_tokenize-characters">
       <xsl:with-param name="string" select="$string" />
       <xsl:with-param name="tokensplitter" select="$tokensplitter" />
      </xsl:call-template>
     </xsl:when>
     <xsl:otherwise>
      <xsl:call-template name="_tokenize-delimiters">
       <xsl:with-param name="string" select="$string" />
       <xsl:with-param name="delimiters" select="$delimiters" />
       <xsl:with-param name="tokensplitter" select="$tokensplitter" />
      </xsl:call-template>
     </xsl:otherwise>
    </xsl:choose>
</xsl:template>

<xsl:template name="_tokenize-characters">
    <xsl:param name="string" />
    <xsl:param name="tokensplitter" />
    <xsl:if test="$string">
     <token><xsl:value-of select="substring($string, 1, 1)"/></token>
     <xsl:call-template name="_tokenize-characters">
      <xsl:with-param name="string" select="substring($string, 2)" />
     </xsl:call-template>
    </xsl:if>
</xsl:template>

<xsl:template name="_tokenize-delimiters">
    <xsl:param name="string" />
    <xsl:param name="delimiters" />
    <xsl:param name="tokensplitter" />

    <!-- Extract a delimiter -->
    <xsl:variable name="delimiter" select="substring($delimiters, 1, 1)"/>
    <xsl:choose>
     <!-- If the delimiter is empty we have a token -->
     <xsl:when test="not($delimiter) and $string != ''">
      <xsl:text>£</xsl:text>
      <token><xsl:value-of select="$string"/></token>
      <xsl:text>$</xsl:text>
      <xsl:value-of select="$tokensplitter"/>
     </xsl:when>
     <!-- If the string contains at least one delimiter we must split it -->
     <xsl:when test="contains($string, $delimiter)">
      <!-- If it starts with the delimiter we don't need to handle the before part -->
      <xsl:if test="not(starts-with($string, $delimiter))">
       <!-- Handle the part that comes before the current delimiter with the next delimiter. -->
       <!-- If there is no next the first test in this template will detect the token. -->
       <xsl:call-template name="_tokenize-delimiters">
        <xsl:with-param name="string" select="substring-before($string, $delimiter)" />
        <xsl:with-param name="delimiters" select="substring($delimiters, 2)" />
        <xsl:with-param name="tokensplitter" select="$tokensplitter" />
       </xsl:call-template>
      </xsl:if>
      <!-- Handle the part that comes after the delimiter using the current delimiter -->
      <xsl:call-template name="_tokenize-delimiters">
       <xsl:with-param name="string" select="substring-after($string, $delimiter)" />
       <xsl:with-param name="delimiters" select="$delimiters" />
       <xsl:with-param name="tokensplitter" select="$tokensplitter" />
      </xsl:call-template>
     </xsl:when>
     <xsl:otherwise>
      <!-- No occurrences of current delimiter so move on to next -->
      <xsl:call-template name="_tokenize-delimiters">
       <xsl:with-param name="string" select="$string" />
       <xsl:with-param name="delimiters" select="substring($delimiters, 2)" />
       <xsl:with-param name="tokensplitter" select="$tokensplitter" />
      </xsl:call-template>
     </xsl:otherwise>
    </xsl:choose>
</xsl:template>

Value for string that I'm passing in is:

Europe;#6;#Global;#3;#Middle East, Africa and Caucasus;2;#Europe;#6;#Global;#3;#Middle East, Africa and Caucasus

(The £ and $ indicators are just there so I can see no empty strings are output. This is within SharePoint so is difficult to debug.)

This code hangs the XSLT processing. The line causing the problem is <xsl:when test="not($delimiter) and $string != ''">. As soon as I remove the second condition (the part after and), it works again. I've also tried and string($string) in its place, without success.

Anyone know why this is happening and how to resolve it?

A: 

Isn't string a reserved word? Can you try replacing that name with something else?

EDIT: Supplied code ran without problem here: XSLT Tryit Editor v1.0 using:

<xsl:call-template name="tokenize">
   <xsl:with-param name="string">Europe;#6;#Global...</xsl:with-param>
</xsl:call-template>
Rubens Farias
It's straight from the XSLT Cookbook and has been working until now... Does seem strange they would use that as a variable name.
Alex Angas
Can you please post more code, or a simple test case? I don't know that cookbook
Rubens Farias
Updated the question with complete code and example.
Alex Angas
maybe not; how do you detect that _infinite loop_?
Rubens Farias
+3  A: 

I believe my suspicion was correct: when both $string and $delimiter are empty, your first <xsl:when> no longer matches, so execution falls through to the later branches, which call the template again with the same parameters, causing an infinite loop, as you say.

Add the following new <xsl:when> clause after the first one:

    <xsl:when test="not($delimiter) and $string = ''" />

That will stop the recursion there, and prevent execution from reaching the later branches when it shouldn't.


A more elaborate explanation of what's going on and why it's looping:

There are three branches in the <xsl:choose> block.

    <xsl:when test="not($delimiter) and $string != ''">
    <xsl:when test="contains($string, $delimiter)">
    <xsl:otherwise>

So, when both $string and $delimiter are empty, the first condition fails (because $string != '' is false). The second condition passes, because contains('', '') returns true (confirmed in Visual Studio). Inside that branch, starts-with('', '') is also true, so the "before" call is skipped, and the "after" call passes substring-after('', ''), which is again the empty string, together with the unchanged (and empty) $delimiters. The template therefore calls itself with exactly the same parameters. Ergo, an infinite loop.

The fix is to add a new, empty condition:

    <xsl:when test="not($delimiter) and $string != ''">
    <xsl:when test="not($delimiter) and $string = ''" />
    <xsl:when test="contains($string, $delimiter)">
    <xsl:otherwise>
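
For reference, here is a minimal sketch of the whole template with the extra condition in place, wrapped in a stylesheet so it can be run on its own. The match="/" driver, the <tokens> wrapper and the sample string 'Europe;#6;#' are assumptions added for illustration; the £/$ debug markers and the tokensplitter parameter from the question are left out to keep it short.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>

  <!-- Hypothetical driver: ignores the input document and tokenizes a sample
       string whose trailing '#' leaves an empty remainder -->
  <xsl:template match="/">
    <tokens>
      <xsl:call-template name="_tokenize-delimiters">
        <xsl:with-param name="string" select="'Europe;#6;#'"/>
        <xsl:with-param name="delimiters" select="';#'"/>
      </xsl:call-template>
    </tokens>
  </xsl:template>

  <xsl:template name="_tokenize-delimiters">
    <xsl:param name="string"/>
    <xsl:param name="delimiters"/>
    <xsl:variable name="delimiter" select="substring($delimiters, 1, 1)"/>
    <xsl:choose>
      <!-- Delimiters exhausted, non-empty string: emit a token -->
      <xsl:when test="not($delimiter) and $string != ''">
        <token><xsl:value-of select="$string"/></token>
      </xsl:when>
      <!-- Delimiters exhausted, empty string: emit nothing and stop recursing -->
      <xsl:when test="not($delimiter) and $string = ''"/>
      <!-- String contains the current delimiter: split it -->
      <xsl:when test="contains($string, $delimiter)">
        <xsl:if test="not(starts-with($string, $delimiter))">
          <!-- Part before the delimiter, handled with the remaining delimiters -->
          <xsl:call-template name="_tokenize-delimiters">
            <xsl:with-param name="string" select="substring-before($string, $delimiter)"/>
            <xsl:with-param name="delimiters" select="substring($delimiters, 2)"/>
          </xsl:call-template>
        </xsl:if>
        <!-- Part after the delimiter, handled with the full delimiter list -->
        <xsl:call-template name="_tokenize-delimiters">
          <xsl:with-param name="string" select="substring-after($string, $delimiter)"/>
          <xsl:with-param name="delimiters" select="$delimiters"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Current delimiter not present: move on to the next one -->
      <xsl:otherwise>
        <xsl:call-template name="_tokenize-delimiters">
          <xsl:with-param name="string" select="$string"/>
          <xsl:with-param name="delimiters" select="substring($delimiters, 2)"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this should produce two <token> elements, Europe and 6. The empty remainder after the final # ends up in the new empty <xsl:when>, which is where the recursion stops.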


EDIT: I've poked around and I can't find a reference to the defined behaviour of contains when the second parameter is empty or nil. Tests have shown that Microsoft Visual Studio's XSLT engine returns true when the second parameter is either empty or nil. I'm not sure if that's the defined behaviour or if it's up to the implementor to decide. Does anyone have a conclusive answer to this? Tomalak, I'm looking at you.
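
If you want to check your own engine, a small probe stylesheet along these lines may help. It is only a sketch: the labels and the choice of test cases are mine, and it deliberately lists no expected results. As far as I can tell the XPath 1.0 Recommendation doesn't spell out the empty-string cases for these functions, while the XPath 2.0 functions specification does say that fn:contains returns true when the second argument is the zero-length string.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Runs against any input document; prints one line per probe -->
  <xsl:template match="/">
    <xsl:text>contains('', ''): </xsl:text>
    <xsl:value-of select="contains('', '')"/>
    <xsl:text>&#10;contains('abc', ''): </xsl:text>
    <xsl:value-of select="contains('abc', '')"/>
    <xsl:text>&#10;starts-with('', ''): </xsl:text>
    <xsl:value-of select="starts-with('', '')"/>
    <!-- String results are wrapped in brackets so an empty result is visible -->
    <xsl:text>&#10;substring-before('abc', ''): [</xsl:text>
    <xsl:value-of select="substring-before('abc', '')"/>
    <xsl:text>]&#10;substring-after('abc', ''): [</xsl:text>
    <xsl:value-of select="substring-after('abc', '')"/>
    <xsl:text>]&#10;substring-after('', ''): [</xsl:text>
    <xsl:value-of select="substring-after('', '')"/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The boolean results are printed as true or false because xsl:value-of converts them to strings.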

Welbog
Thank you, I can see the error here.
Alex Angas
@Alex Angas: No problem. Just make sure to never focus on just one section of an `if/elseif` block (or its equivalent in whatever language you are using). The entire block needs to be considered when debugging one part of it.
Welbog
+1 For an excellent clarification / explanation. Yes, I should know better.
Alex Angas
I've updated my answer to reflect some extra tests I did on my own.
Welbog
Revised it *again* due to *even more* testing. XSLT functions do a lot of things I don't expect on edge cases...
Welbog
Further inspections into XSLT's treatment of `nil` have driven me to madness. I no longer have any idea what the hell is going on when XPath functions deal with empty or nil strings. For the love of lasers, please run your own tests whenever you might have an empty string in XSLT because it doesn't make any sense. These are the worst string parsing functions I have ever seen. I've been working with them for years and only now have discovered the horrors that lie therein. **Be very, very careful when you deal with XSLT's handling of empty strings. It's insanely weird.**
Welbog
Will pack laser in future work with XSLT! Thanks Welbog.
Alex Angas