Ok, this one has been driving me up the wall...

I have an XSLT function that is supposed to split out the zip-code part from a zip+city string, depending on the country. I cannot get it to work! This is what I have so far:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/functions" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:function name="exslt:GetZip" as="xs:string">
    <xsl:param name="zipandcity" as="xs:string"/>
    <xsl:param name="countrycode" as="xs:string"/>
    <xsl:choose>
        <xsl:when test="$countrycode='DK'">
            <xsl:analyze-string select="$zipandcity" regex="(\d{4}) ([A-Za-zÆØÅæøå]{3,24})">
                <xsl:matching-substring>
                    <xsl:value-of select="regex-group(1)"/>
                </xsl:matching-substring>
                <xsl:non-matching-substring>    
                        <xsl:text>fail</xsl:text>
                </xsl:non-matching-substring> 
            </xsl:analyze-string>               
        </xsl:when>
        <xsl:otherwise>
            <xsl:text>error</xsl:text>
        </xsl:otherwise>
    </xsl:choose>
</xsl:function>

I am running it on a source XML where the following values are passed to the function:

  • zipandcity: "DK-2640 København SV"
  • countrycode: "DK"

...will output 'fail'!

I think there is something I am misunderstanding here...

+2  A: 

Regular expressions are only supported in XSLT 2.x -- not in XSLT 1.0.

Dimitre Novatchev
Well spotted! I changed the version number in the code (as well as my snippet above here), but I still get the same results!
Fedor Steeman
Just changing the version number will not help, you need to use an XSLT 2.0 *processor*.
Dimitre Novatchev
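One way to check which processor actually runs the transformation is a small probe stylesheet. This is only a sketch: it relies on the standard system-property() function, which exists in both XSLT 1.0 and 2.0, and it can be run against any input document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Matches the root of whatever input document is supplied -->
  <xsl:template match="/">
    <xsl:text>XSLT version: </xsl:text>
    <!-- Reports the version the processor implements,
         not the version attribute of the stylesheet -->
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
  </xsl:template>
</xsl:stylesheet>
```

If the reported version is below 2.0, the processor is an XSLT 1.0 engine running the stylesheet in forwards-compatible mode, and xsl:function and xsl:analyze-string will not be available.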
I am using Stylus Studio...
Fedor Steeman
+1  A: 

The regex attribute is parsed as an attribute value template, where curly braces have a special meaning. If this is in fact an XSLT 2.0 stylesheet, you need to escape the curly braces in the regex attribute by doubling them: (\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}})
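To see the escaping in isolation, here is a minimal sketch that applies the doubled-brace regex to a string that contains nothing but a zip code and a city. The named template main is a hypothetical entry point (for example, for a processor that lets you start at a named template), and the literal input is an assumption chosen so that the whole string matches.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical entry point; invoke as the initial template -->
  <xsl:template name="main">
    <!-- {{ and }} reach the regex engine as literal { and }.
         With single braces, {4} and {3,24} would be evaluated as
         XPath expressions and replaced by their string values. -->
    <xsl:analyze-string select="'2640 København'"
        regex="(\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}})">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(1)"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:text>fail</xsl:text>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

On an XSLT 2.0 processor this should output 2640, because the regex covers the entire input and no non-matching substring is left over.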

Alternatively you could define a variable containing your pattern like this:

<xsl:variable name="pattern">(\d{4}) ([A-Za-zÆØÅæøå]{3,24})</xsl:variable>
<xsl:analyze-string select="$zipandcity" regex="{$pattern}">
Jörn Horstmann
+2  A: 

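Aside from the facts that regexes aren't supported until XSLT 2.0 and that braces have to be escaped (but backslashes don't), there's one more reason why that code won't work: xsl:analyze-string does not anchor the regex. It splits the input into matching and non-matching substrings and processes each piece in turn. Given the string DK-2640 København SV, your regex only matches 2640 København, so the leftover pieces (DK- in front and SV at the end) go through xsl:non-matching-substring and produce fail. You need to "pad" the regex to make it consume the whole string: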

regex=".*(\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}}).*"

.* is probably sufficient in this case, but sometimes you have to be more specific. For example, if there's more than one place where \d{4} could match, you might use \D* at the beginning to make sure the first capturing group matches the first bunch of digits.
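Putting the pieces together, the function from the question might look like the sketch below. It keeps the original function name and namespaces, doubles the braces, and uses the \D* variant of the padding. The named template main and the exclude-result-prefixes attribute are additions for illustration only, and it assumes an XSLT 2.0 processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exslt="http://exslt.org/functions"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="exslt xs">
  <xsl:output method="text"/>

  <xsl:function name="exslt:GetZip" as="xs:string">
    <xsl:param name="zipandcity" as="xs:string"/>
    <xsl:param name="countrycode" as="xs:string"/>
    <xsl:choose>
      <xsl:when test="$countrycode = 'DK'">
        <!-- \D* skips everything before the first digit and .* swallows
             the rest, so the regex consumes the whole string and nothing
             is left over for xsl:non-matching-substring -->
        <xsl:analyze-string select="$zipandcity"
            regex="\D*(\d{{4}}) ([A-Za-zÆØÅæøå]{{3,24}}).*">
          <xsl:matching-substring>
            <xsl:value-of select="regex-group(1)"/>
          </xsl:matching-substring>
          <xsl:non-matching-substring>
            <xsl:text>fail</xsl:text>
          </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>error</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Hypothetical entry point that calls the function with the
       values from the question -->
  <xsl:template name="main">
    <xsl:value-of select="exslt:GetZip('DK-2640 København SV', 'DK')"/>
  </xsl:template>
</xsl:stylesheet>
```

With the values from the question, the regex should match the full string and the function should return 2640. Inputs that do not fit the pattern at all still fall through to the fail branch.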

Alan Moore
Excellent! This really did the trick for Danish postal codes at least. Now I will try to use your suggestion for the other countries as well. Thanks!
Fedor Steeman
Figured out some good working regexes for postal codes of countries like Sweden and the Netherlands too, thanks to your tip. Thanks again!
Fedor Steeman