Given this XML snippet:

<forms>
<FORM lob="BO" form_name="AI OM 10"/>
<FORM lob="BO" form_name="CL BP 03 01"/>
<FORM lob="BO" form_name="AI OM 107"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="123 DDE"/>
<FORM lob="BO" form_name="CL BP 00 02"/>
<FORM lob="BO" form_name="AI OM 98"/>
</forms>

I need to sort the FORM nodes alphabetically by form_name so that all the forms containing 'AI OM' in the form_name are grouped together, and within each group they are in numeric order by the integers (and the same for the other forms).

The form_name format is open season: letters and numbers can appear in any order:

XX ## ##
XX XX ##
XX XX ###
XX XX ## ##
XX ###
XX XXXX
## XXX
XXX###

What I THINK needs to happen is that the string needs to be split into its alpha and numeric parts. The numeric part could probably be sorted with any spaces removed, I suppose.

I am at a loss as to how to split the string and then cover all the sorting/grouping combinations given that there are no rules around the 'form_name' format.

We are using XSLT 2.0. Thanks.

+2  A: 

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="vDigits" select="'0123456789 '"/>
 <xsl:variable name="vAlpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ '"/>

 <xsl:template match="/*">
  <forms>
   <xsl:for-each select="FORM">
    <xsl:sort select="translate(@form_name,$vDigits,'')"/>
    <xsl:sort select="translate(@form_name,$vAlpha,'')"
        data-type="number"/>
    <xsl:copy-of select="."/>
   </xsl:for-each>
  </forms>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<forms>
    <FORM lob="BO" form_name="AI OM 10"/>
    <FORM lob="BO" form_name="CL BP 03 01"/>
    <FORM lob="BO" form_name="AI OM 107"/>
    <FORM lob="BO" form_name="CL BP 00 02"/>
    <FORM lob="BO" form_name="123 DDE"/>
    <FORM lob="BO" form_name="CL BP 00 02"/>
    <FORM lob="BO" form_name="AI OM 98"/>
</forms>

produces the wanted, correct result:

<forms>
    <FORM lob="BO" form_name="AI OM 10"/>
    <FORM lob="BO" form_name="AI OM 98"/>
    <FORM lob="BO" form_name="AI OM 107"/>
    <FORM lob="BO" form_name="CL BP 00 02"/>
    <FORM lob="BO" form_name="CL BP 00 02"/>
    <FORM lob="BO" form_name="CL BP 03 01"/>
    <FORM lob="BO" form_name="123 DDE"/>
</forms>

Do note:

  1. Two <xsl:sort> instructions implement the two-phase sorting.

  2. The XPath translate() function is used to produce either the alpha-only sort-key or the digits-only sort-key.
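
To make the two sort keys visible, here is a minimal sketch (not part of the transformation above) that prints both keys for every FORM instead of sorting. The KEY element and its name, alpha and digits attributes are hypothetical names used only for this illustration.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Same character sets as in the sorting stylesheet; both include the space -->
 <xsl:variable name="vDigits" select="'0123456789 '"/>
 <xsl:variable name="vAlpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ '"/>

 <xsl:template match="/*">
  <keys>
   <xsl:for-each select="FORM">
    <!-- alpha:  form_name with digits and spaces removed
         digits: form_name with upper-case letters and spaces removed -->
    <KEY name="{@form_name}"
         alpha="{translate(@form_name,$vDigits,'')}"
         digits="{translate(@form_name,$vAlpha,'')}"/>
   </xsl:for-each>
  </keys>
 </xsl:template>
</xsl:stylesheet>
```

For the sample document, "AI OM 10" gets alpha="AIOM" and digits="10", while "CL BP 03 01" gets alpha="CLBP" and digits="0301", which the second <xsl:sort> then compares as the number 301. Since $vAlpha lists only upper-case letters, a lower-case letter in form_name would stay in the digits key and turn it into NaN.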

Dimitre Novatchev
@Dimitre: +1 This is compact, and it should be used if one can rely on the number format (i.e., there is no `CL BP 03 03` and `CL BP 03 4`)
Alejandro
A: 

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="forms">
        <xsl:apply-templates>
            <xsl:sort select="normalize-space(
                                translate(@form_name,
                                          '0123456789',
                                          ''))"/>
            <xsl:sort select="substring-before(
                                concat(
                                  normalize-space(
                                    translate(@form_name,
                                              translate(@form_name,
                                                        '0123456789 ',
                                                        ''),
                                              '')),
                                  ' '),' ')" data-type="number"/>
            <xsl:sort select="substring-after(
                                normalize-space(
                                  translate(@form_name,
                                            translate(@form_name,
                                                      '0123456789 ',
                                                      ''),
                                            '')),
                                  ' ')" data-type="number"/>
        </xsl:apply-templates>
    </xsl:template>
</xsl:stylesheet>

Output:

<FORM lob="BO" form_name="AI OM 10"></FORM>
<FORM lob="BO" form_name="AI OM 98"></FORM>
<FORM lob="BO" form_name="AI OM 107"></FORM>
<FORM lob="BO" form_name="CL BP 00 02"></FORM>
<FORM lob="BO" form_name="CL BP 00 02"></FORM>
<FORM lob="BO" form_name="CL BP 03 01"></FORM>
<FORM lob="BO" form_name="123 DDE"></FORM>

XSLT 2.0 solution: this stylesheet

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="forms">
        <xsl:apply-templates>
            <xsl:sort select="string-join(tokenize(@form_name,' ')
                                            [not(. castable as xs:integer)],
                                          ' ')"/>
            <xsl:sort select="xs:integer(tokenize(@form_name,' ')
                                            [. castable as xs:integer][1])"/>
            <xsl:sort select="xs:integer(tokenize(@form_name,' ')
                                            [. castable as xs:integer][2])"/>
        </xsl:apply-templates>
    </xsl:template>
</xsl:stylesheet>
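
The stylesheet above only treats whole space-separated tokens as numbers, so a name in the `XXX###` shape (letters and digits with no space between them) would end up entirely in the alpha key. A possible variant, offered as a sketch under that assumption and not tested against the real data, builds the keys from runs of digits instead of tokens. Like the original, it assumes at most two numeric groups per name.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xsl:output method="xml" indent="yes"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="forms">
        <xsl:apply-templates select="FORM">
            <!-- Alpha key: replace each run of digits with a space,
                 then collapse the whitespace -->
            <xsl:sort select="normalize-space(
                                replace(@form_name,'[0-9]+',' '))"/>
            <!-- Numeric keys: split on runs of non-digits, drop the
                 empty strings, and cast the first and second runs -->
            <xsl:sort select="xs:integer(tokenize(@form_name,'[^0-9]+')
                                            [. ne ''][1])"/>
            <xsl:sort select="xs:integer(tokenize(@form_name,'[^0-9]+')
                                            [. ne ''][2])"/>
        </xsl:apply-templates>
    </xsl:template>
</xsl:stylesheet>
```

With these keys a hypothetical name such as "ABC123" would sort under "ABC" with the number 123, and "AI OM 10" still sorts under "AI OM" with the number 10.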
Alejandro
A: 

Thank you both for the answers. I am on vacation and using another computer, so I can't vote on the answers right now; I should sign up for an account.

johkar