I would like to have a list sorted ignoring any initial definite/indefinite articles 'the' and 'a'. For instance:

  • The Comedy of Errors
  • Hamlet
  • A Midsummer Night's Dream
  • Twelfth Night
  • The Winter's Tale

I think perhaps in XSLT 2.0 this could be achieved along the lines of:

<xsl:template match="/">
  <xsl:for-each select="play">
    <xsl:sort select="if (starts-with(title, 'A ')) then substring(title, 3) else
                      if (starts-with(title, 'The ')) then substring(title, 5) else title"/>
    <p><xsl:value-of select="title"/></p>
  </xsl:for-each>
</xsl:template>

However, I want to use in-browser processing, so I have to use XSLT 1.0. Is there any way to achieve this in XSLT 1.0?

+2  A: 

Here is how I would do that:

<xsl:template match="plays">
    <xsl:for-each select="play">
      <xsl:sort select="substring(title, 1 + 2*starts-with(title, 'A ') + 4*starts-with(title, 'The '))"/>
      <p>
        <xsl:value-of select="title"/>
      </p>
    </xsl:for-each>
</xsl:template>

Update: I forgot to add 1 to the expression (classic off-by-one error). Positions in substring() are 1-based, so the offset has to start at 1.

Well, starts-with is available in XSLT 1.0 (it is an XPath 1.0 function). Proof link: the first search result in Google is "XSLT 1.0: function starts-with".

Gart
The problem is that `starts-with` is XSLT 2.0, I think - *Edit* I stand corrected!
Phil
@Gart - I corrected myself as soon as I hit post 23 minutes ago; I just didn't want to remove the original comment because it seems suspicious to delete one's own mistakes. Cheers.
Phil
Excellent simple, creative solution Gart – Thanks!
ChrisV
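
For readers checking the offset arithmetic in the substring() answer above: XPath 1.0 converts true/false to 1/0 when a boolean is used in arithmetic, so the start position works out to 3 for 'A ', 5 for 'The ', and 1 otherwise. The following is a minimal Python sketch (not XSLT) that mirrors that expression. `sort_key` is a hypothetical helper name, and Python's code-point ordering only stands in for whatever collation the browser's xsl:sort applies.

```python
def sort_key(title: str) -> str:
    """Mirror substring(title, 1 + 2*starts-with(title,'A ') + 4*starts-with(title,'The '))."""
    # Python booleans act as 0/1 in arithmetic, like XPath 1.0 booleans converted to numbers.
    start = 1 + 2 * title.startswith("A ") + 4 * title.startswith("The ")
    # XPath substring() is 1-based; Python slicing is 0-based.
    return title[start - 1:]


titles = [
    "The Comedy of Errors",
    "Twelfth Night",
    "A Midsummer Night's Dream",
    "The Winter's Tale",
    "Hamlet",
]

for title in sorted(titles, key=sort_key):
    print(sort_key(title), "<-", title)
```

The two starts-with() tests can never both be true for the same title, so the offsets never add up to something unexpected.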
+2  A: 

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="plays">
     <p>Plays sorted by title: </p>
        <xsl:for-each select="play">
          <xsl:sort select=
          "concat(substring-after(@title, 'The '),
                  substring-after(@title, 'A '),
                  @title
                  )
         "/>
          <p>
            <xsl:value-of select="@title"/>
          </p>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

when applied to this XML document:

<t>
 <plays>
  <play title="The Comedy of Errors"/>
  <play title="Twelfth Night"/>
  <play title="A Midsummer Night's Dream"/>
  <play title="The Winter's Tale"/>
  <play title="Hamlet"/>
 </plays>
</t>

produces the wanted, correct result:

<p>Plays sorted by title: </p>

<p>The Comedy of Errors</p>
<p>Hamlet</p>
<p>A Midsummer Night's Dream</p>
<p>Twelfth Night</p>
<p>The Winter's Tale</p>
Dimitre Novatchev
A second excellent solution, Dimitre, which avoids the maths of Gart's solution – thanks also! Since I had to think about it slightly more, I'll add for others' benefit: for 'The Winter's Tale', for example, the value it will sort on will be 'Winter's TaleThe Winter's Tale'.
ChrisV
@ChrisV: Yes, the advantage of this solution is that it is simpler and leaves fewer chances for arithmetic mistakes. Although the sort keys look strange, they don't affect the values that are finally output.
Dimitre Novatchev
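
The sort keys discussed in the two comments above can be listed explicitly. This is a rough Python sketch (not XSLT) that imitates the concat()/substring-after() expression from the answer. `substring_after` and `concat_key` are hypothetical helper names, and plain code-point ordering is assumed in place of the XSLT processor's collation.

```python
def substring_after(text: str, sep: str) -> str:
    """Imitate XPath 1.0 substring-after(): text after the first sep, or '' if sep is absent."""
    # str.partition returns (text, '', '') when sep is not found, so index 2 is '' in that case.
    return text.partition(sep)[2]


def concat_key(title: str) -> str:
    """Mirror concat(substring-after(@title,'The '), substring-after(@title,'A '), @title)."""
    return substring_after(title, "The ") + substring_after(title, "A ") + title


titles = [
    "The Comedy of Errors",
    "Twelfth Night",
    "A Midsummer Night's Dream",
    "The Winter's Tale",
    "Hamlet",
]

for title in sorted(titles, key=concat_key):
    print(repr(concat_key(title)), "<-", title)
```

Titles without a leading article get their own text as the key, because both substring-after() calls contribute an empty string.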
There may be issues if "The " is in the middle of the text, as in "...Something The Something..." (sorry, can't think of any real example). Also, in the general case the article "An " should be treated the same way.
Gart
@Gart: Certainly. But this question strictly says the articles are at the start of the title -- there is no point in stripping an article from the middle of a title for the purpose of sorting.
Dimitre Novatchev
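
To make the mid-title concern concrete, here is a small Python sketch (not XSLT) comparing the concat() key with a key that only strips a leading article, including "An ". The title "Zorro and The Mask" is invented for illustration, "An Ideal Husband" is outside the question's data, and `concat_key`, `anchored_key` and `ARTICLES` are hypothetical names. In XSLT 1.0 the anchored behaviour could presumably be written as substring(@title, 1 + 2*starts-with(@title, 'A ') + 3*starts-with(@title, 'An ') + 4*starts-with(@title, 'The ')), which is an untested extension of the first answer.

```python
ARTICLES = ("The ", "An ", "A ")


def concat_key(title: str) -> str:
    """Imitate concat(substring-after(t,'The '), substring-after(t,'A '), t)."""
    # partition(...)[2] is '' when the separator is absent, like XPath substring-after().
    return title.partition("The ")[2] + title.partition("A ")[2] + title


def anchored_key(title: str) -> str:
    """Strip an article only when the title starts with it."""
    for article in ARTICLES:
        if title.startswith(article):
            return title[len(article):]
    return title


titles = ["Twelfth Night", "Zorro and The Mask", "An Ideal Husband", "Hamlet"]

print("concat key:  ", sorted(titles, key=concat_key))
print("anchored key:", sorted(titles, key=anchored_key))
```

With the concat() key the invented title is filed under "Mask" and "An Ideal Husband" under "An", whereas the anchored key files them under "Zorro" and "Ideal".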