I'm looking to take a delimited list of category IDs provided in one element...

<Categories>851|849</Categories>
<MatchType>any</MatchType>

...and use them to style other elements...

<Page CategoryIds="848|849|850|851">Page 1</Page>
<Page CategoryIds="849|850|">Page 2</Page>
<Page CategoryIds="848|850|">Page 3</Page>
<Page CategoryIds="848|849|850|851">Page 4</Page>
<Page CategoryIds="848|850|851">Page 5</Page>
<Page CategoryIds="848|849|850">Page 6</Page>

...based on whether or not they possess any (or all... depending on what's indicated in <MatchType>) of the given IDs.

Also, the IDs aren't necessarily going to be given in the order that they appear in the CategoryIds attribute, and the string inside the attribute isn't expected to contain the exact <Categories> string.

Is something like this possible using XSLT/XPath 1.0? I know that 2.0 has a tokenizing function that would be perfect for this, but unfortunately the CMS I am working with does not yet support 2.0.

Any help would be greatly appreciated!!

+1  A: 

While it is not clear at all from this question what you want to do, here is an answer to a part of it:

Is something like this possible using XSLT/XPath 1.0? I know that 2.0 has a tokenizing function that would be perfect for this, but unfortunately the CMS I am working with does not yet support 2.0.

Tokenization has been done in XSLT 1.0 for many years. While it is possible to write your own recursive template for tokenizing a string, it is good to remember that such a solution is already available in the FXSL library. It is guaranteed to work, is more powerful than the typical hand-written tokenizer, has no known bugs, and is ready to use.

This is the str-split-to-words template and here is one typical example of using it:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common">

   <xsl:import href="strSplit-to-Words.xsl"/>

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
      <xsl:variable name="vwordNodes">
        <xsl:call-template name="str-split-to-words">
          <xsl:with-param name="pStr" select="/"/>
          <xsl:with-param name="pDelimiters" 
                          select="', &#9;&#10;&#13;'"/>
        </xsl:call-template>
      </xsl:variable>

      <xsl:apply-templates select="ext:node-set($vwordNodes)/*"/>
    </xsl:template>

    <xsl:template match="word">
      <xsl:value-of select="concat(position(), ' ', ., '&#10;')"/>
    </xsl:template>
</xsl:stylesheet>

when this transformation is applied on the following XML document:

<t>Sorry, kid, first-borns really are smarter.
First-borns are typically smarter, while
younger siblings get better grades and
are more outgoing, the researchers say</t>

the wanted, correct result is produced:

1 Sorry
2 kid
3 first-borns
4 really
5 are
6 smarter.
7 First-borns
8 are
9 typically
10 smarter
11 while
12 younger
13 siblings
14 get
15 better
16 grades
17 and
18 are
19 more
20 outgoing
21 the
22 researchers
23 say

Do note that the template accepts a parameter named pDelimiters in which multiple delimiters can be specified.
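
For comparison, here is a minimal sketch of the kind of hand-written recursive template mentioned earlier. It is not part of FXSL: the template name tokenize and its pDelim parameter are hypothetical, chosen only for this illustration, and it handles a single non-empty delimiter string rather than a set of delimiter characters.

```xslt
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

   <!-- Hypothetical helper (not from FXSL): emits one <word> element per
        non-empty token of $pStr. Assumes $pDelim is a non-empty string;
        an empty delimiter would recurse forever. -->
   <xsl:template name="tokenize">
     <xsl:param name="pStr"/>
     <xsl:param name="pDelim" select="'|'"/>

     <xsl:choose>
       <!-- A delimiter is present: emit the head, recurse on the tail -->
       <xsl:when test="contains($pStr, $pDelim)">
         <xsl:variable name="vHead"
              select="substring-before($pStr, $pDelim)"/>
         <xsl:if test="$vHead != ''">
           <word><xsl:value-of select="$vHead"/></word>
         </xsl:if>
         <xsl:call-template name="tokenize">
           <xsl:with-param name="pStr"
                select="substring-after($pStr, $pDelim)"/>
           <xsl:with-param name="pDelim" select="$pDelim"/>
         </xsl:call-template>
       </xsl:when>
       <!-- No delimiter left: the remainder is the last token, if any -->
       <xsl:when test="$pStr != ''">
         <word><xsl:value-of select="$pStr"/></word>
       </xsl:when>
     </xsl:choose>
   </xsl:template>

   <!-- Usage example with a literal string taken from the question -->
   <xsl:template match="/">
     <words>
       <xsl:call-template name="tokenize">
         <xsl:with-param name="pStr" select="'851|849'"/>
       </xsl:call-template>
     </words>
   </xsl:template>
</xsl:stylesheet>
```

Applied to any XML document, this should produce a words element containing two word children, 851 and 849. As with str-split-to-words, the result is a result tree fragment in XSLT 1.0, so it still has to go through an extension such as ext:node-set() before its elements can be selected with XPath.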

Update: I finally understood what the OP wants with this problem. Here is my solution, which again uses the str-split-to-words template for tokenization:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
>

   <xsl:import href="strSplit-to-Words.xsl"/>

   <!-- to be applied upon: test-strSplit-to-Words2.xml -->

   <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
      <xsl:variable name="vCategories">
        <xsl:call-template name="str-split-to-words">
          <xsl:with-param name="pStr" select=
          "/*/select-criteria/Categories"/>
          <xsl:with-param name="pDelimiters" 
                          select="'|'"/>
        </xsl:call-template>
      </xsl:variable>

      <xsl:apply-templates select="*/pages/Page">
        <xsl:with-param name="pCategories" select=
         "ext:node-set($vCategories)"/>
        <xsl:with-param name="pMatchType" select=
        "*/select-criteria/MatchType"/>
      </xsl:apply-templates>
    </xsl:template>

    <xsl:template match="Page">
     <xsl:param name="pCategories"/>
     <xsl:param name="pMatchType" select="'any'"/>

     <xsl:variable name="vDecoratedCurrent"
          select="concat('|', @CategoryIds, '|')"/>

     <xsl:variable name="vSelected" select=
      "$pCategories/*
                [$pMatchType = 'any']
                   [contains($vDecoratedCurrent,
                             concat('|', ., '|')
                              )
                   ][1]

       or
        not($pCategories/*[not(contains($vDecoratedCurrent,
                                        concat('|', ., '|')
                                        )
                               )
                          ][1]
            )
       "/>

       <xsl:copy-of select="self::node()[$vSelected]"/>
    </xsl:template>
</xsl:stylesheet>

when this transformation is applied on this XML document:

<t>
 <select-criteria>
  <Categories>851|849</Categories>
  <MatchType>any</MatchType>
 </select-criteria>
 <pages>
  <Page CategoryIds="848|849|850|851">Page 1</Page>
  <Page CategoryIds="849|850|">Page 2</Page>
  <Page CategoryIds="848|850|">Page 3</Page>
  <Page CategoryIds="848|849|850|851">Page 4</Page>
  <Page CategoryIds="848|850|851">Page 5</Page>
  <Page CategoryIds="848|849|850">Page 6</Page>
 </pages>
</t>

the wanted, correct result is produced:

<Page CategoryIds="848|849|850|851">Page 1</Page>
<Page CategoryIds="849|850|">Page 2</Page>
<Page CategoryIds="848|849|850|851">Page 4</Page>
<Page CategoryIds="848|850|851">Page 5</Page>
<Page CategoryIds="848|849|850">Page 6</Page>

When in the XML document we specify:

  <MatchType>all</MatchType>

we again get the wanted, correct result:

<Page CategoryIds="848|849|850|851">Page 1</Page>
<Page CategoryIds="848|849|850|851">Page 4</Page>
Dimitre Novatchev
@Dimitre: +1 Nice solution! This is how to do it when it is possible to tokenize into a node-set by means of an extension. Very good.
Alejandro
+1  A: 

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:variable name="vMatch">
        <Categories>851|849</Categories>
        <MatchType>any</MatchType>
    </xsl:variable>
    <xsl:param name="pMatch" select="document('')/*/xsl:variable[@name='vMatch']"/>
    <xsl:template match="@*|node()" name="identity">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="Page" name="page">
        <xsl:param name="pCategories" select="$pMatch/Categories"/>
        <xsl:if test="$pCategories != ''">
            <xsl:variable name="vTest" select="contains(concat('|',
                                                                   @CategoryIds,
                                                                   '|'),
                                                            concat('|',
                                                                   substring-before(concat($pCategories,
                                                                                           '|'),
                                                                                    '|'),
                                                                   '|'))"/>
            <xsl:choose>
                <xsl:when test="$vTest and ($pMatch/MatchType = 'any' or
                                            substring-after($pCategories,
                                                            '|')
                                            = '')">
                    <xsl:call-template name="identity"/>
                </xsl:when>
                <xsl:when test="($vTest and $pMatch/MatchType = 'all') or
                                $pMatch/MatchType = 'any' ">
                    <xsl:call-template name="page">
                        <xsl:with-param name="pCategories" select="substring-after($pCategories,'|')"/>
                    </xsl:call-template>
                </xsl:when>
            </xsl:choose>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>

With this input:

<Pages>
    <Page CategoryIds="848|849|850|851">Page 1</Page>
    <Page CategoryIds="849|850|">Page 2</Page>
    <Page CategoryIds="848|850|">Page 3</Page>
    <Page CategoryIds="848|849|850|851">Page 4</Page>
    <Page CategoryIds="848|850|851">Page 5</Page>
    <Page CategoryIds="848|849|850">Page 6</Page>
</Pages>

Output:

<Pages>
    <Page CategoryIds="848|849|850|851">Page 1</Page>
    <Page CategoryIds="849|850|">Page 2</Page>
    <Page CategoryIds="848|849|850|851">Page 4</Page>
    <Page CategoryIds="848|850|851">Page 5</Page>
    <Page CategoryIds="848|849|850">Page 6</Page>
</Pages>

Note: Because I don't know where your Categories to test come from, I put them inline in the stylesheet. The template includes an optimization: after testing the first category, it succeeds (calls the identity template) if the category is found and either the match type is any or it is the last category to test. Otherwise, it makes a recursive call only if the category is found and the match type is all, or the category is not found and the match type is any. So it succeeds on the first match in "any" mode and fails on the first miss in "all" mode.

Edit: Just for fun, with Dimitre's input:

<t>
 <select-criteria>
  <Categories>851|849</Categories>
  <MatchType>all</MatchType>
 </select-criteria>
 <pages>
  <Page CategoryIds="848|849|850|851">Page 1</Page>
  <Page CategoryIds="849|850">Page 2</Page>
  <Page CategoryIds="848|850">Page 3</Page>
  <Page CategoryIds="848|849|850|851">Page 4</Page>
  <Page CategoryIds="848|850|851">Page 5</Page>
  <Page CategoryIds="848|849|850">Page 6</Page>
 </pages>
</t>

One line XPath 2.0:

/t/*/Page[(
           /t/*/MatchType = 'any' 
                   and 
           tokenize(/t/*/Categories,'\|') = tokenize(@CategoryIds,'\|')
          ) or (
           /t/*/MatchType = 'all' 
                   and 
           (every $x in tokenize(/t/*/Categories,'\|') 
            satisfies $x = tokenize(@CategoryIds,'\|'))
          )]

With the XPath 2.1 `let` expression, it would be less verbose...
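
As a rough sketch of what that could look like: the XPath 2.1 working draft was later published as XPath 3.0, which does include a let expression. The stylesheet below wraps the same test in one, assuming an XSLT 3.0 processor (which the OP's CMS would not have). The variable names $cats and $type are only illustrative.

```xslt
<xsl:stylesheet version="3.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output indent="yes" omit-xml-declaration="yes"/>

    <xsl:template match="/">
        <!-- Tokenize the criteria once and bind them with "let",
             instead of repeating the paths in both branches -->
        <xsl:copy-of select="
            let $cats := tokenize(/t/*/Categories, '\|'),
                $type := /t/*/MatchType
            return /t/*/Page[
                     ($type = 'any'
                         and $cats = tokenize(@CategoryIds, '\|'))
                     or
                     ($type = 'all'
                         and (every $c in $cats
                              satisfies $c = tokenize(@CategoryIds, '\|')))
                   ]"/>
    </xsl:template>
</xsl:stylesheet>
```

The logic is unchanged from the one-line version: a general comparison between two sequences is true if any pair of items matches, which covers "any", and the every ... satisfies expression covers "all".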

Alejandro
@Alejandro: Why is this complicated effort to produce the same results as the identity transform? How did you decide what the OP really wanted? Please, explain this mystery to me. :)
Dimitre Novatchev
@Dimitre: First, because it was fun. Second, it's almost the identity transform (missing Page 3), but of course you had already seen that. With less effort I had built a transformation with a `pState` param (initially false for "any", initially true for "all") that changes with each recursion (or-logic with "any", and-logic with "all"). It was compact, but it iterated over all the tokens. It took me several minutes to come up with this optimized version. Of course, my process does nothing other than copying... But it was your answer that prompted me to respond! Ha!
Alejandro
@Alejandro: Sorry, I didn't see the small difference. Thanks to this now I finally understood what the OP wanted and produced a solution, which uses a single XPath expression after the tokenization of the categories in the match criteria.
Dimitre Novatchev
@Dimitre: Off-topic question: I was studying the XPath 2.1 WD and it made me wonder why a "sequence" argument in a function invocation is treated specially (say `$x(1 to 3)` is interpreted as `$x((1,2,3))`, one sequence argument, and not as `$x(1,2,3)`, a p/3 invocation). But maybe this is because it would need nested sequence handling.
Alejandro
@Alejandro: `$f(someExpression)` clearly has just one argument. If a function invocation has more than one argument, each argument must be delimited from its neighbors by the comma character. Therefore, in `$x(1 to 3)` there is exactly one argument -- the sequence `1 to 3`. And yes, there isn't anything like a nested sequence datatype or a tuple datatype in XPath -- yet.
Dimitre Novatchev
@Alejandro: Did you know that there is a SO chat? http://chat.meta.stackoverflow.com/
Dimitre Novatchev