How would I convert the following using XSLT?

<blogger>
  <post>
    <text>...</text>
    <categories>Engineering, Internet, Sausages</categories>
  </post>
  <post>
    <text>...</text>
    <categories>Internet, Sausages</categories>
  </post>
  <post>
    <text>...</text>
    <categories>Sausages</categories>
  </post>
</blogger>

into

   Sausages (3)
   Internet (2)
   Engineering (1)
+2  A: 

First, restructure your XML so that each category is its own element. Save it as data.xml:

<blogger>
  <post>
    <text>...</text>
    <categories>
      <category>Engineering</category>
      <category>Internet</category>
      <category>Sausages</category>
    </categories>
  </post>
  <post>
    <text>...</text>
    <categories>
      <category>Internet</category>
      <category>Sausages</category>
    </categories>
  </post>
  <post>
    <text>...</text>
    <categories>
      <category>Sausages</category>
    </categories>
  </post>
</blogger>

Then write the XSLT and save it as transform.xslt:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//category">
      <xsl:variable name="value" select="."/>
      <!-- only the first occurrence of each category value is reported -->
      <xsl:if test="count(preceding::category[.=$value]) = 0">
        <xsl:value-of select="."/>
        <xsl:text> (</xsl:text>
        <xsl:value-of select="count(//category[.=$value])"/>
        <xsl:text>)</xsl:text><br/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

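Then you can open data.xml in Internet Explorer (data.xml needs an xml-stylesheet processing instruction pointing at transform.xslt for the browser to apply it) and get the following result: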

Engineering (1)Internet (2)Sausages (3)
Makach
Unfortunately I can't change the structure of the XML as it comes from elsewhere.
Skiltz
Your existing XML structure isn't logical (to me at least). Can you ask your XML source provider to change it? Otherwise you could preprocess the XML into the shape the XSLT above needs.
Makach
That's cool. Just one last thing: how would I sort them from highest to lowest (3, 2, 1, etc.)?
Skiltz
Awww, sorting them wouldn't be that difficult either but I'll keep that as an exercise for others. :-)
Workshop Alex
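For the restructured data.xml above, the sorting left as an exercise could be sketched roughly like this. It is only one possible approach, assuming an XSLT 1.0 processor: an xsl:sort is added whose key is the number of category elements sharing the node's text, and the line break is written as &#10; instead of the br element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//category">
      <!-- sort key: how many category elements share this node's text -->
      <xsl:sort select="count(//category[. = current()])"
                data-type="number" order="descending"/>
      <xsl:variable name="value" select="."/>
      <!-- emit only the first occurrence (in document order) of each value -->
      <xsl:if test="count(preceding::category[. = $value]) = 0">
        <xsl:value-of select="."/>
        <xsl:text> (</xsl:text>
        <xsl:value-of select="count(//category[. = $value])"/>
        <xsl:text>)&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the three posts in data.xml this should list Sausages (3), Internet (2) and Engineering (1), one per line. The count is recomputed for every node, so it is only reasonable for small inputs.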
+2  A: 

What you need is this:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

  <xsl:template match="/">
    <items>
      <xsl:apply-templates select="/blogger/post/categories" />
    </items>
  </xsl:template>

  <xsl:template match="categories">
    <xsl:call-template name="split">
      <xsl:with-param name="pString" select="." />
    </xsl:call-template>
  </xsl:template>

  <!-- this splits a comma-delimited string into a series of <item>s -->
  <xsl:template name="split">
    <xsl:param name="pString" select="''" />

    <xsl:variable name="vList" select="
      concat($pString, ',')
    " />
    <xsl:variable name="vHead" select="
      normalize-space(substring-before($vList ,','))
    " />
    <xsl:variable name="vTail" select="
      normalize-space(substring-after($vList ,','))
    " />

    <xsl:if test="not($vHead = '')">
      <item>
        <xsl:value-of select="$vHead" />
      </item>
      <xsl:call-template name="split">
        <xsl:with-param name="pString" select="$vTail" />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Which produces this intermediary result:

<items>
  <item>Engineering</item>
  <item>Internet</item>
  <item>Sausages</item>
  <item>Internet</item>
  <item>Sausages</item>
  <item>Sausages</item>
</items>

And this, applied to that intermediary result:

<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

  <xsl:output method="text" />
  <xsl:key name="kItem" match="item" use="." />

  <xsl:template match="/items">
    <xsl:apply-templates select="item">
      <xsl:sort 
        select="count(key('kItem', .))" 
        data-type="number" 
        order="descending"
      />
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="item">
    <xsl:if test="
      generate-id() = generate-id(key('kItem', .)[1])
    ">
      <xsl:value-of select="
        concat(
          ., ' (', count(key('kItem', .)), ')&#10;'
        )
      " />
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Which outputs:

Sausages (3)
Internet (2)
Engineering (1)
Tomalak
+1  A: 

Actually, it can be done and isn't difficult either. This will do what you want it to do:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="fo msxsl">
  <xsl:output encoding="UTF-8" indent="yes" method="xml"/>
  <xsl:variable name="Separator">,</xsl:variable>
  <xsl:template match="/">
    <xsl:variable name="NodeList">
      <xsl:apply-templates select="//categories"/>
    </xsl:variable>
    <xsl:variable name="Nodes" select="msxsl:node-set($NodeList)"/>
    <html>
      <head>
        <title>Simple list</title>
      </head>
      <body>
        <xsl:for-each select="$Nodes/Value">
          <xsl:variable name="value" select="."/>
          <xsl:if test="count(preceding::Value[.=$value]) = 0">
            <xsl:value-of select="."/> (<xsl:value-of select="count($Nodes/Value[.=$value])"/>)<br/>
          </xsl:if>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
  <xsl:template match="categories" name="Whole">
    <xsl:call-template name="Substring">
      <xsl:with-param name="Value" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>
  <xsl:template name="Substring">
    <xsl:param name="Value"/>
    <xsl:choose>
      <xsl:when test="contains($Value, $Separator)">
        <xsl:variable name="Before" select="normalize-space(substring-before($Value, $Separator))"/>
        <xsl:variable name="After" select="normalize-space(substring-after($Value, $Separator))"/>
        <Value>
          <xsl:value-of select="$Before"/>
        </Value>
        <xsl:call-template name="Substring">
          <xsl:with-param name="Value" select="$After"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <Value>
          <xsl:value-of select="$Value"/>
        </Value>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

Actually, it's a piece of cake. :-)

Workshop Alex
I must add that this code works with MSXML. If you use another XSLT processor, then you need another way to convert a variable to a node-set. (Although some processors don't need such conversions.)
Workshop Alex
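As an illustration of what "another solution" can look like: many non-Microsoft processors (libxslt, for example) implement the EXSLT function exsl:node-set() in the http://exslt.org/common namespace. A minimal sketch, assuming your processor supports EXSLT; the hand-written Value elements are just a stand-in for the output of the split template:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- a result tree fragment, standing in for the split output -->
    <xsl:variable name="NodeList">
      <Value>Internet</Value>
      <Value>Sausages</Value>
    </xsl:variable>
    <!-- exsl:node-set() plays the role msxsl:node-set() plays in MSXML -->
    <xsl:variable name="Nodes" select="exsl:node-set($NodeList)"/>
    <!-- the fragment can now be queried with ordinary XPath -->
    <xsl:value-of select="count($Nodes/Value)"/>
  </xsl:template>
</xsl:stylesheet>
```

Only the namespace declaration and the function prefix differ from the MSXML version; the rest of the stylesheet can stay the same.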
But that's not sorting on the group count either - a two-step operation is needed to do that.
Tomalak
Oh, I just noticed: the template name ("Whole") is unnecessary.
Tomalak
Sorting on the group count isn't difficult. Just add the groups to a new node-set again. My stylesheet is already operating in a two-step way. First it splits the strings and stores the result in a node-set. Then it counts the elements in this node-set. You could add that to a second node-set and sort that one on count. Basically, you can do two steps within a single stylesheet...
Workshop Alex
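A rough sketch of that idea, assuming MSXML (msxsl:node-set) and text output. The Group element, its count attribute and the variable and template names are made up for the illustration and are not part of the stylesheet above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    exclude-result-prefixes="msxsl">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- step 1: split every categories string into Value elements -->
    <xsl:variable name="SplitRtf">
      <xsl:apply-templates select="//categories"/>
    </xsl:variable>
    <xsl:variable name="Values" select="msxsl:node-set($SplitRtf)/Value"/>

    <!-- step 2: one Group per distinct value, carrying its count -->
    <xsl:variable name="GroupRtf">
      <xsl:for-each select="$Values">
        <xsl:variable name="v" select="."/>
        <xsl:if test="not(preceding-sibling::Value[. = $v])">
          <Group count="{count($Values[. = $v])}">
            <xsl:value-of select="$v"/>
          </Group>
        </xsl:if>
      </xsl:for-each>
    </xsl:variable>

    <!-- step 3: sort the groups on their count, highest first -->
    <xsl:for-each select="msxsl:node-set($GroupRtf)/Group">
      <xsl:sort select="@count" data-type="number" order="descending"/>
      <xsl:value-of select="concat(., ' (', @count, ')&#10;')"/>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="categories">
    <xsl:call-template name="Split">
      <xsl:with-param name="Rest" select="normalize-space(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- recursive split on commas, same idea as the Substring template -->
  <xsl:template name="Split">
    <xsl:param name="Rest"/>
    <xsl:choose>
      <xsl:when test="contains($Rest, ',')">
        <Value><xsl:value-of select="normalize-space(substring-before($Rest, ','))"/></Value>
        <xsl:call-template name="Split">
          <xsl:with-param name="Rest" select="normalize-space(substring-after($Rest, ','))"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <Value><xsl:value-of select="$Rest"/></Value>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Run against the comma-separated input from the question, this should print Sausages (3), Internet (2) and Engineering (1), one per line. The duplicate check still scans preceding siblings for every value, so the scaling concern about large inputs applies here as well.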
Hm... Yeah, you could do that. Maybe I'm just too fixed on keys when it comes to grouping and doing things without extension functions in general. ;-) For small inputs "node-set()" and "count(preceding...)" is probably fast enough, but I would expect it to scale very badly. Anyway. +1 from me. ^^
Tomalak