Hi there,

I've got an interesting XSL scenario to run by you guys. So far my solutions seem to be inefficient (a noticeable increase in transformation time), so I thought I'd put it out there.

The scenario: From the following XML, we need to get the id of the latest news item for each category.

The XML: I have a list of news items, a list of news categories and a list of item-category relationships. Both the item list and the item-category list may be in random order (not sorted by date).

<news>

    <itemlist>
     <item id="1">
      <attribute name="title">Great new products</attribute>
      <attribute name="startdate">2009-06-13T00:00:00</attribute>
     </item>
     <item id="2">
      <attribute name="title">FTSE down</attribute>
      <attribute name="startdate">2009-10-01T00:00:00</attribute>
     </item>
     <item id="3">
      <attribute name="title">SAAB go under</attribute>
      <attribute name="startdate">2008-01-22T00:00:00</attribute>
     </item>
     <item id="4">
      <attribute name="title">M&amp;A on increase</attribute>
      <attribute name="startdate">2010-05-11T00:00:00</attribute>
     </item>
    </itemlist>

    <categorylist>
     <category id="1">
      <name>Finance</name>
     </category>
     <category id="2">
      <name>Environment</name>
     </category>
     <category id="3">
      <name>Health</name>
     </category>
    </categorylist>

    <itemcategorylist>
     <itemcategory itemid="1" categoryid="2" />
     <itemcategory itemid="2" categoryid="3" />
     <itemcategory itemid="3" categoryid="1" />
     <itemcategory itemid="4" categoryid="1" />
     <itemcategory itemid="4" categoryid="2" />
     <itemcategory itemid="2" categoryid="2" />
    </itemcategorylist>

</news>

What I've tried: Using an RTF (result tree fragment)

<xsl:template match="/">

     <!-- for each category -->
     <xsl:for-each select="/news/categorylist/category">

      <xsl:variable name="categoryid" select="@id"/>

      <!-- create an RTF item list containing only the items in that category, ordered by startdate -->
      <xsl:variable name="ordereditemlist">
       <xsl:for-each select="/news/itemlist/item">
        <xsl:sort select="attribute[@name='startdate']" order="descending" data-type="text"/>
        <xsl:variable name="itemid" select="@id" />
        <xsl:if test="/news/itemcategorylist/itemcategory[@categoryid = $categoryid][@itemid=$itemid]">
         <xsl:copy-of select="."/>
        </xsl:if>
       </xsl:for-each>
      </xsl:variable>

      <!-- get the id of the first item in the list -->
      <xsl:variable name="firstitemid" select="msxsl:node-set($ordereditemlist)/item[position()=1]/@id"/>

     </xsl:for-each>

    </xsl:template>

Would really appreciate any ideas you have.

Thanks, Alex

+1  A: 

You're looping through all items and sorting them by date before throwing most of them away because they are not in the correct category.

Maybe something like this might be more suitable in your case:

<xsl:variable name="ordereditemlist">
    <xsl:for-each select="/news/itemcategorylist/itemcategory[@categoryid = $categoryid]">
         <xsl:variable name="itemid" select="@itemid"/>

And continue from there to gather only the news items that you actually require, then sort and copy them.
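
One way to finish that idea, as a rough sketch rather than the exact code meant above: loop over only the itemcategory entries for the current category, sort them by the start date of the item each one points to, and keep the first. This also drops the RTF and msxsl:node-set(). The <latest> and <category> output elements and the latestitemid attribute are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/news">
    <latest>
      <xsl:for-each select="categorylist/category">
        <xsl:variable name="categoryid" select="@id"/>
        <!-- visit only the relationships that belong to this category -->
        <xsl:for-each select="/news/itemcategorylist/itemcategory[@categoryid = $categoryid]">
          <!-- sort by the start date of the referenced item; inside xsl:sort,
               current() is the itemcategory being sorted. Same-format ISO 8601
               timestamps order chronologically when compared as text. -->
          <xsl:sort select="/news/itemlist/item[@id = current()/@itemid]/attribute[@name='startdate']"
                    order="descending"/>
          <!-- keep only the newest entry (hypothetical output shape) -->
          <xsl:if test="position() = 1">
            <category id="{$categoryid}" latestitemid="{@itemid}"/>
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </latest>
  </xsl:template>

</xsl:stylesheet>
```

For the sample XML this should pick items 4, 4 and 2 for categories 1, 2 and 3. Note that each sort key still scans the item list, so on large inputs the lookups themselves remain the expensive part.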

Frank
+2  A: 

It looks like you should explore <xsl:key>. This effectively creates a hashmap and avoids looping through everything.

Update: Here is a typical tutorial:

http://www.learn-xslt-tutorial.com/Working-with-Keys.cfm
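
To make that a bit more concrete, here is a minimal sketch of what keys could look like for the XML in the question. The key names (kLinkByCategory, kItemById) and the output elements are my own invention, and I haven't measured this against the real data, so treat it as a starting point only.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index the relationships by category id, and the items by their own id -->
  <xsl:key name="kLinkByCategory" match="itemcategory" use="@categoryid"/>
  <xsl:key name="kItemById" match="item" use="@id"/>

  <xsl:template match="/news">
    <latest>
      <xsl:for-each select="categorylist/category">
        <xsl:variable name="categoryid" select="@id"/>
        <!-- key() accepts a node-set as its second argument and returns the
             union of the lookups: here, every item linked to this category -->
        <xsl:for-each select="key('kItemById', key('kLinkByCategory', @id)/@itemid)">
          <xsl:sort select="attribute[@name='startdate']" order="descending"/>
          <!-- keep only the newest item (hypothetical output shape) -->
          <xsl:if test="position() = 1">
            <category id="{$categoryid}" latestitemid="{@id}"/>
          </xsl:if>
        </xsl:for-each>
      </xsl:for-each>
    </latest>
  </xsl:template>

</xsl:stylesheet>
```

Both keys use a plain attribute as the key value, so each lookup is a direct index hit instead of a scan over the itemcategory or item list.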

peter.murray.rust
Definitely on the right track. Thanks.
Alexander Bobin
+4  A: 

Here is how I would do it:

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <xsl:output encoding="utf-8" />

  <!-- this is (literally) the key to the solution -->    
  <xsl:key name="kItemByItemCategory" match="item" use="
    /news/itemcategorylist/itemcategory[@itemid = current()/@id]/@categoryid
  " />

  <xsl:template match="/news">
    <latest>
      <xsl:apply-templates select="categorylist/category" mode="latest" />
    </latest>
  </xsl:template>

  <xsl:template match="category" mode="latest">
    <xsl:variable name="self" select="." />
    <!-- sorted loop to get the latest news item -->
    <xsl:for-each select="key('kItemByItemCategory', @id)">
      <xsl:sort select="attribute[@name='startdate']" order="descending" />
      <xsl:if test="position() = 1">
        <category name="{$self/name}">
          <xsl:apply-templates select="." />
        </category>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <xsl:template match="item">
    <!-- for the sake of the example, just copy the node -->
    <xsl:copy-of select="." />
  </xsl:template>

</xsl:stylesheet>

The <xsl:key> indexes each news item by its associated category IDs. Now you have a simple way of retrieving all the news items that belong to a certain category. The rest is straightforward.

Output for me:

<latest>
  <category name="Finance">
    <item id="4">
      <attribute name="title">M&amp;A on increase</attribute>
      <attribute name="startdate">2010-05-11T00:00:00</attribute>
    </item>
  </category>
  <category name="Environment">
    <item id="4">
      <attribute name="title">M&amp;A on increase</attribute>
      <attribute name="startdate">2010-05-11T00:00:00</attribute>
    </item>
  </category>
  <category name="Health">
    <item id="2">
      <attribute name="title">FTSE down</attribute>
      <attribute name="startdate">2009-10-01T00:00:00</attribute>
    </item>
  </category>
</latest>
Tomalak
You, sir, are a genius. The transformation went from 5:00 min to 1:47!
Alexander Bobin