Hi!

I would like to group the results of an XPath query under headings, first by element name and then by the set of attribute names. Note: the XML data may be inconsistent, so elements with the same name can have different attributes and therefore need different headings.

I can't seem to put my problem into words, so it might be best to use an example.

XML:

<pets>  
    <dog name="Frank" cute="yes" color="brown" type="Lab"/>
    <cat name="Fluffy" cute="yes" color="brown"/>
    <cat name="Lucy" cute="no" color="brown"/>
    <dog name="Spot" cute="no" color="brown"/>
    <dog name="Rover" cute="yes" color="brown"/>
    <dog name="Rupert" cute="yes" color="beige" type="Pug"/>
    <cat name="Simba" cute="yes" color="grey"/>
    <cat name="Princess" color="brown"/>

</pets>

XPath:

//*[@color='brown']

What the output should sort of look like (with the different headings for different elements):

ElementName  Color   Cute     Name     Type   
Dog          Brown   Yes      Frank    Lab


ElementName  Color   Cute     Name       
Dog          Brown   No       Spot    
Dog          Brown   Yes      Rover


ElementName  Color   Cute     Name     
Cat          Brown   Yes      Fluffy    
Cat          Brown   No       Lucy



ElementName  Color   Name     
Cat          Brown   Princess  

The XSL I currently have (simplified!):

<xsl:apply-templates select="//*[@color='brown']" mode="result">
    <xsl:sort select="name()" order="ascending"/>
</xsl:apply-templates>


<xsl:template match="@*|node()" mode="result">
    <tr>
        <th align="left">Element</th>

        <xsl:for-each select="@*">
            <xsl:sort select="name()" order="ascending"/>
            <th align="left">
                <xsl:value-of select="name()"/>
            </th>
        </xsl:for-each>
    </tr>

    <tr>
        <td align="left">
            <xsl:value-of select="name()"/>
         </td>
         <xsl:for-each select="@*">
             <xsl:sort select="name()" order="ascending"/>
             <td align="left">
                 <xsl:value-of select="."/>
             </td>
         </xsl:for-each>
     </tr>
</xsl:template>

The XSL above sorts them the way I want, but now I need a check to see which elements have the same name and, if they do, whether they also have the same attributes. With that check in place, I can put a single heading row above each set of records with matching element name and attributes.

I figured I could use xsl:choose and xsl:when and do some tests. I was thinking (after the correct ordering has been done):

If element name != previous element name
    create headings
Else if all attributes != all previous element's attributes
    create headings

I guess my biggest problem is that I don't know how to check what the previously returned element was. Can someone please tell me how to do this?

Or, if I am approaching this wrong, lead me to a better solution?

Hope that all made sense! Let me know if you need clarification!

Thanks in advance for your patience and responses! :)

A: 

This stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="ByName-AttNum" match="*/*[@color='brown']" use="concat(name(),'++',count(@*))"/>
    <xsl:template match="/">
        <html>
            <xsl:apply-templates/>
        </html>
    </xsl:template>
    <xsl:template match="*/*[generate-id(.) = generate-id(key('ByName-AttNum',concat(name(),'++',count(@*)))[1])]">
        <table>
            <tr>
                <th>ElementName</th>
                <xsl:apply-templates select="@*" mode="headers">
                    <xsl:sort select="name()"/>
                </xsl:apply-templates>
            </tr>
            <xsl:apply-templates select="key('ByName-AttNum',concat(name(),'++',count(@*)))" mode="list"/>
        </table>
    </xsl:template>
    <xsl:template match="*" mode="list">
        <tr>
            <td>
                <xsl:value-of select="name()"/>
            </td>
            <xsl:apply-templates select="@*" mode="list">
                <xsl:sort select="name()"/>
            </xsl:apply-templates>
        </tr>
    </xsl:template>
    <xsl:template match="@*" mode="headers">
        <th>
            <xsl:value-of select="name()"/>
        </th>
    </xsl:template>
    <xsl:template match="@*" mode="list">
        <td>
            <xsl:value-of select="."/>
        </td>
    </xsl:template>
</xsl:stylesheet>

Result:

<html>
    <table>
        <tr>
            <th>ElementName</th>
            <th>color</th>
            <th>cute</th>
            <th>name</th>
            <th>type</th>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>yes</td>
            <td>Frank</td>
            <td>Lab</td>
        </tr>
    </table>
    <table>
        <tr>
            <th>ElementName</th>
            <th>color</th>
            <th>cute</th>
            <th>name</th>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td>yes</td>
            <td>Fluffy</td>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td>no</td>
            <td>Lucy</td>
        </tr>
    </table>
    <table>
        <tr>
            <th>ElementName</th>
            <th>color</th>
            <th>cute</th>
            <th>name</th>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>no</td>
            <td>Spot</td>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>yes</td>
            <td>Rover</td>
        </tr>
    </table>
    <table>
        <tr>
            <th>ElementName</th>
            <th>color</th>
            <th>name</th>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td>Princess</td>
        </tr>
    </table>
</html>

Note: This assumes that all elements with the same name and the same number of attributes also have the same attribute names (as in your input sample).

EDIT: Better output markup.

EDIT 2: Another kind of solution: one header with all possible attributes (as in a CSV layout), with elements ordered by attribute count and then by name.
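If that assumption does not hold, one way to relax it while staying with a single key is to encode the presence of each attribute in the key value. The following is only a sketch, and it swaps one assumption for another: it assumes that the full list of possible attribute names (here color, cute, name and type, taken from the sample) is known in advance. The key name ByName-AttSet is made up for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Key value: element name plus one true/false flag per attribute
         that is ASSUMED to be possible (color, cute, name, type). -->
    <xsl:key name="ByName-AttSet" match="pets/*[@color='brown']"
             use="concat(name(),'|',boolean(@color),boolean(@cute),boolean(@name),boolean(@type))"/>
    <xsl:template match="/pets">
        <html>
            <!-- Muenchian grouping: keep only the first element of each group -->
            <xsl:for-each select="*[@color='brown'][generate-id() = generate-id(key('ByName-AttSet', concat(name(),'|',boolean(@color),boolean(@cute),boolean(@name),boolean(@type)))[1])]">
                <table>
                    <tr>
                        <th>ElementName</th>
                        <xsl:for-each select="@*">
                            <xsl:sort select="name()"/>
                            <th><xsl:value-of select="name()"/></th>
                        </xsl:for-each>
                    </tr>
                    <!-- One row per member of the current group -->
                    <xsl:for-each select="key('ByName-AttSet', concat(name(),'|',boolean(@color),boolean(@cute),boolean(@name),boolean(@type)))">
                        <tr>
                            <td><xsl:value-of select="name()"/></td>
                            <xsl:for-each select="@*">
                                <xsl:sort select="name()"/>
                                <td><xsl:value-of select="."/></td>
                            </xsl:for-each>
                        </tr>
                    </xsl:for-each>
                </table>
            </xsl:for-each>
        </html>
    </xsl:template>
</xsl:stylesheet>
```

An attribute that is not in the assumed list is still printed, but it does not take part in the grouping, so two elements that differ only in such an attribute would end up under the same header.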

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:key name="attrByName" match="pets/*/@*" use="name()"/>
    <xsl:variable name="attr" select="/pets/*/@*[count(.|key('attrByName',name())[1])=1]"/>
    <xsl:template match="pets">
        <html>
            <table>
                <tr>
                    <th>ElementName</th>
                    <xsl:apply-templates select="$attr" mode="headers">
                        <xsl:sort select="name()"/>
                    </xsl:apply-templates>
                </tr>
                <xsl:apply-templates select="*[@color='brown']">
                    <xsl:sort select="count(@*)" order="descending"/>
                    <xsl:sort select="name()"/>
                </xsl:apply-templates>
            </table>
        </html>
    </xsl:template>
    <xsl:template match="pets/*">
        <tr>
            <td>
                <xsl:value-of select="name()"/>
            </td>
            <xsl:apply-templates select="$attr" mode="list">
                <xsl:sort select="name()"/>
                <xsl:with-param name="node" select="."/>
            </xsl:apply-templates>
        </tr>
    </xsl:template>
    <xsl:template match="@*" mode="headers">
        <th>
            <xsl:value-of select="name()"/>
        </th>
    </xsl:template>
    <xsl:template match="@*" mode="list">
        <xsl:param name="node"/>
        <td>
            <xsl:value-of select="$node/@*[name()=name(current())]"/>
        </td>
    </xsl:template>
</xsl:stylesheet>

Result:

<html>
    <table>
        <tr>
            <th>ElementName</th>
            <th>color</th>
            <th>cute</th>
            <th>name</th>
            <th>type</th>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>yes</td>
            <td>Frank</td>
            <td>Lab</td>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td>yes</td>
            <td>Fluffy</td>
            <td></td>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td>no</td>
            <td>Lucy</td>
            <td></td>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>no</td>
            <td>Spot</td>
            <td></td>
        </tr>
        <tr>
            <td>dog</td>
            <td>brown</td>
            <td>yes</td>
            <td>Rover</td>
            <td></td>
        </tr>
        <tr>
            <td>cat</td>
            <td>brown</td>
            <td></td>
            <td>Princess</td>
            <td></td>
        </tr>
    </table>
</html>

Note: This runs through the tree twice, but without extensions. An exact match for the desired output without extensions would require mimicking the key mechanism like this: run through the tree adding each new key (the element name plus its attribute names) to a param, then for every key run through the tree again, filtering nodes by that key (a small optimization could keep a node-set of the elements not yet matched). In the worst case (every node has a distinct key) the number of node visits is N (for key building) + (N + 1) * N / 2.
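A rough sketch of what an extension-free, assumption-free version could look like in plain XSLT 1.0 follows. It is not the param-accumulating scheme described in the note, but a simpler quadratic variant of the same idea: each element gets a signature string (its name plus its sorted attribute names), and sibling signatures are compared directly. The mode name sig and the variable names are made up for this illustration.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/pets">
        <html>
            <xsl:for-each select="*[@color='brown']">
                <!-- Signature of the current element -->
                <xsl:variable name="sig">
                    <xsl:apply-templates select="." mode="sig"/>
                </xsl:variable>
                <!-- One 'x' per earlier brown sibling with the same signature -->
                <xsl:variable name="earlier">
                    <xsl:for-each select="preceding-sibling::*[@color='brown']">
                        <xsl:variable name="s">
                            <xsl:apply-templates select="." mode="sig"/>
                        </xsl:variable>
                        <xsl:if test="string($s) = string($sig)">x</xsl:if>
                    </xsl:for-each>
                </xsl:variable>
                <!-- Only the first element of each group opens a table -->
                <xsl:if test="string($earlier) = ''">
                    <table>
                        <tr>
                            <th>ElementName</th>
                            <xsl:for-each select="@*">
                                <xsl:sort select="name()"/>
                                <th><xsl:value-of select="name()"/></th>
                            </xsl:for-each>
                        </tr>
                        <!-- Rows: this element and every later sibling with the same signature -->
                        <xsl:for-each select=". | following-sibling::*[@color='brown']">
                            <xsl:variable name="s">
                                <xsl:apply-templates select="." mode="sig"/>
                            </xsl:variable>
                            <xsl:if test="string($s) = string($sig)">
                                <tr>
                                    <td><xsl:value-of select="name()"/></td>
                                    <xsl:for-each select="@*">
                                        <xsl:sort select="name()"/>
                                        <td><xsl:value-of select="."/></td>
                                    </xsl:for-each>
                                </tr>
                            </xsl:if>
                        </xsl:for-each>
                    </table>
                </xsl:if>
            </xsl:for-each>
        </html>
    </xsl:template>
    <!-- Signature: element name followed by its attribute names in sorted order -->
    <xsl:template match="*" mode="sig">
        <xsl:value-of select="name()"/>
        <xsl:for-each select="@*">
            <xsl:sort select="name()"/>
            <xsl:value-of select="concat('|', name())"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

The signatures are recomputed for every comparison, so the cost grows quadratically with the number of selected elements. That is the price of avoiding both keys and node-set().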

Alejandro
See a solution without any assumptions :)
Dimitre Novatchev
+2  A: 

This transformation doesn't make any assumptions about the sets having the same number of attributes -- no assumptions at all.

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
 exclude-result-prefixes="ext">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:key name="kAnimalByProperties" match="animal"
  use="concat(@atype, .)"/>

 <xsl:variable name="vrtfNewDoc">
  <xsl:apply-templates select="/pets/*">
    <xsl:sort select="name()"/>
  </xsl:apply-templates>
 </xsl:variable>

 <xsl:template match="pets/*">
   <animal atype="{name()}">
     <xsl:copy-of select="@*"/>
     <xsl:for-each select="@*">
       <xsl:sort select="name()"/>
         <attrib>|<xsl:value-of select="name()"/>|</attrib>
     </xsl:for-each>
   </animal>
 </xsl:template>

 <xsl:template match="/">
   <xsl:for-each select="ext:node-set($vrtfNewDoc)">
     <xsl:for-each select=
     "*[generate-id()
       =generate-id(key('kAnimalByProperties',
                        concat(@atype, .)
                        )[1]
                    )
       ]">
        <table border="1">
          <tr>
            <td>Element Name</td>
            <xsl:for-each select="*">
              <td><xsl:value-of select="translate(.,'|','')"/></td>
            </xsl:for-each>
          </tr>
          <xsl:for-each select=
          "key('kAnimalByProperties', concat(@atype, .))">
            <xsl:variable name="vcurAnimal" select="."/>
            <tr>
              <td><xsl:value-of select="@atype"/></td>
              <xsl:for-each select="*">
                <td>
                  <xsl:value-of select=
                   "$vcurAnimal/@*[name()=translate(current(),'|','')]"/>
                </td>
              </xsl:for-each>
            </tr>
          </xsl:for-each>
        </table>
        <p/>
     </xsl:for-each>
   </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>

When applied on the provided XML document:

<pets>
    <dog name="Frank" cute="yes" color="brown" type="Lab"/>
    <cat name="Fluffy" cute="yes" color="brown"/>
    <cat name="Lucy" cute="no" color="brown"/>
    <dog name="Spot" cute="no" color="brown"/>
    <dog name="Rover" cute="yes" color="brown"/>
    <dog name="Rupert" cute="yes" color="beige" type="Pug"/>
    <cat name="Simba" cute="yes" color="grey"/>
    <cat name="Princess" color="brown"/>
</pets>

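the wanted grouping is produced (note that the stylesheet as written processes every child of pets, not only the brown ones, so Simba and Rupert appear as well):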

<table border="1">
   <tr>
      <td>Element Name</td>
      <td>color</td>
      <td>cute</td>
      <td>name</td>
   </tr>
   <tr>
      <td>cat</td>
      <td>brown</td>
      <td>yes</td>
      <td>Fluffy</td>
   </tr>
   <tr>
      <td>cat</td>
      <td>brown</td>
      <td>no</td>
      <td>Lucy</td>
   </tr>
   <tr>
      <td>cat</td>
      <td>grey</td>
      <td>yes</td>
      <td>Simba</td>
   </tr>
</table>
<p/>
<table border="1">
   <tr>
      <td>Element Name</td>
      <td>color</td>
      <td>name</td>
   </tr>
   <tr>
      <td>cat</td>
      <td>brown</td>
      <td>Princess</td>
   </tr>
</table>
<p/>
<table border="1">
   <tr>
      <td>Element Name</td>
      <td>color</td>
      <td>cute</td>
      <td>name</td>
      <td>type</td>
   </tr>
   <tr>
      <td>dog</td>
      <td>brown</td>
      <td>yes</td>
      <td>Frank</td>
      <td>Lab</td>
   </tr>
   <tr>
      <td>dog</td>
      <td>beige</td>
      <td>yes</td>
      <td>Rupert</td>
      <td>Pug</td>
   </tr>
</table>
<p/>
<table border="1">
   <tr>
      <td>Element Name</td>
      <td>color</td>
      <td>cute</td>
      <td>name</td>
   </tr>
   <tr>
      <td>dog</td>
      <td>brown</td>
      <td>no</td>
      <td>Spot</td>
   </tr>
   <tr>
      <td>dog</td>
      <td>brown</td>
      <td>yes</td>
      <td>Rover</td>
   </tr>
</table>
<p/>
Dimitre Novatchev
@Dimitre: Yes. This avoids any assumptions by using a two-pass transformation with extension functions. I think the "lead me to a better solution" part can be covered by producing one header (grouping attributes by name, as in the CSV-style solution). Also, I think your output markup is better! Editing mine.
Alejandro
@Alejandro: If you use CSV, then it would be quite difficult to obtain every attribute name in order -- later when you need it.
Dimitre Novatchev
@Dimitre - I need to look into why, but I am only getting the headers and no values underneath (only the name of the element shows). I copied and pasted it as is as a new file. I won't be able to look into it for a while.. hopefully sometime today though. :S
developer
@iHeartGreek: It may be that the XSLT processor you are using does not implement EXSLT. In this case, use its own xxx:node-set() extension function. For example, for MSXML, the namespace in which this extension belongs is: "urn:schemas-microsoft-com:xslt"
Dimitre Novatchev
@Dimitre: Yes, I believe the processor does not implement EXSLT. Though, I am unsure what your above comment really means regarding what I need to do to get this to work.. :(
developer
@iHeartGreek: What XSLT processor are you using?
Dimitre Novatchev
@Dimitre: Well.. I am using Visual Studio 2010...
developer
@iHeartGreek: VS2010 uses XslCompiledTransform, and this XSLT processor works with EXSLT's node-set() extension... ??? Just to be on the safe side, replace `xmlns:ext="http://exslt.org/common"` with `xmlns:msxsl="urn:schemas-microsoft-com:xslt"` and `ext:node-set($vrtfNewDoc)` with `msxsl:node-set($vrtfNewDoc)` -- and try again.
Dimitre Novatchev
@Dimitre: I now have `<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:msxsl="urn:schemas-microsoft-com:xslt">` and I replaced the node-set function too. I still get empty <td> tags in the output. :( Not sure what is going wrong for me.
developer
@iHeartGreek: This is a *bug* in VS2010. Notice that in my code you have: `<attrib>|<xsl:value-of select="name()"/>|</attrib>`. However, when copied and pasted into VS2010 this becomes (the tags and the value on separate lines): `<attrib> |<xsl:value-of select="name()"/>| </attrib>`. Please fix this by manual editing. :(
Dimitre Novatchev
@Dimitre: Geez!!! That's all it was?? It works now! I never would have caught that. Thanks!!! :D :D (oh also.. you gave me a comment saying you gave me +1, but my score shows up as 0.. would this be because someone downvoted my question? how can I know? :S )
developer
@iHeartGreek: Glad this was finally sorted out. As for the +1, somehow I hadn't really upvoted your question -- must be a race condition between me writing the comment and me upvoting the question :). +1 just now -- this time for real :)
Dimitre Novatchev
@Dimitre: haha thanks! :)
developer
@Dimitre: Sorry to bother you again! Today I was taking your example and applying it to my actual project. I have trouble getting the element name displayed with my implementation.. should I post this as a new question or describe my issue in another comment?
developer
@iHeartGreek: That means that in your project you have something different than the problem as explained here. Please, ask a new question, providing a meaningful (but as minimal as possible) example.
Dimitre Novatchev
@Dimitre: Thanks for responding. I have posted a new question.
developer
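For anyone landing here later, the fixes discussed in the comments above can be folded into one stylesheet. This is only a sketch of a variant of the two-pass answer, and it makes three changes to it: it uses `msxsl:node-set()` (so it assumes a Microsoft processor such as XslCompiledTransform), it builds the delimited attribute name with `concat()` so that editor re-indentation cannot introduce stray whitespace, and it restricts the first pass to brown pets, as the question asked.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:msxsl="urn:schemas-microsoft-com:xslt"
 exclude-result-prefixes="msxsl">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <!-- Group key: element type plus the concatenated, sorted attribute names -->
 <xsl:key name="kAnimalByProperties" match="animal"
  use="concat(@atype, .)"/>

 <!-- Pass 1: intermediate tree, restricted to brown pets -->
 <xsl:variable name="vrtfNewDoc">
  <xsl:apply-templates select="/pets/*[@color='brown']">
    <xsl:sort select="name()"/>
  </xsl:apply-templates>
 </xsl:variable>

 <xsl:template match="pets/*">
   <animal atype="{name()}">
     <xsl:copy-of select="@*"/>
     <xsl:for-each select="@*">
       <xsl:sort select="name()"/>
       <!-- concat() keeps the text free of whitespace even if the file is re-indented -->
       <attrib><xsl:value-of select="concat('|', name(), '|')"/></attrib>
     </xsl:for-each>
   </animal>
 </xsl:template>

 <!-- Pass 2: Muenchian grouping over the intermediate tree -->
 <xsl:template match="/">
   <xsl:for-each select="msxsl:node-set($vrtfNewDoc)">
     <xsl:for-each select="*[generate-id() = generate-id(key('kAnimalByProperties', concat(@atype, .))[1])]">
       <table border="1">
         <tr>
           <td>Element Name</td>
           <xsl:for-each select="*">
             <td><xsl:value-of select="translate(., '|', '')"/></td>
           </xsl:for-each>
         </tr>
         <xsl:for-each select="key('kAnimalByProperties', concat(@atype, .))">
           <xsl:variable name="vcurAnimal" select="."/>
           <tr>
             <td><xsl:value-of select="@atype"/></td>
             <xsl:for-each select="*">
               <td>
                 <xsl:value-of select="$vcurAnimal/@*[name() = translate(current(), '|', '')]"/>
               </td>
             </xsl:for-each>
           </tr>
         </xsl:for-each>
       </table>
       <p/>
     </xsl:for-each>
   </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

On a processor that implements EXSLT instead, swapping the namespace back to `http://exslt.org/common` (and the prefix used in the `node-set()` call to match) should be the only change needed.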