I've been tasked with writing some XSLT 2.0 to translate one XML document into another. I'm relatively new to XSLT, but I have learned a lot in the days I've been doing this. During this time I have had to map simple values, e.g. 002 -> TH. This has been fine for small lists of fewer than 10 values, for which I used xsl:choose. However, I now need to map over 300 values from one list to another and vice versa. Each list item has a value and a textual description. The values in the two lists do not always map directly, so I may have to compare textual descriptions and use default values if necessary.

I have two solutions to the problem:

  1. Use xsl:choose: I think this could be slow and possibly hard to update if either of the lists changes.

  2. Have an XML document that holds the relationship between each pair of list items, and use an XPath expression to retrieve the associated value: This is my preferred solution because I believe it will be more maintainable and easier to update, although I'm not sure it is efficient.

Which solution should I use: one of my suggestions, or is there a better way to map these values?

A: 

After reading your problem's description (before even reading your solutions) I thought that I'd tackle the problem via a mapping document, just as you describe in solution 2.

Urs Reupke
+1  A: 

Here is a way to do what you intend, using an <xsl:key> and otherwise following your method two.

The sample input file (data.xml):

<?xml version="1.0" encoding="utf-8"?>
<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

The sample map file (map.xml):

<?xml version="1.0" encoding="utf-8"?>
<map default="??">
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
</map>

The sample XSL stylesheet, explanation follows:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default-value" select="$map-doc/map/@default" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <xsl:variable name="mapped-value">
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="key('map', $raw-value)" />
      </xsl:for-each>
    </xsl:variable>
    <data>
      <xsl:choose>
        <xsl:when test="$mapped-value = ''">
          <xsl:value-of select="$default-value" />
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$mapped-value" />
        </xsl:otherwise>
      </xsl:choose>
    </data>
  </xsl:template>
</xsl:stylesheet>

What this does is:

  • use document() to open map.xml, saving the resulting node-set to a variable
  • save the default value for further reference
  • prepare an <xsl:key> to work against the "map" node set
  • use <xsl:for-each> not as a loop, but as a means to switch the execution context before calling the key() function - otherwise key() would work against the "data" document and return nothing
  • find the corresponding node with the key() function, save it in a variable
  • check the variable value on output - if it is empty, use the default value
  • repeat (through <xsl:apply-templates>)

The credit for the neat <xsl:for-each> trick goes to Jeni Tennison, who described the technique on the XSL mailing list. Be sure to read the thread.

Output of running the stylesheet against data.xml:

<?xml version="1.0" encoding="utf-8"?>
<output>
  <data>RZ</data>
  <data>TH</data>
  <data>??</data>
</output>

All of this is XSLT 1.0. I'm convinced a better/more elegant version exists that makes use of the advantages XSLT 2.0 offers, but unfortunately I'm not overly familiar with XSLT 2.0. Maybe someone else posts a better solution.


EDIT

Thanks to Dimitre Novatchev's hint in the comments, I was able to create a considerably shorter and preferable stylesheet:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" 
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="string('map.xml')" />
  <xsl:variable name="map-doc" select="document($map-file)" />
  <xsl:variable name="default" select="$map-doc/map/default[1]" />
  <xsl:key name="map" match="/map/entry" use="@key" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <data>
      <xsl:for-each select="$map-doc">
        <xsl:value-of select="(key('map', $raw-value)|$default)[1]" />
      </xsl:for-each>
    </data>
  </xsl:template>
</xsl:stylesheet>

However, this one requires a slightly different map file to work in XSLT 1.0:

<?xml version="1.0" encoding="utf-8"?>
<map>
  <entry key="001">RZ</entry>
  <entry key="002">TH</entry>
  <entry key="003">SC</entry>
  <!-- default entry must be last in document -->
  <default>??</default>
</map>
Tomalak
Here is an XSLT 2.0 solution :)
Dimitre Novatchev
@Tomalak: You could eliminate most of the default value processing by: <xsl:value-of select="(key('map', $raw-value) | $default-value)[1]" /> However, for this to work (in XSLT 1) the "default" element must follow all "entry" elements in document order. I will answer this question if you ask it :)
Dimitre Novatchev
@Tomalak: Apart from this possible refactoring, your solution is a very good one!
Dimitre Novatchev
Thanks for the hint. :) Factored in - now it looks quite similar to your solution.
Tomalak
+1  A: 

Here is an XSLT 2.0 solution.

Source XML file:

<input>
  <data>001</data>
  <data>002</data>
  <data>005</data>
</input>

"Mapping" xml file:

<map>
  <default>?-?-?</default>
  <input value="001">RZ</input>
  <input value="002">TH</input>
  <input value="003">SC</input>
</map>

XSLT transformation:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:param name="pmapFile" 
       select="'C:/temp/deleteMap.xml'" />

  <xsl:variable name="vMap" 
       select="document($pmapFile)" />

  <xsl:variable name="vDefault" 
       select="$vMap/*/default/text()" />

  <xsl:key name="kInputByVal" match="input" 
   use="@value" />

  <xsl:template match="/*">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <xsl:template match="data">
    <data>
        <xsl:sequence select= 
         "(key('kInputByVal', ., $vMap)[1]/text(),
           $vDefault
           )[1]
         "/>
    </data> 
  </xsl:template>
</xsl:stylesheet>

Output:

<output>
  <data>RZ</data>
  <data>TH</data>
  <data>?-?-?</data>
</output>

Do note the following:

  1. The use of the document() function to access the "mapping" xml document, which is stored in a separate XML file.

  2. The use of <xsl:key/> and the XSLT 2.0 key() function to determine and access each corresponding output value. The third argument specifies the xml document that must be accessed and indexed.
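
The question also asks for mapping "vice versa". As a minimal sketch of how the same technique might cover the reverse direction, a second key can index the same entries by their text content instead of by @value. The file name map.xml, the key name kInputByText and the reverse default '000' are assumptions made for illustration only; the mapping file is assumed to have the same shape as the one above.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <!-- Assumption: map.xml sits next to this stylesheet -->
  <xsl:param name="pmapFile" select="'map.xml'"/>
  <xsl:variable name="vMap" select="document($pmapFile)"/>

  <!-- Hypothetical default for codes that have no reverse mapping -->
  <xsl:variable name="vReverseDefault" select="'000'"/>

  <!-- Index the <input> elements by their text (RZ, TH, ...) -->
  <xsl:key name="kInputByText" match="input" use="string(.)"/>

  <xsl:template match="/*">
    <output>
      <xsl:apply-templates/>
    </output>
  </xsl:template>

  <xsl:template match="data">
    <data>
      <!-- First hit in the map document wins; otherwise the default -->
      <xsl:sequence select=
       "(key('kInputByText', string(.), $vMap)[1]/@value/string(),
         $vReverseDefault
         )[1]"/>
    </data>
  </xsl:template>
</xsl:stylesheet>
```

With this sketch, a source element <data>TH</data> would be looked up by its text and replaced with the matching @value (002 in the mapping file above). Both directions can share one mapping file, so only one document has to be maintained.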

Dimitre Novatchev
Neat. :-) Is there a fundamental flaw in my approach or is all of it necessary to make it work in XSLT 1.0? My transformation seems so long in comparison.
Tomalak
@Tomalak: You could eliminate most of the default value processing by: <xsl:value-of select="(key('map', $raw-value) | $default-value)[1]" /> However, for this to work (in XSLT 1) the "default" element must follow all "entry" elements in document order. I will answer this question if you ask it :)
Dimitre Novatchev
Re: "default" element must be last: I would suspect that in XSLT 1.0, document order takes precedence over "node set concatenation" order, so that "(node[1]|node[2])" and "(node[2]|node[1])" yield an identical node set.
Tomalak
For some reason, there seems to be a nasty tendency here on SO that non-trivial XSLT questions go largely unnoticed while the OP loses interest in his own question. In terms of spent time vs. gainable reputation, answering them is not really worthwhile...
Tomalak
@Tomalak: The rule that a node-set must be in document order is the same in XSLT 2, too. However, XPath 2 has a new datatype: the sequence, which by definition holds items in any specified order. So, if you compose a sequence by using the "," or "to" operator, you get your desired order.
Dimitre Novatchev
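The difference between the union operator and the comma operator described in the comment above can be sketched with a small XSLT 2.0 stylesheet. This is an illustration under assumptions, not part of either answer: the variable names vDoc, vEntry and vDefault are made up, and the map is held inline with the default element deliberately placed first.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Inline temporary tree; <default> precedes <entry> in document order -->
  <xsl:variable name="vDoc">
    <map>
      <default>??</default>
      <entry key="001">RZ</entry>
    </map>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:variable name="vEntry" select="$vDoc/map/entry[@key = '001']"/>
    <xsl:variable name="vDefault" select="$vDoc/map/default"/>

    <!-- Union: the result is in document order, so <default> comes first -->
    <xsl:value-of select="($vEntry | $vDefault)[1]"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Comma: the sequence keeps the order as written, so <entry> comes first -->
    <xsl:value-of select="($vEntry, $vDefault)[1]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against any source document, this should print ?? on the first line and RZ on the second, which is why the union-based XSLT 1.0 version needs the default element to come last in the map file.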
Re: Non-trivial XSLT. Yes, and if you post an XSLT answer to what they think is a C# only question, they even downvote you. Regardless that your answer solves the problem in a completely superior way to all C# solutions proposed. I am enjoying solving Project Euler now -- in XSLT of course.
Dimitre Novatchev
Have fun. :-) I have read your blog post on the C# finger tree this morning. Enjoyable read, though way over my head I'm afraid. Too bad only spam bots left comments so far. Where is the answer you got down-voted for?
Tomalak
OBTW, incorporating your suggestion into the XSLT 1.0 code worked out nicely, see below. What made me think: I used to believe that <xsl:key> would create an index for later use. Now that I've learned that evaluation does not happen before the call to key(), where does the speed benefit come from?
Tomalak
@Tomalak: An index is built the 1st time a key() function is evaluated for a key for a document. The speed gain comes if the key() function is used more than once. Certain XSLT processors may combine the initial parsing of a document with building all indexes. This speeds up even the 1st use of key().
Dimitre Novatchev
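For comparison, here is a sketch of the same XSLT 1.0 lookup written without <xsl:key>, using a plain predicate. It reuses the names from Tomalak's stylesheet and assumes the second map file (with the default element last). A processor that does not optimize such predicates may scan all entry elements for every data element, whereas key() can reuse the index once it has been built.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>

  <xsl:param name="map-file" select="'map.xml'" />
  <xsl:variable name="map-doc" select="document($map-file)" />

  <xsl:template match="/input">
    <output>
      <xsl:apply-templates select="data" />
    </output>
  </xsl:template>

  <xsl:template match="data">
    <xsl:variable name="raw-value" select="." />
    <data>
      <!-- Unindexed lookup: the predicate is evaluated against the map
           entries on each call; no context switch is needed because the
           path starts at $map-doc -->
      <xsl:value-of
        select="($map-doc/map/entry[@key = $raw-value] | $map-doc/map/default)[1]" />
    </data>
  </xsl:template>
</xsl:stylesheet>
```

For a map of a few hundred entries the difference may be negligible; the key-based version is expected to pay off mainly when the source document contains many values to translate.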
@Tomalak: Just try to implement a finger tree from Hinze/Patterson's article and the understanding of it will come gradually. It's fascinating, I agree. Just tell some people that solving certain problems feels better than sex and look at their concerned faces... :)
Dimitre Novatchev
@Tomalak: It was my answer to this question: http://stackoverflow.com/questions/451950/get-the-xpath-to-an-xelement The answer is currently deleted, but you have over 10K points and (I heard) can see deleted answers.
Dimitre Novatchev
I can. I can even undelete them, though I don't think that this is a particularly useful capability. Regarding the finger tree: You would be surprised how far I am from being able to write my own implementation. Even I am surprised at times. ;-)
Tomalak