I need to transform a lot of XML files (a Fedora export) into a different kind of XML. I am trying to do it with XSL stylesheets, testing them with the msxsl transformer.

Suppose I have an XML file like this (in reality there are other nodes inside AAA, OBJ, and all the other nodes), Source.XML:

<DOC>
<AAA>
    <STUFF>example</STUFF>
    <OBJ>
        <OBJVERS id="A1" CREATED="2008-02-18T13:28:08.245Z"/>
        <OBJVERS id="A2" CREATED="2008-02-19T10:42:41.965Z"/>
        <OBJVERS id="A13" CREATED="2009-03-16T12:43:11.703Z"/>
    </OBJ>
</AAA>
<FFF/>
<GGG/>
<DDD>
    <FILE />
</DDD>
</DOC>

Which I need to look something like this (Target.XML):

    <MYOBJ>
      <ELEM>contents of OBJVERS with the biggest id OR creation date (whichever is easier to do) go here</ELEM>
      <IMAGE> contents of <FILE> node go here</IMAGE>
    </MYOBJ>

The main problem is that I am new to XSL (and for this particular task do not have enough time to learn it properly), so I can't understand how to tell the XSL processor not to process anything else. I keep getting output from <STUFF>, for example.

Update: basically, I solved this problem meanwhile. I will post my own answer and close the question.

Update2: OK, Andrew's answer works, too, so I am just accepting it. :)

+1  A: 

This isn't the complete solution, in that it doesn't sort the OBJVERS before selecting the first one. But if you can solve the problem of selecting the right OBJVERS, then I think this will do the rest.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml"/>
<xsl:template match="/">
    <MYOBJ>
        <xsl:for-each select="/DOC/AAA/OBJ/OBJVERS[position()==1]">
            <ELEM><xsl:copy-of select="*"/></ELEM>
        </xsl:for-each>
        <IMAGE><xsl:copy-of select="/DOC/DDD/FILE/*" /></IMAGE>
    </MYOBJ>
</xsl:template>
</xsl:stylesheet>
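
XPath 1.0 has no `==` operator. Equality is written with a single `=`, and a bare number in a predicate, such as `[1]`, is shorthand for `[position() = 1]`. A minimal sketch of the same stylesheet with that predicate, still assuming that the first OBJVERS in document order is the one wanted (no sorting is done here):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml"/>
<xsl:template match="/">
    <MYOBJ>
        <!-- [1] is shorthand for [position() = 1]: first OBJVERS in document order -->
        <xsl:for-each select="/DOC/AAA/OBJ/OBJVERS[1]">
            <!-- select="*" copies child elements only; select="." would copy OBJVERS itself -->
            <ELEM><xsl:copy-of select="*"/></ELEM>
        </xsl:for-each>
        <IMAGE><xsl:copy-of select="/DOC/DDD/FILE/*" /></IMAGE>
    </MYOBJ>
</xsl:template>
</xsl:stylesheet>
```
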
Andrew Arnott
There is no "==" operator in XPath. Any compliant XSLT processor will raise an error even in the compilation stage. This answer doesn't solve the problem at all.
Dimitre Novatchev
Too bad, Dimitre. I'm sorry that you'd give up at the first compilation error if you were asking this question.
Andrew Arnott
The next step, if this incorrect answer is not corrected or deleted will be to flag and report it. You have limited time to act.
Dimitre Novatchev
You've got serious issues, Dimitre. I never claimed my answer was complete. In fact I identified one of the problems myself. Flag and report it? Please. Report it as what? Spam? Just -1 it if you don't like it.
Andrew Arnott
+1  A: 

This question has been formulated very loosely and this is not helpful for providing a more meaningful solution.

This said, the below transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/">
      <MYOBJ>
        <ELEM>
           <xsl:for-each select="/*/AAA/OBJ/OBJVERS">
             <xsl:sort select="@CREATED" order="descending"/>

             <xsl:if test="position() = 1">
                <xsl:copy-of select="."/>
             </xsl:if>
           </xsl:for-each>
        </ELEM>

        <IMAGE>
          <xsl:copy-of select="/*/DDD/FILE"/>
        </IMAGE>
      </MYOBJ>
    </xsl:template>
</xsl:stylesheet>

when applied to the artificial and contrived provided XML document (that in fact has bad structuring and naming and goes against many principles of designing XML documents):

<DOC>
    <AAA>
     <STUFF>example</STUFF>
     <OBJ>
      <OBJVERS id="A1" CREATED="2008-02-18T13:28:08.245Z"/>
      <OBJVERS id="A2" CREATED="2008-02-19T10:42:41.965Z"/>
      <OBJVERS id="A13" CREATED="2009-03-16T12:43:11.703Z"/>
     </OBJ>
    </AAA>
    <FFF/>
    <GGG/>
    <DDD>
     <FILE />
    </DDD>
</DOC>

produces what one could guess is the wanted result:

<MYOBJ>
   <ELEM>
      <OBJVERS id="A13" CREATED="2009-03-16T12:43:11.703Z"/>
   </ELEM>
   <IMAGE>
      <FILE/>
   </IMAGE>
</MYOBJ>
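
On the "how do I stop everything else from being processed" part: the stray output usually comes from XSLT's built-in template rules, which walk the whole tree and copy every text node (such as the content of STUFF) to the output whenever `xsl:apply-templates` reaches nodes that have no explicit template. The stylesheet above avoids this by matching `/` and never calling `xsl:apply-templates`. If a template-per-element style is preferred, a rough sketch, assuming the same sample layout, is to override the built-in text rule with an empty template:

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <!-- Wrap everything produced for the document element in MYOBJ -->
    <xsl:template match="/*">
      <MYOBJ>
        <xsl:apply-templates/>
      </MYOBJ>
    </xsl:template>

    <!-- Keep only the most recently created OBJVERS -->
    <xsl:template match="OBJ">
      <ELEM>
        <xsl:for-each select="OBJVERS">
          <xsl:sort select="@CREATED" order="descending"/>
          <xsl:if test="position() = 1">
            <xsl:copy-of select="."/>
          </xsl:if>
        </xsl:for-each>
      </ELEM>
    </xsl:template>

    <xsl:template match="FILE">
      <IMAGE>
        <xsl:copy-of select="."/>
      </IMAGE>
    </xsl:template>

    <!-- Empty template: overrides the built-in rule that copies text nodes -->
    <xsl:template match="text()"/>
</xsl:stylesheet>
```

Elements without a template of their own (AAA, STUFF, FFF, GGG, DDD here) still fall through to the built-in element rule, which only recurses into their children, so they contribute nothing once text nodes are suppressed.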
Dimitre Novatchev
Thanks a lot for your time and answer. The contrived XML, by the way, is a (quite simplified, but structurally correct) version of what Fedora Commons software produces when you try to export the digital objects contained in it.
Gnudiff
@Gnudiff Glad the answer was helpful. I cannot believe that any useful XML vocabulary will have elements named "AAA", "DDD", "FFF" and "GGG".
Dimitre Novatchev
@Dimitre: No, the naming of the AAA/BBB tags is mine, that's true. I wanted to draw attention only to the part of the document that was relevant, and I had recently read zvon.org, which uses tags like that in its examples, so I sort of drifted to that. It is true, however, that the version ids of OBJVERS are not simple numbers, but have alphanumeric prefixes for some weird reason.
Gnudiff
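
Because the ids carry a prefix, picking the biggest id (rather than the latest creation date) needs the numeric part extracted before sorting; a plain text sort would likely rank A2 ahead of A13. A minimal sketch, assuming every id is exactly one letter followed by digits as in the sample (the real Fedora ids may well differ, in which case the `substring` call would need adjusting):

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/">
      <MYOBJ>
        <ELEM>
           <xsl:for-each select="/*/AAA/OBJ/OBJVERS">
             <!-- Assumption: one-letter prefix, so the number starts at character 2 -->
             <xsl:sort select="substring(@id, 2)" data-type="number" order="descending"/>

             <xsl:if test="position() = 1">
                <xsl:copy-of select="."/>
             </xsl:if>
           </xsl:for-each>
        </ELEM>

        <IMAGE>
          <xsl:copy-of select="/*/DDD/FILE"/>
        </IMAGE>
      </MYOBJ>
    </xsl:template>
</xsl:stylesheet>
```
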
@Gnudiff I don't think zvon.org are an authority in the XML/XSLT world. Anyone who respects this discipline needs a good book by the classic authors -- Michael Kay and/or Jeni Tennison.
Dimitre Novatchev