Hello,

I have a question about the following source XML file:

Source xml:

<Container>
  <DataHeader>
    <c id="b" value="TAG" />
    <c id="g" value="Info" /> 
  </DataHeader>
  <Data>
    <Rows>
      <r no="1">
        <c id="b" value="uid1" uid="T.A.uid1" />
        <c id="g" value="uid1|tag1|attr1|somevalue1" />
      </r>
      <r no="1">
        <c id="b" value="uid1" uid="T.A.uid1" />
        <c id="g" value="uid1|tag1|attr2|somevalue2" />
      </r>
      <r no="2">
        <c id="b" value="uid1" uid="T.A.uid1" />
        <c id="g" value="uid1|tag2|attr3|somevalue3" />
      </r>
      <r no="10">
        <c id="b" value="uid2" uid="T.A.uid2" />
        <c id="g" value="uid2|tag1|attr1|somevalue4" />
      </r>
      <r no="11">
        <c id="b" value="uid2" uid="T.A.uid2" />
        <c id="g" value="uid2|tag2|attr3|somevalue5" />
      </r>
    </Rows>
  </Data>
</Container>

The element 'c' with id 'g' is the important one in the source xml. Its value is a concatenated string whose parts are separated by a '|'. We need these parts to build the target xml. The element 'c' with id 'b' can be used to separate the 'uid'.

Example and explanation of the values:

 <c id="g" value="uid1|tag1|attr1|somevalue1" />
 **uid value** | element node | **attribute** | attribute value
 **uid1** | tag1 | **attr1** | somevalue1

All elements with the same 'uid' have to be aggregated into a single "TestTag" element (see target xml). All attributes (attr1, attr2) with the same parent element (for example 'tag1') need to be added to one element. I can only use XSLT (XPath) 1.0.

The target xml file should look like this after transforming.

Target xml after transformation by xsl:

<Container>
 <TestTag>
    <object UID="T.A.uid1" Name="uid1"/>
    <tag1 attr1="somevalue1" attr2="somevalue2"/>
    <tag2 attr3="somevalue3"/>
 </TestTag>
 <TestTag>
    <object UID="T.A.uid2" Name="uid2"/>
    <tag1 attr1="somevalue4" />
    <tag2 attr3="somevalue5"/>
 </TestTag>
</Container>

What are possible solutions for transforming source xml to target xml? I tried several things but I'm stuck right now.

+2  A: 

This is not exactly difficult, but is mind-boggling due to extensive (yet necessary) nested use of substring-before() and substring-after().

<xsl:stylesheet 
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- index <c> nodes by their @id + full @value -->
  <xsl:key name="kObject"    match="r/c" use="
    concat(@id, '|', @value)
  " />
  <!-- index <c> nodes by their @id + "uid value" -->
  <xsl:key name="kTagByUid"  match="r/c" use="
    concat(@id, '|', substring-before(@value, '|'))
  " />
  <!-- index <c> nodes by their @id + "uid value" + "tag name" -->
  <xsl:key name="kTagByName" match="r/c" use="
    concat(@id, '|',
      substring-before(
        @value, 
        substring-after(substring-after(@value, '|'), '|')
      )
    )
  " />

  <xsl:variable name="vTagId"  select="/Container/DataHeader/c[@value='TAG'][1]/@id" />
  <xsl:variable name="vInfoId" select="/Container/DataHeader/c[@value='Info'][1]/@id" />

  <!-- processing starts here -->
  <xsl:template match="Container">
    <xsl:copy>
      <!-- apply templates to unique <c @id=$vTagId> tags -->
      <xsl:apply-templates mode="tag" select="
        Data/Rows/r/c[@id=$vTagId][
          generate-id()
          =
          generate-id(key('kObject', concat(@id, '|', @value))[1])
        ]
      " />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="c" mode="tag">
    <TestTag>
      <object UID="{@uid}" name="{@value}" />
      <!-- apply templates to unique <c @id=$vInfoId> tags (one per "uid value" + "tag name") -->
      <xsl:apply-templates mode="info" select="
        key('kTagByUid', concat($vInfoId, '|', @value))[
          generate-id()
          =
          generate-id(
            key(
              'kTagByName', 
              concat(@id, '|', 
                substring-before(
                  @value, 
                  substring-after(substring-after(@value, '|'), '|')
                )
              )
            )[1]
          )
        ]
      " />
    </TestTag>
  </xsl:template>

  <xsl:template match="c" mode="info">
    <!-- select 'uid1|tag1|' - it's the key to kTagByName -->
    <xsl:variable name="key"  select="substring-before(@value, substring-after(substring-after(@value, '|'), '|'))" />
    <!-- select 'tag1' - it's the element name -->
    <xsl:variable name="name" select="substring-before(substring-after($key, '|'), '|')" /> 

    <xsl:element name="{$name}">
      <xsl:for-each select="key('kTagByName', concat(@id, '|', $key))">
        <!-- select 'attr1|somevalue1' - it's the attribute definition -->
        <xsl:variable name="attrDef" select="substring-after(@value, $key)" />
        <!-- create an attribute -->
        <xsl:attribute name="{substring-before($attrDef, '|')}">
          <xsl:value-of select="substring-after($attrDef, '|')" />
        </xsl:attribute>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>

generates:

<Container>
  <TestTag>
    <object UID="T.A.uid1" name="uid1" />
    <tag1 attr1="somevalue1" attr2="somevalue2"></tag1>
    <tag2 attr3="somevalue3"></tag2>
  </TestTag>
  <TestTag>
    <object UID="T.A.uid2" name="uid2" />
    <tag1 attr1="somevalue4"></tag1>
    <tag2 attr3="somevalue5"></tag2>
  </TestTag>
</Container>

Note that this does not pay attention to duplicate attribute definitions. If you happen to have uid1|tag1|attr1|somevalue1 and later uid1|tag1|attr1|othervalue1, then you will end up with one attribute: attr1="othervalue1" because in the <xsl:for-each> both get their turn, and the latter one wins (i.e. ends up in the output).

It is possible to cater for that as well; it would require one more key and one more Muenchian grouping. I'm going to leave that as an exercise for the reader. Heh. ;)
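For anyone who wants a starting point for that exercise, here is a minimal, untested sketch. It assumes the stylesheet above has been saved as grouping.xsl (a hypothetical file name) and imports it, so only the extra key and an overriding mode="info" template are needed. The key name kAttrByName is made up for illustration. Unlike the original, this variant keeps the first definition of a duplicated attribute rather than the last one.

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- assumption: the stylesheet from this answer is saved as grouping.xsl -->
  <xsl:import href="grouping.xsl" />

  <!-- index <c> nodes by @id + "uid value" + "tag name" + "attribute name" -->
  <xsl:key name="kAttrByName" match="r/c" use="
    concat(@id, '|',
      substring-before(@value, '|'), '|',
      substring-before(substring-after(@value, '|'), '|'), '|',
      substring-before(substring-after(substring-after(@value, '|'), '|'), '|')
    )
  " />

  <!-- overrides the imported mode="info" template (higher import precedence) -->
  <xsl:template match="c" mode="info">
    <!-- same helper variables as in the original template -->
    <xsl:variable name="key"  select="substring-before(@value, substring-after(substring-after(@value, '|'), '|'))" />
    <xsl:variable name="name" select="substring-before(substring-after($key, '|'), '|')" />

    <xsl:element name="{$name}">
      <!-- second Muenchian step: keep only the first <c> per attribute name -->
      <xsl:for-each select="
        key('kTagByName', concat(@id, '|', $key))[
          generate-id()
          =
          generate-id(
            key('kAttrByName',
              concat(@id, '|', $key, substring-before(substring-after(@value, $key), '|'))
            )[1]
          )
        ]
      ">
        <xsl:variable name="attrDef" select="substring-after(@value, $key)" />
        <xsl:attribute name="{substring-before($attrDef, '|')}">
          <xsl:value-of select="substring-after($attrDef, '|')" />
        </xsl:attribute>
      </xsl:for-each>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Since $key already ends in '|', the lookup string has the same shape as the key value, for example 'g|uid1|tag1|attr1'. If the last definition should win instead, replacing [1] with [last()] in the predicate is one way to do it.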

Tomalak
+1, nice answer [15 chars]
infant programmer
Hello Tomalak, thanks for your help. Your solution works very well, but now I have another question. The source xml also contains a DataHeader element. We don't want to match on the value of id because it is generated, and the combination of id and value can be different every time. Our solution was to use an xsl:key, <xsl:key name="getHeaderValue" match="Container/DataHeader/c" use="@value"/>, to get the id by value. I made two variables to get the ids ('b' and 'g') by value. An example: <xsl:variable name="tagNameId" select="key('getHeaderValue', 'TAG')/@id"/>
TripleJ
The problem is that your example matches on id in the xsl:key and in the template match, and I can't use a variable in xsl:key or in the template match: <xsl:template match="c[@id=$tagNameId]"> is not possible. How can we get the same result with the solution above, or do we have to rewrite the whole xslt to make it work?
TripleJ
@TripleJ: This requires a small change to the stylesheet. Nothing dramatic, especially since the code has been "messy" anyway. I've changed from fixed match expressions to template modes, and augmented the keys. See the diff of the answer: http://stackoverflow.com/posts/2282801/revisions
Tomalak
@Tomalak: Thanks for your quick response! It works. Is this code efficient performance-wise, and what exactly do you mean by "messy"?
TripleJ
@TripleJ: Performance should be pretty okay, at least I don't know how this could be made much faster. By "messy" I mean that you could easily get a headache by looking at it. It is hard to understand, even by XSLT standards. Most of the complexity comes from the fact that you have data that *should* be in separate attributes/elements kludged together in a delimited string. String processing is something that XSLT (1.0) is genuinely bad at, and if you have *any chance* of changing the input XML, by all means, do.
Tomalak
@Tomalak: Hi, I have a question. Can you please explain what happens in the select of this line: <xsl:apply-templates mode="tag" select="Data/Rows/r/c[@id=$vTagId][generate-id() = generate-id(key('kObject', concat(@id, '|', @value))[1]) ]"/>? I tried to figure it out but I don't get it. The same goes for the apply-templates with mode="info".
TripleJ
@TripleJ: It is a technique called "Muenchian grouping". Much has been written about it on the web. I have also written my own explanation here: http://stackoverflow.com/questions/948218/xslt-3-level-grouping-on-attributes/955527#955527 (look for the explanation in the lower part of my answer).
Tomalak
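For readers who want the Muenchian grouping pattern in isolation, here is a stripped-down sketch that groups the rows of the source XML in the question by uid and does nothing else. It is a simplification, not the answer's stylesheet: the id 'b' is hard-coded instead of being looked up in DataHeader, and the names kByUid, groups and group are made up for illustration.

```xml
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- step 1: index every "b" cell by its value (assumption: id 'b' is fixed) -->
  <xsl:key name="kByUid" match="r/c[@id='b']" use="@value" />

  <xsl:template match="/Container">
    <groups>
      <!-- step 2: keep a cell only if it is the first node
           the key returns for its own value -->
      <xsl:for-each select="
        Data/Rows/r/c[@id='b'][
          generate-id() = generate-id(key('kByUid', @value)[1])
        ]
      ">
        <!-- step 3: the same key lookup returns the whole group -->
        <group uid="{@value}" rows="{count(key('kByUid', @value))}" />
      </xsl:for-each>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

The predicate in step 2 is the part that usually causes confusion: key('kByUid', @value) returns all cells with the same value in document order, so comparing generate-id() of the current cell with that of the first cell in the set is true exactly once per distinct value. Run against the source XML in the question, this should produce one group for uid1 covering three rows and one for uid2 covering two.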