I have a rather complicated xslt sheet transforming one xml format to another using templates. However, in the resulting xml, I need to have all the empty elements excluded. How is that done?

This is what the base XSLT looks like:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:far="http://www.itella.com/fargo/fargogate/" xmlns:a="http://tempuri.org/XMLSchema.xsd" xmlns:p="http://tempuri.org/XMLSchema.xsd">
    <xsl:import href="TransportCDMtoFDM_V0.6.xsl"/>
    <xsl:import href="ConsignmentCDMtoFDM_V0.6.xsl"/>
    <xsl:template match="/">
        <InboundFargoMessage>
            <EdiSender>
                <xsl:value-of select="TransportInformationMessage/SenderId"/>
            </EdiSender>
            <EdiReceiver>
                <xsl:value-of select="TransportInformationMessage/RecipientId"/>
            </EdiReceiver>
            <EdiSource>
                <xsl:value-of select="TransportInformationMessage/Waybill/Parties/Consignor/Id"/>
            </EdiSource>
            <EdiDestination>FARGO</EdiDestination>
            <Transportations>
                <xsl:for-each select="TransportInformationMessage/TransportUnits/TransportUnit">
                    <xsl:call-template name="transport"/>
                </xsl:for-each>
                <xsl:for-each select="TransportInformationMessage/Waybill/TransportUnits/TransportUnit">
                    <xsl:call-template name="transport"/>
                </xsl:for-each>
                <xsl:for-each select="TransportInformationMessage/Waybill">
                    <EdiImportTransportationDTO>
                        <Consignments>
                            <xsl:for-each select="Shipments/Shipment">
                                <xsl:call-template name="consignment"/>
                            </xsl:for-each>
                        </Consignments>
                        <EdiTerminalDepartureTime>
                            <xsl:value-of select="DatesAndTimes/EstimatedDepartureDateTime"/>
                            <xsl:value-of select="DatesAndTimes/DepartureDateTime"/>
                        </EdiTerminalDepartureTime>
                        <EdiAgentTerminalArrivalDate>
                            <xsl:value-of select="DatesAndTimes/EstimatedArrivalDateTime"/>
                            <xsl:value-of select="DatesAndTimes/ArrivalDateTime"/>
                        </EdiAgentTerminalArrivalDate>
                        <EdiActivevehicle>
                            <xsl:value-of select="Vehicle/TransportShiftNumber"/>
                        </EdiActivevehicle>
                        <EdiConveyerZipCodeTown><xsl:text> </xsl:text></EdiConveyerZipCodeTown>
                    </EdiImportTransportationDTO>
                </xsl:for-each>
            </Transportations>
        </InboundFargoMessage>
    </xsl:template>
</xsl:stylesheet>

What needs to be added, so that empty elements are left out?

For example, a snippet from the resulting xml:

<?xml version="1.0" encoding="UTF-8"?>
<InboundFargoMessage xmlns:p="http://tempuri.org/XMLSchema.xsd"
        xmlns:far="http://www.itella.com/fargo/fargogate/"
        xmlns:a="http://tempuri.org/XMLSchema.xsd">
    <EdiSender>XXXX</EdiSender>
    <EdiReceiver>YYYY</EdiReceiver>
    <EdiSource>TR/BAL/IST</EdiSource>
    <EdiDestination>FARGO</EdiDestination>
    <Transportations>
        <EdiImportTransportationDTO>
            <Consignments>
                <EdiImportConsignmentDTO>
                    <ConsignmentLines>
                        <EdiImportConsignmentLineDTO>
                            <DangerousGoodsItems>
                                <EdiImportDangerGoodsItemDTO>
                                    <EdiKolliTypeOuter/>
                                    <EdiKolliTypeInner/>
                                    <EdiTechnicalDescription/>
                                    <EdiUNno/>
                                    <EdiClass/>
                                    <EdiDangerFactor/>
                                    <EdiEmergencyTemperature/>
                                </EdiImportDangerGoodsItemDTO>
                            </DangerousGoodsItems>
                            <BarCodes>
                                <EdiImportConsignmentLineBarcodeDTO/>
                            </BarCodes>
                            <EdiNumberOfPieces>00000002</EdiNumberOfPieces>
                            <EdiGrossWeight>0.000</EdiGrossWeight>
                            <EdiHeight/>
                            <EdiWidth/>
                            <EdiLength/>
                            <EdiGoodsDescription/>
                            <EdiMarkingAndNumber/>
                            <EdiKolliType>road</EdiKolliType>
                            <EdiCbm/>
                            <EdiLdm/>
                        </EdiImportConsignmentLineDTO>

That really needs to be:

<?xml version="1.0" encoding="UTF-8"?>
<InboundFargoMessage xmlns:p="http://tempuri.org/XMLSchema.xsd"
        xmlns:far="http://www.itella.com/fargo/fargogate/"
        xmlns:a="http://tempuri.org/XMLSchema.xsd">
    <EdiSender>XXXX</EdiSender>
    <EdiReceiver>YYYY</EdiReceiver>
    <EdiSource>TR/BAL/IST</EdiSource>
    <EdiDestination>FARGO</EdiDestination>
    <Transportations>
        <EdiImportTransportationDTO>
            <Consignments>
                <EdiImportConsignmentDTO>
                    <ConsignmentLines>
                        <EdiImportConsignmentLineDTO>
                            <DangerousGoodsItems/>
                            <BarCodes/>
                            <EdiNumberOfPieces>00000002</EdiNumberOfPieces>
                            <EdiGrossWeight>0.000</EdiGrossWeight>
                            <EdiKolliType>road</EdiKolliType>
                        </EdiImportConsignmentLineDTO>

In other words: Empty elements should be left out.

A: 

This is probably the simplest way:

<xsl:for-each select="Nodes/Node[text() != '']">

</xsl:for-each>

If you have control of the XML generation, then don't add the root node if there are no children. Whichever way you choose, XSL is quite verbose.
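
The same per-element filtering can also be written with xsl:if around a single output element. This is only a minimal sketch: it assumes a template matching Waybill (the question's stylesheet uses xsl:for-each instead) and reuses the element names from the question.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <!-- Assumed context: one Waybill element, as in the question's source format -->
 <xsl:template match="Waybill">
  <EdiImportTransportationDTO>
   <!-- Emit the element only if the source value contains non-whitespace text -->
   <xsl:if test="normalize-space(Vehicle/TransportShiftNumber)">
    <EdiActivevehicle>
     <xsl:value-of select="Vehicle/TransportShiftNumber"/>
    </EdiActivevehicle>
   </xsl:if>
  </EdiImportTransportationDTO>
 </xsl:template>
</xsl:stylesheet>
```

Every output element needs its own test this way, so it does not scale well to a few hundred elements.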

ChaosPandion
But wouldn't I have to add this to each and every node? That would not really be simple at all, because there are a few hundred nodes in total!
Fedor Steeman
@Fedor - XSL was not designed with succinctness in mind. :(
ChaosPandion
@Fedor - Looking at your update this may meet your needs.
ChaosPandion
@ChaosPandion - All right I will give it a try... :-(
Fedor Steeman
@ChaosPandion. You might want to correct that to for-each. Having said that, almost any solution that uses xsl:for-each can be better phrased using xsl:apply-templates as in Dimitre's solution.
Nic Gibson
@newt - Thanks.
ChaosPandion
This might produce unwanted results if your elements have mixed content - `<Node><b>foo</b></Node>` will get filtered out, since it has no text children.
Robert Rossney
+6  A: 

The provided (partial) XSLT code is a good illustration of an XSLT antipattern. Almost always, try to avoid using <xsl:for-each>.

Below is a sample XML document and a transformation that copies all nodes except the "empty" elements. Here "empty" means either childless, or having a single child that is a whitespace-only text node.

XML Document:

<a>
 <b>
   <c>  </c>
   <d/>
   <e>1</e>
 </b>
</a>

Transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "*[not(node())]
  |
   *[not(node()[2])
   and
     node()/self::text()
   and
     not(normalize-space())
     ]
  "/>
</xsl:stylesheet>

Result:

<a>
   <b>
      <e>1</e>
   </b>
</a>

Do note:

  1. The use of the Identity Rule.

  2. How we override the Identity Rule with a template that only matches "empty" elements. As this template does nothing (has no body at all), this doesn't copy ("deletes") the "empty" elements.

Using and overriding the Identity Rule is the most important XSLT design pattern.
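
A possible variation on the same pattern, offered as a sketch rather than a tested drop-in: the override can be widened so that it also removes elements whose descendants are all "empty". Two choices here are assumptions, not requirements stated in the question: elements that carry attributes (or contain descendants that do) are kept, and elements holding only comments or processing instructions are dropped.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copy every node and attribute unless overridden -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Override: skip any element whose string value is empty or whitespace-only
      and which has no attributes on itself or on any descendant element.
      normalize-space() with no argument uses the string value of the context
      element, that is, the concatenated text of all its descendants. -->
 <xsl:template match=
  "*[not(normalize-space())
   and
     not(descendant-or-self::*/@*)
     ]
  "/>
</xsl:stylesheet>
```

With an input such as `<a><b><c/></b><e>1</e></a>`, this removes `b` together with `c`, whereas the transformation above removes only `c` and leaves an empty `b` behind. Note that the expected output in the question keeps the empty DangerousGoodsItems and BarCodes wrappers, so which behaviour is wanted depends on the target format.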

Dimitre Novatchev
I think I'd prefer `*[not(*) and not(normalize-space())]`. Yes, elements containing only comments and/or processing instructions will get filtered out by this, but I'd guess that's probably not undesirable. That said, your comment about the original being an XSLT antipattern is spot on.
Robert Rossney
A colleague passed me a similar solution that now appears to work, so I will mark this answer as *the* solution.
Fedor Steeman
@Robert-Rossney: I'd prefer not to work too much in guess mode. This is what SO is for: users can ask new, more specific questions rather than adjusting their original question dozens of times. Thanks for your appreciation.
Dimitre Novatchev
A: 

There are some tricky cases where Dimitre's answer (which is certainly the right approach) might behave unexpectedly. For instance, if you've refactored your XSLT to use the identity pattern (which you should), and you have created a template like this:

<xsl:template match="Vehicle/TransportShiftNumber[. != '123']">
   <EdiActivevehicle>
      <xsl:value-of select="."/>
   </EdiActivevehicle> 
</xsl:template>

the transform may still create empty EdiActivevehicle elements if TransportShiftNumber is empty.

Ordinarily, if multiple templates match a node, the one that's more specific will be selected. "More specific" typically means that patterns that have a predicate will beat out patterns that don't. (The actual conflict-resolution rules are more involved; see section 5.5 of the XSLT recommendation.) In this case, both the above template and the empty-element template use predicates, and thus both have the same priority.

So the XSLT processor will do one of two things: it will report an error (that's allowed, though I've never seen an XSLT processor that unfriendly), or it will select the template that appears latest in the stylesheet.

There are two ways to fix this. Either put the empty-element-filtering template at the bottom of the stylesheet, or explicitly assign it a priority that's higher than 0.5 (which is the default value for most patterns that have predicates).
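
A sketch of the explicit-priority option, combining Dimitre's empty-element template with the example template above. The value 1 is an arbitrary choice; any number above 0.5 should do.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:strip-space elements="*"/>

 <!-- Identity rule: copy everything not matched by a more specific template -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- Pattern with a predicate: default priority 0.5 -->
 <xsl:template match="Vehicle/TransportShiftNumber[. != '123']">
  <EdiActivevehicle>
   <xsl:value-of select="."/>
  </EdiActivevehicle>
 </xsl:template>

 <!-- Explicit priority above 0.5, so this template wins for "empty" elements
      no matter where it appears in the stylesheet. The value 1 is arbitrary. -->
 <xsl:template priority="1" match=
  "*[not(node())]
  |
   *[not(node()[2])
   and
     node()/self::text()
   and
     not(normalize-space())
     ]
  "/>
</xsl:stylesheet>
```

With this in place, an empty TransportShiftNumber matches both predicate templates, but the higher priority selects the empty-element template and no EdiActivevehicle element is written.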

I'd probably do the latter, because I generally structure stylesheets with the expectation that the ordering of templates is not significant and I don't want any nasty surprises if I start moving things around. But I'd sure put a comment in there explaining myself: I've never seen anyone actually use an explicit priority on a template.

Robert Rossney