My Java module receives a huge input XML from a mainframe. Unfortunately, the mainframe is unable to skip optional elements, so I get a LOT of empty tags in my input:

So,

<SSN>111111111</SSN>
<Employment>
<Current>
<Address>
<line1/>
<line2/>
<line3/>
<city/>
<state/>
<country/>
</Address>
<Phone>
<phonenumber/>
<countryCode/>
</Phone>
</Current>
<Previous>
<Address>
<line1/>
<line2/>
<line3/>
<city/>
<state/>
<country/>    
</Address>
<Phone>
<phonenumber/>
<countryCode/>
</Phone>
</Previous>
</Employment>
<MaritalStatus>Single</MaritalStatus>

should be:

<SSN>111111111</SSN>
<MaritalStatus>Single</MaritalStatus>

I use JAXB to unmarshal the input XML string that the mainframe sends. Is there a clean, easy way to remove all the empty group tags, or do I have to do this manually in the code for each element? I have over 350 elements in my input XML, so I would love it if JAXB itself had a way of doing this automatically.

Thanks, SGB

+3  A: 

You could preprocess using XSLT. I know it's considered a bit "Disco" nowadays, but it is fast and easy to apply.

From this tek-tips discussion, you could transform with XSLT to remove empty elements.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:if test=". != '' or ./@* != ''">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
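
For what it's worth, here is a minimal sketch of how a stylesheet like this might be applied in Java before the XML reaches JAXB, using only the javax.xml.transform classes that ship with the JDK. The class name EmptyElementStripper and the method stripEmpty are made up for illustration. Note that the embedded stylesheet is a variant, not the tek-tips one: it tests normalize-space(.) so that group elements containing nothing but line breaks between empty children are dropped as well, and it keeps any element that carries an attribute somewhere inside it. Treat it as a starting point under those assumptions rather than a drop-in solution.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: names are illustrative, not part of JAXB or any library.
final class EmptyElementStripper {

    // Identity transform plus one overriding template that swallows every
    // element with no attributes anywhere inside it and no non-whitespace text.
    private static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:strip-space elements='*'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='*[not(.//@*) and not(normalize-space(.))]'/>"
      + "</xsl:stylesheet>";

    // Returns a copy of the XML document with empty elements removed.
    static String stripEmpty(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers are not forced to handle a checked exception.
            throw new IllegalArgumentException("Could not transform XML", e);
        }
    }
}
```

The cleaned string can then be handed to JAXB as usual, for example by wrapping it in a StringReader and passing that to Unmarshaller.unmarshal. The input has to be a well-formed document with a single root element, and if the root itself turns out to be completely empty the output will be empty too, so that case needs separate handling.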
blissapp
Interesting. Thanks for your suggestion. I was hoping there would be a way to make JAXB do it automagically :) Does anybody know if it is possible to achieve the same in JAXB? If not, it looks like I may have to try this. Thanks again.
SGB
+2  A: 

I think you'd have to edit your mainframe code for the best solution. When your mainframe generates the XML, you'll have to tell it not to output a tag if it's empty.

I don't think there's much you can do on the client side. If the XML that you get is filled with empty tags, then you have no choice but to parse them all. After all, how can you tell if a tag is empty without parsing it in some way?

But maybe you could do a regex string replace on the XML text before JAXB gets to it:

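String xml = ...; // get the XML
// [^<>] keeps each match inside one tag; "<.*?/>" could run from an earlier "<" up to the next "/>"
xml = xml.replaceAll("<[^<>]+/>", "");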

This will remove empty tags like "<city/>" but not "<Address></Address>".
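
To also catch the group elements, one option is to repeat the replacement until nothing changes, so that an element which becomes empty once its children are gone is removed on the next pass. The sketch below is only an illustration: the class EmptyTagRegex and the method removeEmptyElements are hypothetical names, the pattern deliberately ignores elements that have attributes, and like any regex over XML it knows nothing about comments or CDATA sections.

```java
import java.util.regex.Pattern;

// Hypothetical helper: names are illustrative only.
final class EmptyTagRegex {

    // First alternative: <name/>.
    // Second alternative: <name> followed only by whitespace and then </name>.
    // Elements with attributes match neither alternative and are left alone.
    private static final Pattern EMPTY = Pattern.compile(
        "<([A-Za-z_][\\w.:-]*)\\s*/>|<([A-Za-z_][\\w.:-]*)\\s*>\\s*</\\2\\s*>");

    // Repeats the replacement until the text stops changing, so parents that
    // become empty after their children are removed get removed as well.
    static String removeEmptyElements(String xml) {
        String previous;
        do {
            previous = xml;
            xml = EMPTY.matcher(xml).replaceAll("");
        } while (!xml.equals(previous));
        return xml;
    }
}
```

Each pass peels off one level of nesting, so the example from the question needs a handful of passes before Employment disappears. The line breaks that surrounded the removed tags stay behind in the output, which an XML parser will normally treat as ignorable whitespace.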

Michael Angstadt
The mainframe folks were supposed to send me only non-empty elements. However, their homegrown parser is having problems. They are correctly omitting leaf nodes that are empty. However, when it is a group/complex element with child nodes, they are unable to do so. Hence my attempt to fix it on my side.
SGB
You should be delighted that you convinced your COBOL programmers to write out XML in the first place! I managed to do this in about 2004 (seems years ago!) and my COBOL programming friend Eamonn (who sadly lost a small fortune in realised employee option shares when WorldCom crashed) actually said "You know what, this XML thing might just be useful!". Eamonn also developed his own homegrown parser. There was a third-party parser available, but he simply was not interested in running with someone else's code!
blissapp
+1  A: 

The only technique I'm aware of in JAXB to do this is by writing a custom XmlAdapter which collapses your empty strings to nulls.

The downside is that you'd have to add this as an annotation to every single element in your code, and if you have 350 of them, that's going to be tedious.

skaffman
Hi Skaffman, could you please point me to an example ... perhaps a link? Thanks, SGB
SGB