Is there any way to print empty elements as <tag></tag> rather than <tag /> when using org.w3c.dom? I'm modifying XML files that need to be diffed against old versions of themselves for review.

If it helps, here is the code that writes the XML to the file:

TransformerFactory t = TransformerFactory.newInstance();
Transformer transformer = t.newTransformer();

DOMSource source = new DOMSource(doc);
StringWriter xml = new StringWriter();
StreamResult result = new StreamResult(xml);
transformer.transform(source, result);

File f = new File("output.xml");
FileWriter writer = new FileWriter(f);
BufferedWriter out = new BufferedWriter(writer);
out.write(xml.toString());
out.close();

Thanks.

+1  A: 

I'm assuming the empty elements are actually ELEMENT_NODEs with no children within the document. Try adding an empty text node to them instead. That may trick the writer into believing there is a text node there, so it will write it out as if there was one. But the text node won't output anything because it is an empty string.

Calling this method with the document as both parameters should do the trick:

private static void fillEmptyElementsWithEmptyTextNodes(
    final Document doc, final Node root)
{
    final NodeList children = root.getChildNodes();
    if (root.getNodeType() == Node.ELEMENT_NODE &&
        children.getLength() == 0)
    {
        root.appendChild(doc.createTextNode(""));
    }

    // Recurse to children.
    for(int i = 0; i < children.getLength(); ++i)
    {
        final Node child = children.item(i);
        fillEmptyElementsWithEmptyTextNodes(doc, child);
    }
}
jdmichal
Nice idea, but surprisingly this doesn't work: elements with empty text nodes still get printed as `<tag />`.
Rob Lourens
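
Since the empty text node is ignored by the serializer, another workaround that is often suggested is to switch the Transformer's output method to "html", because that method writes a childless element as a start/end pair. A minimal sketch, assuming the JDK's built-in transformer behaves this way on your setup; `HtmlMethodDemo` and its `serialize` helper are made-up names for illustration only:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question's code.
class HtmlMethodDemo {

    /** Parses the XML string and re-serializes it with the "html" output method. */
    static String serialize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Assumption: the html method does not collapse empty elements.
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            // Ask for no re-indentation so the diff against old files stays small.
            transformer.setOutputProperty(OutputKeys.INDENT, "no");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize("<root><tag/></root>"));
    }
}
```

Treat this as a hack rather than a fix. The html method drops the XML declaration, and elements whose names match HTML void elements (br, img, hr and so on) are written without an end tag, so the result may no longer be well-formed XML. Check the output against your own files before relying on it.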
+2  A: 

You may want to consider converting both the old and the new XML file to Canonical XML - http://en.wikipedia.org/wiki/Canonical_XML - before comparing them with e.g. diff.

James Clark has a small Java program that does this at http://www.jclark.com/xml/

Thorbjørn Ravn Andersen
The problem is that the XML files are output in this format by a design program we use, so they're being modified by hand, by that program, and now by my script. So we'd have to run them through a Canonical XML converter every time, right? This still looks useful. I'm going to look into it more and mark you as the accepted answer. Thanks!
Rob Lourens
Yes. Copy every file, do this every time on every copy and then actually compare. An excellent candidate for scripting if I ever saw one :)
Thorbjørn Ravn Andersen
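
One property of Canonical XML that matters for this question is that empty elements are always written as a start/end tag pair. If a full canonicalizer is more than you need, the same effect can be approximated by walking the DOM and writing the tags yourself instead of going through a Transformer. The following is only a rough sketch, not a conforming Canonical XML implementation: `ExpandedXmlWriter` and its methods are hypothetical names, namespaces get no special handling, attributes are written in whatever order the DOM returns them, and doctypes and processing instructions are dropped.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library: a tiny DOM serializer that
// never collapses an empty element into <tag/>.
class ExpandedXmlWriter {

    /** Parses an XML string into a DOM document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Serializes the node and its subtree, always emitting start/end tag pairs. */
    static String serialize(Node node) {
        StringBuilder out = new StringBuilder();
        write(node, out);
        return out.toString();
    }

    private static void write(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName()).append("=\"")
                       .append(escape(attr.getNodeValue(), true)).append('"');
                }
                out.append('>');
                writeChildren(node, out);
                // The end tag is written even when there were no children.
                out.append("</").append(node.getNodeName()).append('>');
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.append(escape(node.getNodeValue(), false));
                break;
            case Node.COMMENT_NODE:
                out.append("<!--").append(node.getNodeValue()).append("-->");
                break;
            default:
                // Documents and other containers are descended into; nodes
                // without children (doctype, processing instruction) are dropped.
                writeChildren(node, out);
        }
    }

    private static void writeChildren(Node parent, StringBuilder out) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out);
        }
    }

    /** Escapes markup characters; quotes are escaped only inside attribute values. */
    private static String escape(String s, boolean inAttribute) {
        String r = s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        return inAttribute ? r.replace("\"", "&quot;") : r;
    }

    public static void main(String[] args) {
        Document doc = parse("<root><tag/><item id=\"1\">text</item></root>");
        // prints <root><tag></tag><item id="1">text</item></root>
        System.out.println(serialize(doc));
    }
}
```

Running both the old and the new copy of a file through the same writer before calling diff keeps differences in empty-element style out of the comparison, which is the scripting step described above.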