I have the following function:

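import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

private static void prettyPrint(Document doc, File destFile)
{
    TransformerFactory tfactory = TransformerFactory.newInstance();

    try
    {
        // Create the destination directory if it does not exist yet.
        File parentDir = destFile.getParentFile();
        if( parentDir != null && !parentDir.exists() )
        {
            parentDir.mkdirs();
        }

        Transformer serializer = tfactory.newTransformer();

        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");

        // try-with-resources closes the stream even if the transform fails.
        try( OutputStream out = new FileOutputStream(destFile) )
        {
            serializer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }
    catch( IOException | TransformerException e )
    {
        e.printStackTrace();
    }
}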

I use it to "pretty print" my XML. However, it prints the attributes' values with double quotes around them, as opposed to single quotes. Now, I realize that XML is agnostic concerning double vs single quotes for values, but the customer I'm providing the XML for requires single quotes.

So, that being said, does anyone know of an output property I could set to tell the transformer to print single quotes instead of double quotes?

Thanks for your help,

B.J.

+1  A: 

I do not believe this is possible with the standard serializer. Any standards-compliant XML parser should handle double quotes on input. Can you find out why the customer's XML parsing is broken, and possibly get it fixed?

On another point, you are declaring in your output keys that the document will be UTF8, but you do not seem to be providing a UTF8-encoded writer. This will work just fine on Windows, but will fail on Solaris, where the default is NOT UTF8. For maximum portability you should ensure that your output stream will actually get written using UTF8 by explicitly telling Java. Just declaring it in the XML header is not enough.

Jim Garrison
Actually the code is not providing any sort of Writer. It's providing an OutputStream, which gets bytes and not chars. So the serializer will convert the chars to bytes using UTF-8 and write them to that stream. That isn't a problem.
Paul Clapham
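A minimal sketch of the point in the comment above, assuming the JDK's built-in JAXP implementation. The class name EncodingRoundTrip and the "sample" element are made up for illustration. It hands the serializer a byte stream, lets it do the UTF-8 encoding itself, and then parses the bytes back to check that non-ASCII text survives:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the original question.
class EncodingRoundTrip {

    /** Serializes a one-element document to bytes, parses it back, and returns the text. */
    static String roundTrip(String text) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.newDocument();
            Element root = doc.createElement("sample");
            root.setTextContent(text);
            doc.appendChild(root);

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");

            // The result wraps an OutputStream (bytes), so the serializer performs
            // the char-to-byte conversion using the ENCODING property.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            serializer.transform(new DOMSource(doc), new StreamResult(bytes));

            // The parser reads the encoding from the XML declaration in the bytes.
            Document parsed =
                builder.parse(new ByteArrayInputStream(bytes.toByteArray()));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```

The platform default charset only comes into play if you wrap the stream in a Writer yourself without naming a charset; in that case you would want something like new OutputStreamWriter(out, StandardCharsets.UTF_8).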
It's not that the customer's XML parsing is broken. This XML is a kind of sample XML that we create by hand each release, whereas the "real" XML is created as part of a separate batch process. The batch process puts single quotes around attribute values, and the customer wants the sample XML to match the "real" XML as closely as possible. If there's no way to do it with the serializer, I suppose each release I will have to post-process the XML and replace all double quotes with single quotes, as Paul suggested.
Benny
+1  A: 

You could possibly post-process the XML to replace double quotes with single quotes, although a blind search-and-replace would be risky: text nodes can also contain double quotes, which you wouldn't want to change. And attribute values can contain single quotes, which work fine when the value is surrounded by double quotes but have to be escaped (as &apos;) when it is surrounded by single quotes. I think I'm talking myself out of this idea, but it might be made workable, I guess.

Paul Clapham
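For what it's worth, here is a rough sketch of how that post-processing might be made workable. It is not a serializer option, just a small hand-written scanner, and the class and method names are invented for illustration. It only touches quotes that appear inside tags, copies comments and CDATA sections through unchanged, and escapes any apostrophe inside a rewritten value as &apos;. It assumes well-formed input with no DOCTYPE internal subset, and it treats the XML declaration and other processing instructions like ordinary tags, so their quotes are rewritten too.

```java
// Hypothetical post-processor; assumes well-formed XML without an internal DTD subset.
class SingleQuoteAttributes {

    /** Rewrites "..." attribute delimiters as '...' inside tags only. */
    static String toSingleQuotedAttributes(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        int i = 0;
        while (i < xml.length()) {
            if (xml.startsWith("<!--", i)) {
                i = copyThrough(xml, i, "-->", out);      // comment: leave as-is
            } else if (xml.startsWith("<![CDATA[", i)) {
                i = copyThrough(xml, i, "]]>", out);      // CDATA: leave as-is
            } else if (xml.charAt(i) == '<') {
                i = copyTag(xml, i, out);                 // tag: rewrite quotes
            } else {
                out.append(xml.charAt(i++));              // text node: leave as-is
            }
        }
        return out.toString();
    }

    /** Copies from start up to and including the terminator; returns the next index. */
    private static int copyThrough(String xml, int start, String terminator,
                                   StringBuilder out) {
        int end = xml.indexOf(terminator, start);
        end = (end < 0) ? xml.length() : end + terminator.length();
        out.append(xml, start, end);
        return end;
    }

    /** Copies one tag, switching double-quoted values to single-quoted ones. */
    private static int copyTag(String xml, int start, StringBuilder out) {
        int i = start;
        while (i < xml.length()) {
            char c = xml.charAt(i);
            if (c == '"' || c == '\'') {
                int close = xml.indexOf(c, i + 1);
                if (close < 0) {
                    close = xml.length();
                }
                String value = xml.substring(i + 1, close);
                if (c == '"') {
                    // An apostrophe was legal inside "..." but must be escaped inside '...'.
                    value = value.replace("'", "&apos;");
                }
                out.append('\'').append(value).append('\'');
                i = Math.min(close + 1, xml.length());
            } else {
                out.append(c);
                i++;
                if (c == '>') {
                    break;                                // end of this tag
                }
            }
        }
        return i;
    }
}
```

To use it with the function from the question, you could serialize into a StringWriter or ByteArrayOutputStream first, run the result through toSingleQuotedAttributes, and then write the string to the file as UTF-8. Re-parsing the output and comparing it with the original document would be a sensible sanity check before shipping it.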