views:

915

answers:

3

I have a very large XML file which I need to transform into another XML file, and I would like to do this with XSLT. I am more interested in optimisation for memory than optimisation for speed (though speed would be good too!).

Which Java-based XSLT processor would you recommend for this task?

Would you recommend any other way of doing it (non-XSLT?, non-Java?), and if so, why?

The XML files in question are very large, but not very deep - millions of rows (elements), but only about 3 levels deep.

+1  A: 

See Saxon support for streaming mode. http://www.saxonica.com/documentation/sourcedocs/serial.html

If streaming mode isn't for you, you can try Saxon's tiny tree mode, which is optimised for lower memory usage (it is the default anyway).

Peter Štibraný
+2  A: 

You could consider STX, whose Java implementation is Joost. It is similar to XSLT, but being a stream processor it can transform enormous files using very little RAM.

Joost can be used through the standard javax.xml.transform.TransformerFactory API.
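To illustrate, here is a minimal JAXP transform sketch. It runs as-is with the JDK's built-in processor; the commented-out system property shows where you would plug in a different implementation such as Joost (the `net.sf.joost.trax.TransformerFactoryImpl` class name is an assumption based on Joost's documentation - check your version):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformDemo {
    public static void main(String[] args) throws Exception {
        // To use Joost (or Saxon) instead of the JDK's default processor,
        // set this property before creating the factory. The class name
        // below is an assumption; verify it against your Joost release.
        // System.setProperty("javax.xml.transform.TransformerFactory",
        //         "net.sf.joost.trax.TransformerFactoryImpl");

        // A toy stylesheet: count the <row> children of <rows>.
        String xslt =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "  <xsl:output omit-xml-declaration='yes'/>"
          + "  <xsl:template match='/rows'>"
          + "    <count><xsl:value-of select='count(row)'/></count>"
          + "  </xsl:template>"
          + "</xsl:stylesheet>";

        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer t = factory.newTransformer(
                new StreamSource(new StringReader(xslt)));

        StringWriter out = new StringWriter();
        t.transform(
                new StreamSource(new StringReader(
                        "<rows><row/><row/><row/></rows>")),
                new StreamResult(out));
        System.out.println(out);
    }
}
```

Because only the factory is swapped, the surrounding code stays the same whichever processor you choose - though note that Joost compiles STX stylesheets, not XSLT, so the stylesheet itself would need to be rewritten in STX.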

Stephen Denne
+1  A: 

At present only three XSLT 2.0 processors are known, and of them Saxon 9.x is probably the most efficient (at least in my experience), both in speed and in memory utilisation. Saxon-SA (the schema-aware version of Saxon, which, unlike the B (basic) version, is not free) has special extensions for streamed processing.

Among the various existing XSLT 1.0 processors, .NET's XslCompiledTransform (C#-based, not Java!) seems to be the champion.

In the Java-based world of XSLT 1.0 processors, Saxon 6.x is again pretty good.

Dimitre Novatchev