I am running the following command on a Unix box:

java -Xms3800m -Xmx3800m org.apache.xalan.xslt.Process -out Cust.txt -in test13l.xml -xsl CustDetails.xsl

This is a Java command that invokes the Xalan processor to transform the XML file (test131.xml) with the XSL stylesheet (CustDetails.xsl) and write the result to Cust.txt.

The command works and the output is generated. It takes 12 minutes to process a 1.1 GB XML file and 22 minutes to process a 1.44 GB file. However, when I try to process a 1.66 GB file, it fails with the following message:

(Location of error unknown)XSLT Error (java.lang.OutOfMemoryError): null

I have already increased the Java heap size to 3800 MB and am not sure what else I can do.

Many thanks for your help.

+1  A: 

Are you running a 64-bit or a 32-bit Java process? How much memory does the system actually have? What is the full stack trace of your OutOfMemoryError? Which JVM version are you running? You can always run JConsole, dump the heap, and open the dump in a tool like Eclipse MAT to see which objects are occupying the heap. Depending on your JVM version, you can also run your process with -XX:+HeapDumpOnOutOfMemoryError and open the dump after the Java process runs out of memory.
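Some of these questions can be answered from inside the JVM itself. The following is a minimal sketch, not part of the original setup: the class name JvmInfo is made up for illustration, and sun.arch.data.model is a non-standard system property that HotSpot-based JVMs usually set, so the code falls back to "unknown" when it is missing.

```java
// Hypothetical helper: prints the JVM facts asked about above.
final class JvmInfo {

    /** Maximum heap the JVM will attempt to use, in MiB (reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /**
     * "32" or "64" on JVMs that expose sun.arch.data.model (assumption:
     * a HotSpot-style JVM); "unknown" otherwise.
     */
    static String dataModel() {
        return System.getProperty("sun.arch.data.model", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("os.arch      = " + System.getProperty("os.arch"));
        System.out.println("data model   = " + dataModel());
        System.out.println("max heap     = " + maxHeapMiB() + " MiB");
    }
}
```

Running it with the same -Xms/-Xmx options as the failing command shows whether the JVM really received the heap that was requested.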

Amir Afghani
A: 

Try setting -XX:MaxPermSize=256m

Bozho
A: 

Building a giant DOM tree in memory is not the way to go. Find a way to feed your XSL transformer with XML events through the SAX or StAX API. Never use the DOM API on XML files that massive (1.1 GB sounds scary).

If you are using Java 6, take a look at the packages javax.xml.transform.sax and javax.xml.transform.stax to see what your solution needs to implement for this to work.
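As a rough sketch of the javax.xml.transform.sax route, the input can be handed to the transformer as a SAXSource, so that no separate DOM is built in front of it. The class name, the sample XML and the sample stylesheet below are made up for illustration, and in-memory strings stand in for the real files. Note that the XSLT processor may still build its own internal representation of the whole input, so this is not guaranteed to remove the memory limit by itself.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical example class: transforms XML delivered as SAX events.
final class SaxSourceTransform {

    static String transform(String xml, String xsl) throws Exception {
        // Namespace-aware SAX parser that will supply the input events.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();

        // Wrap parser + input as a SAXSource instead of building a DOM first.
        SAXSource input = new SAXSource(reader, new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));

        StringWriter out = new StringWriter();
        transformer.transform(input, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer name=\"Ann\"/><customer name=\"Bob\"/></customers>";
        String xsl =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:for-each select=\"customers/customer\">"
          + "<xsl:value-of select=\"@name\"/><xsl:text>;</xsl:text>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl)); // prints Ann;Bob;
    }
}
```

For real files, the StringReader-based sources would be replaced by an InputSource and a StreamSource that point at the files on disk.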

dimitko
+1  A: 

Xalan can use a DOM or SAX parser underneath.

DOM parsers usually read the whole file at once and build a tree out of it, consuming lots of memory on large files.

SAX parsers, on the other hand, fire events while parsing and therefore don't need to keep the whole file in memory (but you can't access the tree as easily).
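To illustrate the event model, here is a small sketch that counts elements with a SAX handler. The class name and the customer element are made up for illustration. The handler only keeps a counter, so memory use does not grow with the size of the input.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts elements without building a tree.
final class SaxElementCounter {

    /** Counts start tags whose qualified name equals the given name. */
    static int countElements(InputSource in, final String name) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // Called once per start tag as the parser streams through the input.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return count[0];
    }

    /** Convenience wrapper for in-memory XML, used for the demo below. */
    static int countElementsInString(String xml, String name) throws Exception {
        return countElements(new InputSource(new StringReader(xml)), name);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer/><customer/></customers>";
        System.out.println(countElementsInString(xml, "customer")); // prints 2
    }
}
```

The trade-off mentioned above is visible here: the handler sees one tag at a time and has to keep any context it needs (parent elements, earlier values) itself.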

Make sure your Xalan uses a SAX parser underneath. You can find a description of how to do this here: http://xml.apache.org/xalan-j/usagepatterns.html#sax
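One way to wire this up with the standard JAXP interfaces is to register a TransformerHandler as the SAX parser's ContentHandler, so the parser pushes events straight into the transformation. This is only a sketch under assumptions: the class name, sample XML and sample stylesheet are made up, in-memory strings stand in for the real files, and it relies on whichever TransformerFactory is on the classpath supporting SAXTransformerFactory.FEATURE.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical example class: the SAX parser drives the transformation.
final class SaxDrivenTransform {

    static String transform(String xml, String xsl) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("TransformerFactory does not support SAX input");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // The handler receives SAX events and applies the stylesheet to them.
        TransformerHandler handler =
                stf.newTransformerHandler(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(handler);

        // Parsing pushes the events into the handler; output is complete afterwards.
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<customers><customer name=\"Ann\"/><customer name=\"Bob\"/></customers>";
        String xsl =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\">"
          + "<xsl:for-each select=\"customers/customer\">"
          + "<xsl:value-of select=\"@name\"/><xsl:text>;</xsl:text>"
          + "</xsl:for-each>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(transform(xml, xsl)); // prints Ann;Bob;
    }
}
```

The design choice here is that the parser, not the transformer, owns the read loop, which also makes it possible to put SAX filters between the two.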

Hardcoded