We have a process that outputs the contents of a large XML file to System.out.

When this output is pretty-printed (i.e., spread over multiple lines), everything works. But when it's all on one line, Eclipse crashes with an OutOfMemoryError. Any ideas how to prevent this?

+1  A: 

How do you print it on one line?

  • using several System.out.print(String s)
  • using System.out.println(String verybigstring)

In the second case, you need a lot more memory, since the whole output is handled as one string.

If you want to give Eclipse more memory, you could try raising the -Xmx value in eclipse.ini.
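
A minimal sketch of the first option, assuming the XML is already held in a String: it writes the text to the stream in slices rather than in one println call. The class name, method name, and 8 KB chunk size are illustrative assumptions. This only limits the size of each individual write; it may not change how the Eclipse Console view copes with a single very long line.

```java
import java.io.PrintStream;

class ChunkedConsoleWriter {
    // Hypothetical chunk size chosen for illustration; tune as needed.
    static final int CHUNK_SIZE = 8192;

    // Writes text to out in slices of at most CHUNK_SIZE chars.
    static void printInChunks(String text, PrintStream out) {
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + CHUNK_SIZE, text.length());
            // Keep surrogate pairs together so no character is split across writes.
            if (end < text.length() && Character.isHighSurrogate(text.charAt(end - 1))) {
                end--;
            }
            out.append(text, start, end);
            start = end;
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Build a single-line XML string as stand-in data.
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < 10_000; i++) {
            sb.append("<item>").append(i).append("</item>");
        }
        sb.append("</items>");
        printInChunks(sb.toString(), System.out);
    }
}
```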

Fortega
The second option. If we just wrote this to a file using `FileWriter` or similar, would that be more memory efficient?
Marcus
As long as you don't try to load the complete file into a single String, I guess you will be fine. That applies to System.out as much as to FileWriter or anything else.
Fortega
We do load the file into a single String, which works fine. It's only the output that causes a problem. Why does Eclipse need so much memory just to output 5 MB?
Marcus
This should not be specific to Eclipse; it will depend on the Java class libraries. If you have a JDK, drill down into the implementation to see what is happening. I looked at two different VMs, and both come down to Writer.write, which allocates a new char[] the size of the string. This eventually gets sent through a sun.nio.cs.StreamEncoder, for which I don't have source.
Andrew Niefer
@Andrew, why would we then get the exception only with the non-pretty-printed output?
Marcus
Not sure what's happening during the character conversion in the encoder, or what kind of stream is on the other side. Perhaps the newlines trigger a flush, which helps things?
Andrew Niefer
+1  A: 

I'm going to assume that you're building an org.w3c.dom.Document, and writing it using a serializer. If you're hand-building an XML string, you're all but guaranteed to be producing something that's almost-but-not-quite XML, and I strongly suggest fixing that first.

That said, if you're writing to a stream from the serializer (and System.out is a stream), then you should write directly to the stream rather than writing to a String (which you'd do with a StringWriter) and printing that. The reason is that the XML serializer will handle character encodings properly, while going from serializer to String to stream may not.
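
As a rough sketch of writing directly to the stream, the following uses the JDK's built-in javax.xml.transform serializer with a DOMSource and a StreamResult. The class name, helper methods, and sample document are made up for illustration; the original poster's actual serializer setup is not shown in the thread.

```java
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DirectXmlWriter {
    // Builds a tiny stand-in document containing one non-ASCII character.
    static Document buildSampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("orders");
        doc.appendChild(root);
        Element order = doc.createElement("order");
        order.setAttribute("id", "1");
        order.setTextContent("caf\u00e9");
        root.appendChild(order);
        return doc;
    }

    // Serializes the DOM straight to the byte stream; no intermediate String.
    static void write(Document doc, OutputStream out) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // The serializer encodes the bytes and writes a matching XML declaration.
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        write(buildSampleDocument(), System.out);
    }
}
```

Because the StreamResult wraps an OutputStream, the serializer decides how characters become bytes, so the declared encoding and the actual bytes stay in agreement.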


If you're not currently building a DOM, and are concerned about the memory requirements of doing so, then I suggest looking at the Practical XML library (which I maintain), in particular the builder package. It uses lightweight nodes that are then output via a serializer using a SAX transform.


Edit in response to comment:

OK, you've got the serializer covered with XStream. I'm next going to assume that you are calling XStream.toXML(Object) to produce the string, and recommend that you call the variant toXML(Object, OutputStream), and pass it the actual output. The reason for this is that XML is very sensitive to character encoding, which is something that often breaks when converting strings to streams.
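
To illustrate how a String-to-stream conversion can break encoding, here is a small sketch using only the JDK. The sample XML and the class and method names are invented for the example; it simulates a sender that encodes with a charset other than the one declared in the prolog, and a receiver that trusts the prolog.

```java
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // The prolog claims UTF-8; the payload contains one non-ASCII character.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>";

    // Simulates a receiver that trusts the prolog and decodes the bytes as UTF-8.
    static String decodeAsDeclared(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Bytes that match the declaration.
        byte[] good = XML.getBytes(StandardCharsets.UTF_8);
        // Bytes produced with a different charset, as a mismatched default might do.
        byte[] bad = XML.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes round-trip: "
                + XML.equals(decodeAsDeclared(good)));
        System.out.println("ISO-8859-1 bytes round-trip: "
                + XML.equals(decodeAsDeclared(bad)));
    }
}
```

The first check succeeds and the second fails, because the single ISO-8859-1 byte for the accented character is not valid UTF-8. Letting the serializer write to the OutputStream itself avoids this kind of mismatch.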

This may, of course, cause issues with building your POST request, particularly if you're using a library that doesn't provide you an OutputStream.

kdgregory
We are using REST webservices to send/receive XML. We use XStream to do the XML<->POJO conversion. So when we create the request (5MB xml) we have the data in a String which we specify in the body of our `POST` request.
Marcus
+2  A: 

Sounds like it is the Console panel blowing up. Consider limiting its buffer size.

EDIT: It's in Preferences. Search for Console.

Thorbjørn Ravn Andersen
And, BTW, who really wants to see a large one-line XML file on the console? Pretty sure the Console designer didn't foresee that ;-)
Andreas_D