Hi,

I run into the following error when I try to read a large file into a String.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:515)
    at java.lang.StringBuffer.append(StringBuffer.java:306)
    at rdr2str.ReaderToString.main(ReaderToString.java:52)

As is evident, I am running out of heap space. My program looks something like this:

FileReader fr = new FileReader("<filepath>");
StringBuffer sb = new StringBuffer();
char[] b = new char[BLKSIZ];
int n;

// Append each block that was read until the end of the file
while ((n = fr.read(b)) > 0)
     sb.append(b, 0, n);

String fileString = sb.toString();

Can someone suggest why I am running into this heap space error? Thanks.

+1  A: 

By default, Java starts with a very small maximum heap (64M on Windows at least). Is it possible you are trying to read a file that is too large?

If so, you can increase the heap with the JVM parameter -Xmx256M (which sets the maximum heap to 256 MB).

I tried running a slightly modified version of your code:

public static void main(String[] args) throws Exception{
    FileReader fr = new FileReader("<filepath>");
    StringBuffer sb = new StringBuffer();
    char[] b = new char[1000];
    int n = 0;
    while ((n = fr.read(b)) > 0) 
         sb.append(b, 0, n);    

    String fileString = sb.toString();
    System.out.println(fileString);
}

on a small file (2 KB) and it worked as expected. You will need to set the JVM parameter.

Kris
@Kris: Thanks. My program works for small files too. It's just that I can't tweak the JVM options, as this piece of code goes into a client that accepts files of varying size and should ideally convert them to a String and pass them to a web service.
Deepak Konidena
+2  A: 
  • You allocate a small StringBuffer that gets longer and longer. Preallocate according to file size, and you will also be a LOT faster.

  • Note that Java strings are Unicode (16-bit chars) while the file likely is not, so you use twice the file's size in memory.

  • Depending on VM (32 bit? 64 bit?) and the limits set (http://www.devx.com/tips/Tip/14688) you may simply not have enough memory available. How large is the file actually?

TomTom
The file is about 30MB. I tried preallocating the StringBuffer with the filesize in bytes, but it wouldn't allocate so much.
Deepak Konidena
@Deepak: if you cannot preallocate to the expected size then you certainly cannot read it in incrementally. You must increase the heap size (and as TomTom indicated this will be at least double so your 30MB file will require at least 60MB heap space). Note that when you convert that to a string (`StringBuffer.toString()`), many current implementations create a new `String` which means you need double again (i.e. 120MB heap space). Or, you can do this incrementally some way.
Kevin Brock
+1  A: 

Kris has the answer to your problem.

You could also look at Apache Commons IO's FileUtils.readFileToString, which may be a bit more efficient.

extraneon
+4  A: 

You are running out of memory because, the way you've written your program, it has to store the entire, arbitrarily large file in memory. You have two options:

  • You can increase the memory by passing command line switches to the JVM:

    java -Xms<initial heap size> -Xmx<maximum heap size>
    
  • You can rewrite your logic so that it deals with the file data as it streams in, thereby keeping your program's memory footprint low.

I recommend the second option. It's more work but it's the right way to go.
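
A minimal sketch of what the second option could look like, assuming the receiving side can take its data through a java.io.Writer (for example, one wrapping a network output stream). The class name StreamingCopy, the method copy, and the 8192-character block size are hypothetical and not from the original post. One fixed-size buffer is reused, so heap use stays constant no matter how large the file is.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper: forwards characters from a source to a destination
// in fixed-size chunks, so the whole file is never held in memory at once.
class StreamingCopy {
    private static final int BLOCK_SIZE = 8192; // assumed chunk size

    // Copies everything from 'in' to 'out' and returns the number of chars copied.
    static long copy(Reader in, Writer out) throws IOException {
        char[] buffer = new char[BLOCK_SIZE];
        long total = 0;
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Whether this applies depends on the web service client: it only helps if that client exposes a stream or writer rather than requiring a complete String.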

EDIT: To determine your system's defaults for initial and max heap size, you can use this code snippet (which I stole from a JavaRanch thread):

public class HeapSize {
    public static void main(String[] args) {
        long kb = 1024;
        // Heap currently reserved by the JVM, and the limit it may grow to
        long heapSize = Runtime.getRuntime().totalMemory();
        long maxHeapSize = Runtime.getRuntime().maxMemory();
        System.out.println("Heap Size (KB): " + heapSize / kb);
        System.out.println("Max Heap Size (KB): " + maxHeapSize / kb);
    }
}
Asaph
Heap Size (KB): 81280
Max Heap Size (KB): 83392
Deepak Konidena
@Asaph: The problem with setting the heap size is that I am using this code in a client that consumes a web service. The client takes a file, converts it to a String, and passes it to the web service. So I doubt the JVM option would help. Thanks.
Deepak Konidena
I would be interested in knowing more about the second option though. Are you suggesting some sort of synchronisation mechanism?
Deepak Konidena
@Deepak Konidena: If you're dealing with XML data coming back from a web service, I recommend using a SAX parser. A SAX parser processes XML on a tag by tag basis as the data streams in which keeps the memory footprint low.
Asaph
@Asaph: No, I need to send a plain text file as a String.
Deepak Konidena
@Deepak Konidena: Ok, can you send the bytes along in the same loop that you're reading them in from the file?
Asaph
+1  A: 

Although this might not solve your problem, some small things you can do to make your code a bit better:

  • create your StringBuffer with an initial capacity the size of the String you are reading
  • close your FileReader at the end: fr.close();
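
The two suggestions above might be combined as in the following sketch. The class name PreallocatedRead and the method readFile are hypothetical. It assumes Java 7 or later for try-with-resources, swaps in StringBuilder for StringBuffer (no synchronization is needed here), and uses the file length in bytes as the initial capacity, which is an upper bound on the character count for common encodings. It does not reduce the total memory needed. It only avoids repeated buffer growth and makes sure the reader is closed.

```java
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper: preallocates the buffer and always closes the reader.
class PreallocatedRead {
    static String readFile(File file) throws IOException {
        long length = file.length(); // size in bytes, used as the capacity estimate
        if (length > Integer.MAX_VALUE) {
            throw new IOException("File too large for a single String: " + length);
        }
        StringBuilder sb = new StringBuilder((int) length);
        char[] block = new char[8192]; // assumed block size
        // try-with-resources closes the reader even if read() throws
        try (Reader reader = new FileReader(file)) {
            int n;
            while ((n = reader.read(block)) != -1) {
                sb.append(block, 0, n);
            }
        }
        return sb.toString();
    }
}
```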
Fortega
+1  A: 

Your program is aborting while the StringBuffer is being expanded. You should preallocate it to the size you need, or at least close to it. When a StringBuffer must expand, it needs RAM for both the original capacity and the new capacity. As TomTom said too, your file likely contains 8-bit characters, which are converted to 16-bit Unicode chars in memory, so it doubles in size.

The program has not yet even reached the next doubling: in Java 6, StringBuffer.toString() allocates a new String and copies the internal char[] again (in some earlier versions of Java this was not the case). At the time of this copy you need double the heap space, so at that moment you need at least 4 times your actual file size (30MB * 2 for byte->Unicode, then 60MB * 2 for the toString() call = 120MB). Once this method has finished, the GC will clean up the temporary objects.
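
The arithmetic above can be written out as a rough estimate. The class HeapEstimate and the method peakBytes are hypothetical names. The sketch assumes a single-byte file encoding, 2-byte chars in memory, and a toString() that copies the char[], as described above. It ignores all other JVM overhead.

```java
// Rough estimate only; ignores JVM overhead and every other allocation.
class HeapEstimate {
    // Assumes one byte per character on disk, two bytes per char in memory,
    // and a toString() call that copies the internal char[] once.
    static long peakBytes(long fileBytes) {
        long charsInMemory = fileBytes * 2;         // byte -> 16-bit char
        long withToStringCopy = charsInMemory * 2;  // buffer plus the String copy
        return withToStringCopy;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println(peakBytes(30 * mb) / mb + " MB"); // prints "120 MB"
    }
}
```

For comparison, the maximum heap reported earlier in this thread was 83392 KB, roughly 81 MB, which is below this estimate.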

If you cannot increase the heap space for your program you will have some difficulty. You cannot take the "easy" route and just return a String. You can try to do this incrementally so that you do not need to worry about the file size (one of the best solutions).

Look at your web service code in the client. It may provide a way to use a class other than String, perhaps a java.io.Reader, a java.lang.CharSequence, or a special interface such as the SAX-related org.xml.sax.InputSource. Each of these can be used to build an implementation class that reads from your file in chunks as the caller needs it, instead of loading the whole file at once.

For instance, if your web service handling routines can take a CharSequence then (if they are written well) you can create a special handler that returns just one character at a time from the file, but buffers the input. See this similar question: http://stackoverflow.com/questions/2148394/how-to-deal-with-big-strings-and-limited-memory.
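
A minimal sketch of such a handler, with several assumptions: the class name FileCharSequence and the 8192-byte window are hypothetical, and the file is assumed to be in a single-byte encoding (ISO-8859-1) so that a character index equals a byte offset. It needs Java 8 or later for UncheckedIOException and is not thread-safe. Only one small window of the file is held in memory at a time.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical sketch: exposes a file as a CharSequence through a small buffer.
// Assumes a single-byte encoding, so character index == byte offset.
class FileCharSequence implements CharSequence {
    private static final int WINDOW = 8192;    // assumed buffer size
    private final RandomAccessFile file;
    private final int start;                   // file offset of index 0
    private final int length;
    private final byte[] window = new byte[WINDOW];
    private long windowStart = -1;             // file offset of window[0]; -1 = empty
    private int windowLength = 0;

    FileCharSequence(RandomAccessFile file) throws IOException {
        this(file, 0, (int) Math.min(file.length(), Integer.MAX_VALUE));
    }

    private FileCharSequence(RandomAccessFile file, int start, int length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        long pos = (long) start + index;
        // Refill the window only when the requested position lies outside it
        if (windowStart < 0 || pos < windowStart || pos >= windowStart + windowLength) {
            load(pos);
        }
        return (char) (window[(int) (pos - windowStart)] & 0xFF);
    }

    private void load(long pos) {
        try {
            file.seek(pos);
            int n = file.read(window, 0, WINDOW);
            if (n <= 0) {
                throw new EOFException("unexpected end of file at offset " + pos);
            }
            windowStart = pos;
            windowLength = n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > length || from > to) {
            throw new IndexOutOfBoundsException("from " + from + ", to " + to);
        }
        return new FileCharSequence(file, start + from, to - from);
    }

    @Override
    public String toString() { // materializes the content; avoid on large ranges
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }
}
```

This only helps if the consuming code reads the sequence mostly in order and never calls toString() on the whole thing; a consumer that does so brings back the original memory problem.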

Kevin Brock
+1  A: 

Trying to read an arbitrarily large file into main memory in an application is bad design. Period. No amount of JVM settings adjustment is going to fix the core issue here. I recommend that you take a break and do some googling and reading about how to process streams in Java - here's a good tutorial and here's another good tutorial to get you started.

Kevin Day