I have some data processing code which uses the following recipe:
- Read in as much data as will fit in memory (call this a 'chunk')
- Perform processing on the chunk
- Write out processed chunk to disk
- Repeat
- ...
- Merge all the processed chunks to get the final answer.
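
For concreteness, the recipe above could be sketched roughly as follows. This is only an illustration under assumptions that are not part of my real code: the class `ChunkedSort` and its methods are hypothetical names, "processing" is taken to be sorting integers, and a fixed element count stands in for "as much as will fit in memory".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration of the read / process / write / merge recipe.
class ChunkedSort {

    // chunkSize is an assumed stand-in for "as much data as fits in memory".
    static List<Path> writeSortedChunks(Iterator<Integer> input, int chunkSize) throws IOException {
        List<Path> chunkFiles = new ArrayList<>();
        while (input.hasNext()) {
            // Read in one chunk.
            List<Integer> chunk = new ArrayList<>();
            while (input.hasNext() && chunk.size() < chunkSize) {
                chunk.add(input.next());
            }
            // Process the chunk (sorting is an assumed example of "processing").
            Collections.sort(chunk);
            // Write the processed chunk to disk, one value per line.
            List<String> lines = new ArrayList<>();
            for (int value : chunk) {
                lines.add(Integer.toString(value));
            }
            Path file = Files.createTempFile("chunk", ".txt");
            Files.write(file, lines);
            chunkFiles.add(file);
        }
        return chunkFiles;
    }

    // One open chunk file plus the value currently at its head.
    static final class Cursor {
        final BufferedReader reader;
        int head;

        Cursor(BufferedReader reader, int head) {
            this.reader = reader;
            this.head = head;
        }
    }

    // K-way merge of the sorted chunk files. The result is collected in a list
    // only to keep the sketch short; real code would stream it to an output file.
    static List<Integer> merge(List<Path> chunkFiles) throws IOException {
        PriorityQueue<Cursor> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a.head, b.head));
        for (Path file : chunkFiles) {
            BufferedReader reader = Files.newBufferedReader(file);
            String line = reader.readLine();
            if (line != null) {
                queue.add(new Cursor(reader, Integer.parseInt(line)));
            } else {
                reader.close();
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            Cursor cursor = queue.poll();
            result.add(cursor.head);
            String line = cursor.reader.readLine();
            if (line != null) {
                cursor.head = Integer.parseInt(line);
                queue.add(cursor); // re-insert with its new head value
            } else {
                cursor.reader.close();
            }
        }
        return result;
    }
}
```

The merge keeps one open reader per chunk file, which is one reason fewer, larger chunks are preferable.
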
This last stage is most efficient when there are as few chunks as possible, so I want each chunk to hold as much data as will fit in memory. I can size a chunk by querying Runtime.getRuntime().freeMemory().
However, for that number to be useful I have to call System.gc() first; otherwise the value returned by freeMemory() is much smaller than the amount of memory I could safely allocate.
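
A minimal sketch of the measurement described above, for anyone who wants to reproduce it. The class name `FreeMemoryProbe` and its method names are hypothetical, and the numbers printed will vary by JVM, collector, and heap settings.

```java
// Hypothetical probe comparing freeMemory() before and after an explicit GC request.
class FreeMemoryProbe {

    // Free bytes within the heap the JVM has currently reserved.
    static long freeBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    // Same reading, taken after asking for a collection.
    // System.gc() is only a request; the JVM is not obliged to collect immediately.
    static long freeBytesAfterGc() {
        System.gc();
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so the two readings can differ.
        for (int i = 0; i < 1_000; i++) {
            byte[] garbage = new byte[64 * 1024];
            garbage[0] = 1;
        }
        System.out.println("free before gc: " + freeBytes());
        System.out.println("free after gc:  " + freeBytesAfterGc());
    }
}
```

The first reading still counts unreachable objects as used space, which is why it can badly understate what is actually available.
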
I have heard a number of authorities say that calling System.gc() explicitly is a bad idea. Is there any way I can avoid this?