Hello,

The scenario is the following. I have a plain text file containing 2,000,000 lines, each holding an ID. This list of IDs needs to be converted to a simple XML file. The following code works fine as long as there are only a few thousand entries in the input file.

import groovy.xml.StreamingMarkupBuilder

def xmlBuilder = new StreamingMarkupBuilder()
def f = new File(inputFile)
def input = f.readLines()
def xmlDoc = {
  Documents {
    input.each {
      Document(myAttribute: it)
    }
  }
}

def xml = xmlBuilder.bind(xmlDoc)
new File(outputFile).write(xml.toString())

When all 2,000,000 entries are processed, I get an OutOfMemoryError for the Java heap (set to 1024 MB). Is there a way to improve the above code so that it can handle this amount of data?

Cheers, Robert

+3  A: 

The issue with that solution is that it builds the entire document in memory before writing anything out.

This should work better: it writes the data out to output.xml as it reads input.txt line by line, so only one line is held in memory at a time.

import groovy.xml.MarkupBuilder

new File( 'output.xml' ).withWriter { writer ->
  // MarkupBuilder writes each element straight to the writer
  def builder = new MarkupBuilder( writer )
  builder.Documents {
    // eachLine streams the input instead of loading it all at once
    new File( 'input.txt' ).eachLine { line ->
      Document( myAttribute: line )
    }
  }
}
tim_yates
This is working perfectly. Thanks a lot for your help!
straurob
A: 

Here's your problem: `def input = f.readLines()` ;-) It reads all 2,000,000 lines into a single in-memory list before a single byte of XML is written.
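If you want to keep `StreamingMarkupBuilder` from the original code, a minimal sketch along the same lines (the file names `input.txt` and `output.xml` are placeholders): the bound closure is only evaluated when the document is written out, so reading inside it with `eachLine` streams the IDs instead of holding them all in memory.

```groovy
import groovy.xml.StreamingMarkupBuilder

// sample input; in the real scenario this file holds 2,000,000 IDs
new File('input.txt').text = 'A1\nA2\nA3\n'

def xmlBuilder = new StreamingMarkupBuilder()
def xmlDoc = {
  Documents {
    // evaluated lazily at write time: one line in memory at a time
    new File('input.txt').eachLine { line ->
      Document(myAttribute: line)
    }
  }
}

// bind() returns a Writable; writeTo streams straight into the file
new File('output.xml').withWriter { writer ->
  xmlBuilder.bind(xmlDoc).writeTo(writer)
}
```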

Steven