I have to process a directory containing about 2 million XML files.
I've already solved the processing itself by distributing the work across machines and threads using queues, and that part works fine.
The big problem now is the bottleneck of reading the directory with its 2 million files in order to fill the queues incrementally.
I've tried using the File.listFiles() method, but it fails with java.lang.OutOfMemoryError: Java heap space. Any ideas?
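
For reference, here is a simplified sketch of roughly what the attempt looks like. The class name QueueFiller, the method fillQueue, and the use of a LinkedBlockingQueue are made up for illustration and are not the real code; the point is only that File.listFiles() returns the complete array in one go, so every entry is on the heap before the first file can be queued.

```java
import java.io.File;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueFiller {

    // Hypothetical helper: enqueues every .xml file in dir and returns the count.
    static int fillQueue(File dir, BlockingQueue<File> queue) throws InterruptedException {
        // listFiles() builds the whole File[] before returning, so with
        // ~2 million entries the full listing must fit in the heap at once.
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IllegalArgumentException("Not a readable directory: " + dir);
        }
        int count = 0;
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".xml")) {
                queue.put(f); // workers take files from this queue
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length != 1) {
            System.err.println("usage: java QueueFiller <directory>");
            return;
        }
        BlockingQueue<File> queue = new LinkedBlockingQueue<>();
        int queued = fillQueue(new File(args[0]), queue);
        System.out.println("Queued " + queued + " files");
    }
}
```

On a small directory this runs fine; it only falls over once the listing itself no longer fits in memory.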