I am trying to process files stored on a network share, one at a time. Reading the files is fast thanks to buffering, so that is not the issue. The problem is just listing the contents of a folder: I have at least 10k files per folder, across many folders.
Performance is very slow because File.list() returns an array instead of an iterable: Java collects every name in the folder and packs them all into an array before returning, so I cannot start processing the first file until the whole listing is done.
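To make the pattern concrete, this is roughly what my code does today. It is a minimal sketch: the class and method names (EagerListing, countEntries) are made up for illustration, and the loop body stands in for my real per-file processing.

```java
import java.io.File;

class EagerListing {
    // Hypothetical helper: lists a directory with File.list() and
    // returns how many names came back (-1 if listing failed).
    static int countEntries(File dir) {
        // File.list() only returns once every name has been collected.
        String[] names = dir.list();
        if (names == null) {
            return -1; // not a directory, or an I/O error occurred
        }
        for (String name : names) {
            // Per-file processing would go here; it cannot start
            // until the full array above has been built.
        }
        return names.length;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(countEntries(dir) + " entries");
    }
}
```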
The bug entry for this is http://bugs.sun.com/view_bug.do;jsessionid=db7fcf25bcce13541c4289edeb4?bug_id=4285834 and it doesn't list a workaround. It just says this has been fixed for JDK7.
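As far as I can tell, the JDK7 fix is the new java.nio.file API, where Files.newDirectoryStream returns a DirectoryStream that can be iterated entry by entry. A minimal sketch of what I would like to be able to write, assuming that API (LazyListing and processEntries are hypothetical names, and I have not measured whether this is actually faster on a network share):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LazyListing {
    // Hypothetical helper: iterates a directory through a DirectoryStream
    // and returns how many entries were visited.
    static int processEntries(Path dir) throws IOException {
        int count = 0;
        // try-with-resources closes the stream (and the underlying
        // directory handle) when iteration finishes or fails.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                // Per-file processing would go here, one entry at a time,
                // without first building an array of all names.
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(processEntries(dir) + " entries");
    }
}
```

Note that this needs Java 7 both for java.nio.file and for the try-with-resources syntax.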
A few questions:
- Does anybody have a workaround for this performance bottleneck?
- Am I trying to achieve the impossible? Is performance still going to be poor even if the listing is iterated lazily instead of being collected into an array?
- Could I use the beta JDK7 builds that have this functionality without having to build my entire project on them?