I am crunching through many gigabytes of text data and I was wondering if there is a way to improve performance. For example, when going through 10 gigabytes of data without processing it at all, just iterating line by line, it takes about 3 minutes.
Basically I have a dataIterator wrapper that contains a BufferedReader. I call this iterator repeatedly, and it returns the next line.
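For reference, the wrapper is roughly of this shape (a minimal sketch; `LineIterator` and its details are my illustration here, not the actual dataIterator code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Sketch of an Iterator<String> wrapped around a BufferedReader,
// returning one line per next() call.
public class LineIterator implements Iterator<String> {
    private final BufferedReader reader;
    private String nextLine;

    public LineIterator(Reader in) throws IOException {
        this.reader = new BufferedReader(in);
        this.nextLine = reader.readLine(); // read ahead one line
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public String next() {
        if (nextLine == null) throw new NoSuchElementException();
        String current = nextLine;
        try {
            nextLine = reader.readLine(); // each call allocates a new String
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return current;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory example in place of the real multi-gigabyte file.
        LineIterator it = new LineIterator(new StringReader("a\nb\nc"));
        int count = 0;
        while (it.hasNext()) {
            it.next();
            count++;
        }
        System.out.println(count); // prints 3
    }
}
```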
Is the problem the number of strings being created? Or perhaps the number of function calls? I don't really know how to profile this application because it gets compiled as a jar and used as a STAF service.
Any and all ideas appreciated.