I have an ASCII-formatted file with 250k+ lines of text on which I need to perform two steps:

1) Scan through the entire file and delineate sections by matching a given regular expression pattern.

2) Read each section of data and parse subsections from it.

One option is a line-oriented scan of the file using a BufferedReader: test each line for a match and store the line numbers of the matches.
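For reference, the line-oriented scan could look roughly like the sketch below. The class name `SectionScanner`, the method `findSectionStarts`, and the `== ... ==` header pattern are placeholders made up for illustration; the real delimiter pattern depends on the file format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: records where each section starts.
class SectionScanner {

    /** Returns the 1-based numbers of all lines in which the delimiter pattern is found. */
    static List<Integer> findSectionStarts(Reader source, Pattern delimiter) {
        List<Integer> starts = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (delimiter.matcher(line).find()) {
                    starts.add(lineNumber);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return starts;
    }

    public static void main(String[] args) {
        // Small in-memory sample standing in for the real 250k-line file.
        String sample = "== A ==\nfoo\nbar\n== B ==\nbaz\n";
        Pattern header = Pattern.compile("^== .* ==$"); // assumed delimiter format
        System.out.println(findSectionStarts(new StringReader(sample), header)); // prints [1, 4]
    }
}
```

For the actual file, the `StringReader` would be replaced by something like `Files.newBufferedReader(path, StandardCharsets.US_ASCII)`.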

Are there more efficient options, perhaps using the java.nio package?

A: 

Perhaps pump the file through a chain of streams: one stream that passes only the sections matching your regular expression, followed by a stream that performs the parsing step.

e.g.

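// RegexFilterOutputStream and ParsingStuffOutputStream are hypothetical classes
// you would write yourself; "input.txt" is a placeholder path. Needs java.io.*.
try (InputStream in = new FileInputStream("input.txt");
     OutputStream os = new RegexFilterOutputStream(
                       new ParsingStuffOutputStream())) {
    byte[] buffer = new byte[8192];
    int n;
    while ((n = in.read(buffer)) != -1) {
        os.write(buffer, 0, n);   // write stuff from input to os
    }
}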
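To make the idea a bit more concrete, here is one possible minimal sketch of the filtering stage, built on `java.io.FilterOutputStream`. It makes several assumptions that are not part of the suggestion above: it filters line by line rather than by whole sections, the constructor takes the pattern as an extra argument, and a `ByteArrayOutputStream` stands in for the parsing stream. `StreamChainDemo` is just a made-up driver class.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

/** Hypothetical filter: forwards only the lines in which the pattern is found. */
class RegexFilterOutputStream extends FilterOutputStream {
    private final Pattern pattern;
    private final ByteArrayOutputStream lineBuffer = new ByteArrayOutputStream();

    RegexFilterOutputStream(OutputStream downstream, Pattern pattern) {
        super(downstream);
        this.pattern = pattern;
    }

    // FilterOutputStream routes write(byte[], int, int) through write(int),
    // so overriding this one method covers all writes.
    @Override
    public void write(int b) throws IOException {
        lineBuffer.write(b);
        if (b == '\n') {
            emitLineIfMatching();
        }
    }

    @Override
    public void close() throws IOException {
        emitLineIfMatching(); // the last line may lack a trailing newline
        super.close();
    }

    private void emitLineIfMatching() throws IOException {
        if (lineBuffer.size() == 0) {
            return;
        }
        byte[] bytes = lineBuffer.toByteArray();
        lineBuffer.reset();
        // Strip the line terminator before matching, but forward the bytes unchanged.
        int end = bytes.length;
        while (end > 0 && (bytes[end - 1] == '\n' || bytes[end - 1] == '\r')) {
            end--;
        }
        String line = new String(bytes, 0, end, StandardCharsets.US_ASCII);
        if (pattern.matcher(line).find()) {
            out.write(bytes);
        }
    }
}

class StreamChainDemo {
    /** Pushes the input through the filter and returns what reached the downstream stream. */
    static String filter(String input, String regex) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the parsing stage
        try (OutputStream os = new RegexFilterOutputStream(sink, Pattern.compile(regex))) {
            os.write(input.getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.print(filter("keep 1\ndrop\nkeep 2\n", "^keep"));
    }
}
```

A section-aware version would need to keep state between lines (for example, "currently inside a matching section"), which this sketch leaves out.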
Adrian