Hello,

Is there any method to read a specific line from a text file, either in the standard API or in Apache Commons? Something like:

String readLine(File file, int lineNumber)

I agree it's trivial to implement, but it's not very efficient, especially if the file is very big.

Thanks in advance

+6  A: 
String line = FileUtils.readLines(file).get(lineNumber);

would do, but it still has the efficiency problem.

Alternatively, you can use:

 LineIterator it = IOUtils.lineIterator(
       new BufferedReader(new FileReader("file.txt")));
 for (int lineNumber = 0; it.hasNext(); lineNumber++) {
    String line = (String) it.next();
    if (lineNumber == expectedLineNumber) {
        return line;
    }
 }

This will be more memory-efficient, because the iterator holds only one line at a time instead of the whole file.

Take a look at Scanner.skip(..) and attempt skipping whole lines (with regex). I can't tell if it will be more efficient - benchmark it.

P.S. By efficiency I mean memory efficiency.

Bozho
What is FileUtils? Is it from Java 7?
finnw
no, it is commons-io - the library specified in the question
Bozho
Efficiency? I think the real problem is that the first solution reads the **whole** file to memory...
abyx
yes, that's why it has an _efficiency problem_, a memory efficiency problem in particular.
Bozho
The input file can have thousands of lines (documents to be loaded into the DB), so reading the whole file into memory is ruled out.
Lluis Martinez
+1  A: 

If the lines you were reading were all the same length, then a calculation might be useful.

But in the situation when the lines are different lengths, I don't think there's an alternative to reading them one at a time until the line count is correct.
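
For the fixed-length case, the calculation could look roughly like the sketch below. The class name `FixedLengthLines`, the zero-based `lineIndex`, and the `recordLength` parameter (bytes per line, not counting the separator) are all assumptions made for illustration. The sketch also assumes a single-byte encoding and guesses the separator ("\r\n" or "\n") by peeking at the byte right after the first record.

```java
import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;

// Hypothetical helper: jumps straight to a line when every line has the same byte length.
final class FixedLengthLines {

    /**
     * Reads the line at the given zero-based index. Every line is assumed to hold
     * exactly recordLength bytes, not counting the line separator.
     */
    static String readLine(File file, int lineIndex, int recordLength, Charset charset)
            throws IOException {
        if (lineIndex < 0 || recordLength < 0) {
            throw new IllegalArgumentException("lineIndex and recordLength must not be negative");
        }
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            // Peek at the byte after the first record to tell "\r\n" from "\n".
            int separatorLength = 1;
            if (raf.length() > recordLength) {
                raf.seek(recordLength);
                if (raf.read() == '\r') {
                    separatorLength = 2;
                }
            }
            // Each earlier line occupies recordLength + separatorLength bytes.
            long offset = (long) lineIndex * (recordLength + separatorLength);
            if (offset + recordLength > raf.length()) {
                throw new EOFException("File has no line with index " + lineIndex);
            }
            byte[] record = new byte[recordLength];
            raf.seek(offset);
            raf.readFully(record);
            return new String(record, charset);
        }
    }
}
```

A call such as `FixedLengthLines.readLine(file, 41, 80, StandardCharsets.ISO_8859_1)` would then fetch the 42nd line of an 80-byte-per-line file without reading the lines before it. The charset here is only a stand-in for whatever single-byte encoding the file really uses.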

pavium
And "same length" means same length in bytes, not characters (with variable length character encoding in mind)
MBO
Actually the input file is fixed length and ANSI; I forgot to specify this in the question. The problem could be the line separator, since the application must run on both Windows and Unix.
Lluis Martinez
+1  A: 

If you are going to work with the same file in the same way (looking for text at a certain line), you can index your file: line number -> offset.
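
A minimal sketch of such an index, assuming the file does not change after indexing and uses an encoding in which the byte 0x0A always means a line feed (ASCII, ISO-8859-1, UTF-8 and similar, but not UTF-16). The class name `LineIndex` and its methods are hypothetical, and line indexes are zero-based here.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one full scan builds the index, later lookups seek directly.
final class LineIndex {
    private final File file;
    private final Charset charset;
    // offsets.get(i) is the byte offset at which line i (zero-based) starts.
    private final List<Long> offsets = new ArrayList<>();
    private final long fileLength;

    LineIndex(File file, Charset charset) throws IOException {
        this.file = file;
        this.charset = charset;
        long position = 0;
        offsets.add(0L);
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                position++;
                if (b == '\n') {
                    offsets.add(position); // the next line starts right after the '\n'
                }
            }
        }
        fileLength = position;
        // A trailing newline (or an empty file) leaves a start offset with no line behind it.
        if (offsets.get(offsets.size() - 1) == fileLength) {
            offsets.remove(offsets.size() - 1);
        }
    }

    int lineCount() {
        return offsets.size();
    }

    /** Reads one line without its separator; throws IndexOutOfBoundsException if it does not exist. */
    String readLine(int lineIndex) throws IOException {
        long start = offsets.get(lineIndex);
        long end = lineIndex + 1 < offsets.size() ? offsets.get(lineIndex + 1) : fileLength;
        byte[] bytes = new byte[(int) (end - start)];
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(start);
            raf.readFully(bytes);
        }
        // Strip a trailing "\n" or "\r\n" so Windows and Unix files give the same result.
        int length = bytes.length;
        if (length > 0 && bytes[length - 1] == '\n') {
            length--;
            if (length > 0 && bytes[length - 1] == '\r') {
                length--;
            }
        }
        return new String(bytes, 0, length, charset);
    }
}
```

Building the index still costs one pass over the whole file and one `Long` per line, so it only pays off when the same file is queried repeatedly.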

Mykola Golubyev
+3  A: 

Not that I'm aware of.

Be aware that files carry no index of where each line starts, so any utility method would be exactly as efficient as:

BufferedReader r = new BufferedReader(new FileReader(file));
for (int i = 0; i < lineNumber - 1; i++)
{
   r.readLine();
}
return r.readLine();

(with appropriate error-handling and resource-closing logic, of course).
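
One way the error handling and resource closing might be filled in is sketched below, assuming Java 7 or later for try-with-resources. The class name `LineReader`, the explicit `Charset` parameter, and the choice to return `null` when the file is too short are illustrative assumptions. The line number is 1-based, as in the loop above.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;

// Hypothetical helper wrapping the sequential scan in a reusable method.
final class LineReader {

    /**
     * Returns line number lineNumber (1-based), or null if the file has fewer lines.
     */
    static String readLine(File file, int lineNumber, Charset charset) throws IOException {
        if (lineNumber < 1) {
            throw new IllegalArgumentException("lineNumber must be at least 1");
        }
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader reader = Files.newBufferedReader(file.toPath(), charset)) {
            // Skip the lines before the one we want, stopping early at end of file.
            for (int i = 0; i < lineNumber - 1; i++) {
                if (reader.readLine() == null) {
                    return null;
                }
            }
            return reader.readLine();
        }
    }
}
```

Only one line is held in memory at a time, and `BufferedReader.readLine()` accepts "\n", "\r" and "\r\n" as terminators, so the same code handles Windows and Unix files.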

Andrzej Doyle
+1  A: 

Unfortunately, unless you can guarantee that every line in the file is the exact same length, you're going to have to read through the whole file, or at least up to the line you're after.

The only way you can count the lines is to look for the new line characters in the file, and this means you're going to have to read each byte.

It will be possible to optimise your code to make it neat and readable, but underneath you'll always be reading the whole file.

If you're going to be reading the same file over and over again, you could parse the file and create an index storing the offsets of certain line numbers, for example the byte offsets at which lines 100, 200 and so on start.

Dave Webb
+1  A: 

Because files are byte-oriented and not line-oriented, any general solution's complexity will be O(n) at best, with n being the file's size in bytes. You have to scan the file and count the line delimiters until you know which part of the file you want to read.

Andreas_D
A: 

Guava has something similar:

List<String> Files.readLines(File file, Charset charset);

So you can do

String line = Files.readLines(file, Charsets.UTF_8).get(lineNumber);
finnw