Hey,

I'm trying to seek through a RandomAccessFile, and as part of an algorithm I have to read a line and then seek backwards from the end of the line.

E.g.:

String line = raf.readLine();
raf.seek (raf.getFilePointer() - line.length() + m.start() + m.group().length());

//m is a Matcher for regular expressions

I've been getting loads of off-by-one errors and couldn't figure out why. I just discovered it's because some of the files I'm reading have Windows-style line endings, \r\n, and some have just Unix-style \n.

Is there an easy way to have the RandomAccessFile treat all line endings as Windows-style line endings?

+1  A: 

No. RandomAccessFile and related abstractions (including the underlying file systems) model files as an indexable sequence of bytes. They neither know nor care about lines or line terminations.

What you need to do is record the actual positions of line starts rather than trying to figure out where they are based on assumptions about what the line termination sequence is. Alternatively, use a line reader that captures the line termination sequence for each line that it reads, either as part of the line or in an attribute that can be accessed after reading each input line.
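As a rough sketch of the "record the line start" idea: the hypothetical helper class below (LineSpan is my own name, not a JDK class) notes the file pointer before calling readLine() and derives the terminator length from how far the pointer moved. It relies on the documented behaviour that RandomAccessFile.readLine() maps each byte to exactly one char, so line.length() equals the number of content bytes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper: one line plus the file offset where it started. */
final class LineSpan {
    final long start;            // offset of the first byte of the line
    final String text;           // content as returned by readLine(), no terminator
    final int terminatorLength;  // 0 at EOF, 1 for \n or a lone \r, 2 for \r\n

    private LineSpan(long start, String text, int terminatorLength) {
        this.start = start;
        this.text = text;
        this.terminatorLength = terminatorLength;
    }

    /** Reads the next line; returns null at end of file. */
    static LineSpan read(RandomAccessFile raf) throws IOException {
        long start = raf.getFilePointer();
        String text = raf.readLine();
        if (text == null) {
            return null;
        }
        long end = raf.getFilePointer();
        // Bytes consumed minus content bytes = bytes used by the terminator.
        int terminator = (int) (end - start - text.length());
        return new LineSpan(start, text, terminator);
    }

    /** File offset of the character at the given index within the line. */
    long offsetOf(int index) {
        return start + index;
    }
}
```

With this, seeking to just past a regex match no longer depends on the terminator at all: raf.seek(span.offsetOf(m.end())), where m is the Matcher from the question.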

A third option is to convert all the files to use DOS line termination sequences (\r\n) before you open them for random access.
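If converting up front is acceptable, something along these lines might do. This is only a sketch: NewlineNormalizer and its methods are hypothetical names, it loads the whole file into memory, it leaves lone \r characters untouched, and it assumes an ASCII-compatible encoding (it would corrupt UTF-16, for example).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that rewrites bare \n terminators as \r\n. */
final class NewlineNormalizer {

    /** Returns a copy of data in which every \n not preceded by \r becomes \r\n. */
    static byte[] toCrLf(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(data.length + 16);
        byte previous = 0;
        for (byte b : data) {
            if (b == '\n' && previous != '\r') {
                out.write('\r'); // insert the missing carriage return
            }
            out.write(b);
            previous = b;
        }
        return out.toByteArray();
    }

    /** Reads source fully, converts it, and writes the result to target. */
    static void convert(Path source, Path target) throws IOException {
        Files.write(target, toCrLf(Files.readAllBytes(source)));
    }
}
```

After conversion every terminator is two bytes long, so a fixed offset of 2 is safe for every line except a final line that ends at EOF without a terminator.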

Stephen C
This wasn't an option as I had to read the line first to decide if I could backtrack over it. Thanks for the input.
waitinforatrain
+1  A: 

You could always back the stream up two bytes and re-read them to see whether the line ended with \r\n or with a lone \n:

String line = raf.readLine();
raf.seek(raf.getFilePointer()-2);
int offset = raf.read() == '\r' ? 2 : 1;
raf.read(); //discard the second character since you know it is either \n or EOF by definition of readLine
raf.seek (raf.getFilePointer() - (line.length()+offset) + m.start() + m.group().length());

I'm not sure exactly where you are trying to place the file pointer, so adjust the 2/1 constants appropriately. You may also need an extra check for blank lines (\n\n) if they occur in your file; without code to step past them you might get stuck in an infinite loop.
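For what it's worth, the back-up-and-peek check can be wrapped in one method so the edge cases live in a single place. This is a sketch under my own assumptions: LineEndings.terminatorLength is a hypothetical helper, and it must be called immediately after a readLine() that returned a non-null line. It avoids seeking to a negative offset near the start of the file and reports 0 when the last line ends at EOF with no terminator.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical helper for inspecting the terminator of the line just read. */
final class LineEndings {

    /**
     * Returns 2 for \r\n, 1 for a lone \n or a lone \r, and 0 if the line
     * ended at EOF. The file pointer is restored before returning.
     */
    static int terminatorLength(RandomAccessFile raf) throws IOException {
        long end = raf.getFilePointer();
        if (end == 0) {
            return 0; // nothing has been read yet
        }
        raf.seek(end - 1);
        int last = raf.read();
        int length;
        if (last == '\n') {
            length = 1;
            if (end >= 2) {
                raf.seek(end - 2);
                if (raf.read() == '\r') {
                    length = 2; // \r\n pair
                }
            }
        } else if (last == '\r') {
            length = 1; // old Mac-style lone \r
        } else {
            length = 0; // last line of the file, no terminator
        }
        raf.seek(end); // put the pointer back where readLine() left it
        return length;
    }
}
```

The seek from the snippet above then becomes raf.seek(raf.getFilePointer() - (line.length() + LineEndings.terminatorLength(raf)) + m.start() + m.group().length()).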

M. Jessup
Thanks, this is what I had to do in the end. I asked because I had a lot of these reads in the code. At the start of the code I checked for a '\r' at the end of the line. If it matched I'd set a variable to 1, otherwise to 0. Then I just added this variable onto raf.seek(...). Thanks for the help.
waitinforatrain