Hi,

I'm working in C#/.NET and parsing a file to check whether a line matches a particular regex. More precisely, I want to find the last line that matches.

To get the lines of my file, I'm currently using the System.IO.StreamReader.ReadLine() method, but as my files are very large, I would like to optimize the code a bit and start from the end of the file.

Does anyone know if C#/.NET has a function similar to ReadLine() that starts from the end of the stream? If not, what would be, in your opinion, the easiest and most efficient way to do the job described above?

A: 

Since you are using a regular expression I think your best option is going to be to read the entire line into memory and then attempt to match it.

Perhaps if you provide us with the regular expression and a sample of the file contents we could find a better way to solve your problem.

Andrew Hare
+6  A: 

Funny you should mention it - yes I have. I wrote a ReverseLineReader a while ago, and put it in MiscUtil.

It was in answer to this question on Stack Overflow - the answer contains the code, although it uses other bits of MiscUtil too.

It will only cope with some encodings, but hopefully all the ones you need. Note that this will be less efficient than reading from the start of the file, if you ever have to read the whole file - all kinds of things may assume a forward motion through the file, so they're optimised for that. But if you're actually just reading lines near the end of the file, this could be a big win :)

(Not sure whether this should have just been a close vote or not...)

Jon Skeet
It's nice to see that anyone can forget to Close or Dispose. :)
MusiGenesis
+1 for how damn thorough your LineReader is. Thanks for making this stuff available <3
womp
ok so the answer is yes there is an easy way: just copy and paste Jon's code :) Thanks
PierrOz
A: 

"Easiest" -vs- "Most optimized"... I don't think you're going to get both

You could open the file and read each line. Each time you find one that fits your criteria, store it in a variable (replacing any earlier instance). When you finish, you will have the last line that matches.
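As a minimal sketch of that forward scan (the class and method names here are made up for illustration, not taken from any library), the idea is to keep overwriting one variable as matches are found:

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class LastMatchForward
{
    // Reads every line from the reader and returns the last one that
    // matches the pattern, or null when no line matches.
    public static string FindLastMatch(TextReader reader, Regex pattern)
    {
        string lastMatch = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (pattern.IsMatch(line))
            {
                lastMatch = line; // replaces any earlier match
            }
        }
        return lastMatch;
    }
}
```

With a file you would pass in a StreamReader, e.g. `using (var reader = new StreamReader(path)) { ... }`. This always reads the whole file, so it is the simple option rather than the fast one.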

You could also use a FileStream to set the position near the end of your file. Go through the steps above, and if no match is found, set your FileStream position earlier in your file, until you DO find a match.
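A rough sketch of this second idea follows. It assumes a seekable stream and UTF-8 (or ASCII) text, and the names and the 64 KB starting window are arbitrary choices for illustration. It reads only a window at the end of the stream, and doubles the window whenever no match turns up:

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class LastMatchFromTail
{
    // Assumes a seekable stream containing UTF-8 text.
    // Returns the last matching line, or null when no line matches.
    public static string FindLastMatch(Stream stream, Regex pattern, int initialWindow = 64 * 1024)
    {
        long length = stream.Length;
        long window = Math.Max(1, initialWindow);
        while (true)
        {
            long start = Math.Max(0, length - window);
            stream.Position = start;
            string lastMatch = null;
            // leaveOpen: true so the stream can be repositioned on the next pass.
            using (var reader = new StreamReader(stream, Encoding.UTF8, true, 4096, leaveOpen: true))
            {
                // When starting mid-file the first line is probably partial, so skip it.
                // If it was a whole line, a wider pass will see it when needed.
                if (start > 0) reader.ReadLine();
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    if (pattern.IsMatch(line)) lastMatch = line;
                }
            }
            // Stop on a match, or once the window has covered the whole stream.
            if (lastMatch != null || start == 0) return lastMatch;
            window *= 2;
        }
    }
}
```

Each wider pass re-reads the tail it already scanned, which keeps the code short at the cost of some repeated work. When the match sits near the end of a large file, only the last window is ever read.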

Brad Bruce
A: 

This ought to do what you're looking for. It might be memory heavy for what you need, but I don't know what your needs are in that area:

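        // Requires: using System.Collections.Generic; using System.IO; using System.Linq;
        // Loads every line into memory, then walks them from last to first.
        string[] lines = File.ReadAllLines("C:\\somefilehere.txt");
        IEnumerable<string> revLines = lines.Reverse();
        foreach (string line in revLines) {
            /* do whatever, e.g. test the regex and break on the first match */
        }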

It would still require reading every line at the outset, but it might be faster than doing a check on each one as you do so.

blesh
I don't think that would work. Even if memory were no object (you need twice the file size in memory), your EOL pair would be backwards. CR - LF would appear as LF - CR. ReadLine would not find anything.
Brad Bruce
lol you're right... thought that one out too quick
blesh
There.. I've updated my answer to work a little better. lmao. Sorry.
blesh
At the cost of a for loop, you could cut the amount of memory in half. Read all of the lines, then loop backwards by index:

        int numLines = lines.Length;
        for (int x = numLines - 1; x > -1; x--) {
            // do whatever with lines[x]
        }
Brad Bruce