I have a log file that can get pretty big.

The entries in my log file follow a fixed format, and I want to retrieve them as separate blocks of data.

For example,

This is the start.

Blah Blah

Blah Blah Blah Blah Blah Blah

Blah

This is the start.

Blah Blah

Blah Blah Blah Blah Blah Blah

Blah Blah Blah Blah Blah Blah

Blah Blah Blah Blah Blah Blah

Blah

I want to get everything from one "This is the start." line up to, but not including, the next "This is the start." line. What is the best way to do this? My code is in C#.

+1  A: 

The following code will split the file into chunks delineated by the "This is the start." line and call a callback method to process each chunk:

// Requires: using System; using System.Collections.Generic; using System.IO;
public static void ProcessInChunks(string inputFilename,
    string delimiter, Action<IEnumerable<string>> processChunk)
{
    using (var enumerator = File.ReadLines(inputFilename).GetEnumerator())
    {
        if (!enumerator.MoveNext())
            // The file is empty.
            return;

        var firstLine = enumerator.Current;
        if (firstLine != delimiter)
            throw new InvalidOperationException(
                "Expected the first line to be a delimiter.");

        List<string> currentChunk = new List<string>();

        while (enumerator.MoveNext())
        {
            if (enumerator.Current == delimiter)
            {
                processChunk(currentChunk);
                currentChunk = new List<string>();
            }
            else
                currentChunk.Add(enumerator.Current);
        }
        processChunk(currentChunk);
    }
}

Usage:

ProcessInChunks(@"myfile.log", "This is the start.",
    chunk => { /* do something here */ });
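
As a variation on the same idea, the splitting can also be sketched as a lazy iterator that hands back one chunk at a time instead of invoking a callback. This is only a sketch: `ChunkReader` and `SplitIntoChunks` are hypothetical names that do not appear in the answer above, and unlike the answer's method it assumes that any lines before the first delimiter should be skipped rather than treated as an error.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkReader
{
    // Hypothetical helper (not from the original answer): lazily groups lines
    // into chunks, starting a new chunk at each line equal to the delimiter.
    public static IEnumerable<List<string>> SplitIntoChunks(
        IEnumerable<string> lines, string delimiter)
    {
        List<string> currentChunk = null;

        foreach (var line in lines)
        {
            if (line == delimiter)
            {
                // A delimiter closes the previous chunk, if there was one.
                if (currentChunk != null)
                    yield return currentChunk;
                currentChunk = new List<string>();
            }
            else if (currentChunk != null)
            {
                currentChunk.Add(line);
            }
            // Assumption: lines before the first delimiter are ignored.
        }

        // Emit the final chunk, which has no trailing delimiter.
        if (currentChunk != null)
            yield return currentChunk;
    }
}
```

Because it accepts any sequence of lines, it could be fed from `File.ReadLines("myfile.log")` in a `foreach` loop, which keeps only one chunk in memory at a time.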
Timwi
Thanks for the answer, Timwi. I will try this. Another question I have: is this the best way to read a large file?
@user393148 — There is no simple, straightforward answer to a broad class of problems in programming. You always need to look at each individual situation. I’ve just edited the answer significantly to make it much more efficient for very large files. My previous version would load the entire file into memory, but the new version processes it incrementally.
Timwi
Thanks Timwi...
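
To illustrate the difference between loading the whole file and processing it incrementally, as discussed in the comments above, here is a minimal sketch that counts delimiter lines both ways. The class and method names are hypothetical. `File.ReadAllLines` and `File.ReadLines` are the standard `System.IO` methods; the first returns a complete `string[]`, while the second returns a lazily evaluated sequence.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LogReadingComparison
{
    // Eager: the whole file is read into an array before counting,
    // so memory use grows with the size of the file.
    public static int CountStartsEager(string path, string delimiter)
    {
        string[] allLines = File.ReadAllLines(path);
        return allLines.Count(line => line == delimiter);
    }

    // Streaming: lines are read on demand as the sequence is enumerated,
    // so only a small part of the file is held in memory at a time.
    public static int CountStartsStreaming(string path, string delimiter)
    {
        return File.ReadLines(path).Count(line => line == delimiter);
    }
}
```

Both methods return the same count; for a large log file the streaming version is likely the safer choice, though the actual difference depends on file size and available memory.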
A: 

If you can't change the log creation process, the answer by @Timwi will work well. If you can adjust the log creation process, you could start a new date-stamped log file every time you would otherwise write the "This is the start." line. This will create multiple log files, but they will already be split in the desired way. Obviously, if the text to find can change, this won't work.
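
A date-stamped file name could be built along the following lines. This is only a sketch: `LogFileNamer`, `NextLogFilePath`, and the `log_yyyyMMdd_HHmmss.log` naming pattern are assumptions made for illustration, not part of any existing logging setup. It also assumes that no two blocks start within the same second; otherwise the names would collide.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class LogFileNamer
{
    // Hypothetical helper: builds a path such as "logs/log_20100715_093005.log"
    // for the block that starts at the given timestamp.
    public static string NextLogFilePath(string directory, DateTime timestamp)
    {
        // The invariant culture keeps the digits and calendar the same on every machine.
        string stamp = timestamp.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return Path.Combine(directory, "log_" + stamp + ".log");
    }
}
```

The logger would call this with `DateTime.Now` (or `DateTime.UtcNow`) at the point where it currently writes the start line, and write the block to the returned path instead.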

Edward Leno
Thanks, Edward. I am working on getting this into a standard format. Until then I have to use the workaround. Thanks.