+3  Q: 

Multiple File I/O

Hi All,

Sorry about the lame title.

I have a project I'm working on and I would appreciate any suggestions as to how I should go about doing the IO stuff.

Ok, I have 3 text files. One file contains many many lines of text. This is the same for the other 2 files. Let's call them File1, File2 and File3.

I need to create a text file, for the sake of explaining, I'll name it Result.txt.

Here's what needs to be done:

  1. Extract the first line of text from File1 and append it to Result.txt.
  2. Extract the first line of text from File2 and append it to the end of the first line in Result.txt.
  3. Extract the first line of text from File3 and append it to the end of the first line in Result.txt.
  4. Create a new line in Result.txt.
  5. Repeat from 1 to 4.

Note: These files can be quite large.

Anyone have any ideas as to how to best approach this?

Thank you

-

Thank you all for your very helpful answers. I've learned a lot from your advice and code samples!

+3  A: 

Here we go:

// Stacked using statements dispose the readers as well as the writer.
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{
    while (!file1.EndOfStream || !file2.EndOfStream || !file3.EndOfStream)
    {
        result.Write(file1.ReadLine() ?? "");
        result.Write(file2.ReadLine() ?? "");
        result.WriteLine(file3.ReadLine() ?? "");
    }
}

I built something similar a few months ago, but using a slightly different approach:

  • Create two threads, one for reading and another for writing.
  • On the first thread, read your input files, format a result line and append it to a StringBuilder.
  • When the StringBuilder grows larger than n characters, place its contents on a queue and signal the write thread.
  • On the write thread, take a buffer from the queue and start writing it asynchronously.
  • Repeat until both threads finish their jobs.

You'll need to learn how to synchronize two threads, but it's fun and, in my specific case, we got a good performance boost.

EDIT: A new version for Yuriy to copy:

object locker = new object();
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{

    const int SOME_MAGICAL_NUMBER = 102400; // 100k?
    Queue<string> packets = new Queue<string>();
    StringBuilder buffer = new StringBuilder();
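    Thread writer = new Thread(new ThreadStart(() =>
    {
        string packet = null;
        while (true)
        {
            lock (locker)
            {
                // Monitor.Wait must be called while holding the lock. Waiting in a
                // loop also covers a pulse sent before this thread started waiting.
                while (packets.Count == 0) Monitor.Wait(locker);
                packet = packets.Dequeue();
            }
            if (packet == null) return; // null marks the end of the input
            result.Write(packet);
        }
    }));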
    writer.Start();

    while (!file1.EndOfStream || !file2.EndOfStream || !file3.EndOfStream)
    {
        buffer.Append(file1.ReadLine() ?? "");
        buffer.Append(file2.ReadLine() ?? "");
        buffer.AppendLine(file3.ReadLine() ?? "");

        if (buffer.Length > SOME_MAGICAL_NUMBER)
        {
            lock (locker)
            {
                packets.Enqueue(buffer.ToString());
                buffer.Length = 0;
                Monitor.PulseAll(locker);
            }
        }
    }

    lock (locker)
    {
        packets.Enqueue(buffer.ToString());
        packets.Enqueue(null); // done
        Monitor.PulseAll(locker);
    }
    writer.Join();
}
Rubens Farias
Suggest editing this to show the loop and in particular how to detect termination (i.e. ReadLine returns null).
itowlson
@itowlson, done
Rubens Farias
I kept the checks on; I would want it to throw an exception if one of the files is a different length, not just quit nicely. But that is up to the business rules. Considering speed is an issue, you might want to use String.Empty. And yes, I did copy and paste most of your code, thank you.
Yuriy Faktorovich
Actually I like your previous version better, but if you want to keep splitting hairs, `new ThreadStart` is redundant.
Yuriy Faktorovich
Thank you very much for taking the time to write this code and explain it. It's been very helpful.
baeltazor
@Yuriy, you can always change it in your copy =)
Rubens Farias
@Rubens Farias: Why waste time, I can just edit yours.
Yuriy Faktorovich
Great :-)......
Andres
+1  A: 

This looks pretty straightforward. Reading the files as binary data, instead of as text line by line, might speed the process up.
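A rough sketch of what reading raw bytes and splitting lines by hand could look like. `RawLineReader`, `ReadLines` and the 64 KB default buffer are hypothetical names and values chosen for illustration, and the code assumes UTF-8 (or ASCII) text with `\n` or `\r\n` line endings. Whether this actually beats `StreamReader.ReadLine` would need to be measured.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class RawLineReader
{
    // Hypothetical helper: yields lines by scanning raw bytes for '\n'.
    // Assumes UTF-8/ASCII input; a '\n' byte never occurs inside a
    // multi-byte UTF-8 sequence, so splitting on it is safe.
    public static IEnumerable<string> ReadLines(Stream input, int bufferSize = 64 * 1024)
    {
        byte[] buffer = new byte[bufferSize];
        MemoryStream line = new MemoryStream(); // bytes of the line being built
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == (byte)'\n')
                {
                    yield return Decode(line);
                    line.SetLength(0); // start the next line
                }
                else
                {
                    line.WriteByte(buffer[i]);
                }
            }
        }
        // A final line without a trailing newline.
        if (line.Length > 0) yield return Decode(line);
    }

    private static string Decode(MemoryStream line)
    {
        int length = (int)line.Length;
        byte[] bytes = line.GetBuffer();
        // Drop the '\r' of a "\r\n" line ending.
        if (length > 0 && bytes[length - 1] == (byte)'\r') length--;
        return Encoding.UTF8.GetString(bytes, 0, length);
    }
}
```

Usage would be along the lines of `foreach (string line in RawLineReader.ReadLines(File.OpenRead("file1.txt")))`.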

Oleg Zhylin
yes, but you'll have to deal with line breaks yourself
Rubens Farias
+4  A: 
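int i = 0;
// A single using statement cannot declare variables of different types,
// so the statements are stacked instead.
using (StreamWriter result = new StreamWriter("result.txt"))
using (StreamReader file1 = new StreamReader("file1.txt"))
using (StreamReader file2 = new StreamReader("file2.txt"))
using (StreamReader file3 = new StreamReader("file3.txt"))
{
    // Peek() returns -1 once file1 has no more characters to read.
    while (file1.Peek() != -1)
    {
        result.Write(file1.ReadLine());
        result.Write(file2.ReadLine());
        result.WriteLine(file3.ReadLine());
        // Flush periodically rather than on every line.
        if (i++ % 100 == 0) result.Flush();
    }
}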
Yuriy Faktorovich
@itowlson: Thank you.
Yuriy Faktorovich
+4  A: 

I think you can use the producer/consumer pattern here. One thread (the producer) reads a line from each of your 3 source files, concatenates the 3 lines and puts the result in an in-memory queue. Meanwhile, another thread (the consumer) continually reads from this queue and writes to your result.txt file.

1: producer thread
   Reads line n from file 1,2 and 3
   concatenates the contents of the 3 lines and push_back in the queue

2: consumer thread
   Check if the queue is empty. 
   If not, pop the first item in the queue and write to the result.txt
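
The two steps above could be sketched in C# roughly as follows, assuming .NET 4.5 or later for `BlockingCollection<string>` and `Task.Run`. `LineMerger`, `Merge`, the path parameters and the queue capacity of 1024 are hypothetical choices for illustration, not something taken from the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

static class LineMerger
{
    // Hypothetical helper: merges three files line by line into resultPath.
    public static void Merge(string path1, string path2, string path3, string resultPath)
    {
        // Bounded queue (capacity is an assumption): the producer blocks
        // when the consumer falls behind, which caps memory use.
        using (var queue = new BlockingCollection<string>(1024))
        {
            // Consumer: writes merged lines until the producer signals completion.
            Task consumer = Task.Run(() =>
            {
                using (var result = new StreamWriter(resultPath))
                {
                    foreach (string line in queue.GetConsumingEnumerable())
                        result.WriteLine(line);
                }
            });

            try
            {
                // Producer: reads line n from each file and queues the concatenation.
                using (var file1 = new StreamReader(path1))
                using (var file2 = new StreamReader(path2))
                using (var file3 = new StreamReader(path3))
                {
                    while (!file1.EndOfStream || !file2.EndOfStream || !file3.EndOfStream)
                    {
                        queue.Add((file1.ReadLine() ?? "")
                                + (file2.ReadLine() ?? "")
                                + (file3.ReadLine() ?? ""));
                    }
                }
            }
            finally
            {
                queue.CompleteAdding(); // lets the consumer's loop end
                consumer.Wait();        // wait until everything is written
            }
        }
    }
}
```

Error handling here is minimal. A production version would also need to deal with a consumer that fails early, since the bounded queue would otherwise leave the producer blocked.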
Andres
Can you read multiple files simultaneously?
Yuriy Faktorovich
In theory you can, with a different file descriptor per thread, but I don't see any advantage in doing that, since all the threads would be competing for the same I/O.
Andres
+1 lovely, just wrote that on my answer above
Rubens Farias
implemented below
Rubens Farias