I have a huge text file (larger than 4 GB) and I want to replace some text in it programmatically. I know the line number of the text to replace, but I do not want to copy all of the text (along with my replaced line) to a second file; the change has to be made within the source file. Is there a way to do this in C#?

The replacement text is exactly the same size as the text it replaces (if this helps).

+8  A: 

Since the file is so large you may want to take a look at the .NET 4.0 support for memory-mapped files. Basically you'll need to move the file/stream pointer to the location in the file, overwrite that location, then flush the file to disk. You won't need to load the entire file into memory.

For example, without using memory-mapped files, the following will overwrite part of an ASCII file. The arguments are the input file, the zero-based byte offset at which to start writing, and the new text.

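    // Requires: using System.IO; using System.Text;
    static void Main(string[] args)
    {
        string inputFilename = args[0];
        // Use long, not int: offsets in a file larger than 2 GB overflow an int.
        long startIndex = long.Parse(args[1]);
        string newText = args[2];

        // FileMode.Open keeps the existing contents; only the written bytes change.
        using (FileStream fs = new FileStream(inputFilename, FileMode.Open, FileAccess.Write))
        {
            fs.Position = startIndex;
            byte[] newTextBytes = Encoding.ASCII.GetBytes(newText);
            fs.Write(newTextBytes, 0, newTextBytes.Length);
        }
    }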
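
For comparison, the memory-mapped route mentioned above might look roughly like this on .NET 4.0 or later. This is only a sketch: the `MappedOverwrite` class and `OverwriteAscii` method are hypothetical names, and it assumes ASCII text and a byte offset that is already known.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

static class MappedOverwrite
{
    // Hypothetical helper: overwrites bytes in place, starting at byteOffset.
    // Only the small view is mapped, not the whole file.
    public static void OverwriteAscii(string path, long byteOffset, string newText)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(newText);
        long fileLength = new FileInfo(path).Length;
        if (byteOffset < 0 || byteOffset + bytes.Length > fileLength)
            throw new ArgumentOutOfRangeException("byteOffset", "The write would run past the end of the file.");

        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (MemoryMappedViewAccessor view = mmf.CreateViewAccessor(byteOffset, bytes.Length))
        {
            view.WriteArray(0, bytes, 0, bytes.Length);
            view.Flush(); // push the changed bytes to disk
        }
    }
}
```

A call such as `MappedOverwrite.OverwriteAscii(path, 6, "Earth")` would replace five bytes starting at offset 6. For a single small write like this, the plain `FileStream` version is probably simpler; mapping tends to pay off when many scattered locations are touched.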
Arnshea
I am using .NET 3.5. So, I am afraid, this might not be an option. Thanks for your suggestion.
Aamir
@Aamir: seek/write/flush can be done with ordinary file I/O; memory mapping isn't required. But if you do want to use memory mapping in 3.5, you don't need any special object library: you can call the native Win32 API (CreateFileMapping/MapViewOfFile/etc.) directly, via System.Runtime.InteropServices and DllImport.
joe snyder
This CodeProject article performs memory mapping in .NET 3.5. He's using arrays, but I bet it won't be hard to translate to files. http://www.codeproject.com/KB/recipes/MemoryMappedGenericArray.aspx
P.Brian.Mackey
@P.Brian, my project has moved to CodePlex at http://mmf.codeplex.com/. Take a look at the modified version of Winterdom.IO.Filemap in the source code. Or look at the original memory-mapped code at http://github.com/tomasr/filemap/.
Mikael Svenson
+3  A: 

Unless the new text is exactly the same size as the old text, you will have to re-write the file. There is no way around it. You can at least do this without keeping the entire file in memory.
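
For the different-size case, a streaming rewrite could be sketched as below. The `StreamingRewrite` class, the `RewriteWithLine` method, and the `.tmp` suffix are hypothetical, and the sketch assumes ASCII text. It holds only one line in memory at a time, but it does need a temporary file and it normalizes line endings to `Environment.NewLine`.

```csharp
using System;
using System.IO;
using System.Text;

static class StreamingRewrite
{
    // Hypothetical helper: copies the file line by line to a temporary file,
    // substituting the given 1-based line, then swaps the copy into place.
    public static void RewriteWithLine(string path, long lineNumber, string newLine)
    {
        string tempPath = path + ".tmp"; // assumed temp location next to the source
        long current = 0;
        using (var reader = new StreamReader(path, Encoding.ASCII))
        using (var writer = new StreamWriter(tempPath, false, Encoding.ASCII))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                current++;
                writer.WriteLine(current == lineNumber ? newLine : line);
            }
        }

        if (current < lineNumber)
        {
            File.Delete(tempPath);
            throw new ArgumentOutOfRangeException("lineNumber", "The file has fewer lines than requested.");
        }

        // Replace the original with the rewritten copy (no backup file kept).
        File.Replace(tempPath, path, null);
    }
}
```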

Joel Coehoorn
Joel, the new text IS exactly the same size. What is the way to do this in such a case?
Aamir
Open a file stream, seek to the point where the text starts, and write the new text to the stream.
Joel Coehoorn
A: 

I'm guessing you'll want to use the FileStream class, seek to your position, and write your updated data.
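
When only the line number is known, the position has to be found first by counting line feeds. A rough sketch of that, assuming ASCII text and `\n` or `\r\n` line endings (the `LineOverwrite` class and its method names are made up for illustration):

```csharp
using System;
using System.IO;
using System.Text;

static class LineOverwrite
{
    // Returns the byte offset where the given 1-based line starts,
    // or -1 if the stream contains fewer line feeds than needed.
    public static long FindLineStart(Stream stream, long lineNumber)
    {
        if (lineNumber < 1) throw new ArgumentOutOfRangeException("lineNumber");
        stream.Position = 0;
        if (lineNumber == 1) return 0;

        byte[] buffer = new byte[64 * 1024];
        long currentLine = 1;
        long offset = 0; // offset of buffer[0] within the stream
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (buffer[i] == (byte)'\n' && ++currentLine == lineNumber)
                    return offset + i + 1;
            }
            offset += read;
        }
        return -1;
    }

    // Overwrites the line in place; refuses to write if the lengths differ,
    // because that would corrupt the lines that follow.
    public static void ReplaceLine(string path, long lineNumber, string newLine)
    {
        byte[] newBytes = Encoding.ASCII.GetBytes(newLine);
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            long start = FindLineStart(fs, lineNumber);
            if (start < 0)
                throw new ArgumentOutOfRangeException("lineNumber", "The file has fewer lines than requested.");

            // Measure the existing line, excluding its "\n" or "\r\n" terminator.
            fs.Position = start;
            long oldLength = 0;
            int b, previous = -1;
            while ((b = fs.ReadByte()) != -1 && b != '\n') { oldLength++; previous = b; }
            if (b == '\n' && previous == '\r') oldLength--;

            if (oldLength != newBytes.Length)
                throw new InvalidOperationException("The replacement must be exactly as long as the existing line.");

            fs.Position = start;
            fs.Write(newBytes, 0, newBytes.Length);
        }
    }
}
```

For example, `LineOverwrite.ReplaceLine(path, 1231, newText)` would scan up to line 1231 and overwrite it. The scan is still a sequential read of everything before the target line, but nothing is copied and memory use stays at one buffer.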

Nate Bross
+1  A: 

Hello, I tested the following and it works well. It caters to variable-length lines separated by Environment.NewLine. If you have fixed-length lines, you can seek straight to the target line. For converting bytes to a string and vice versa, you can use Encoding.

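    // Requires: using System; using System.Collections.Generic; using System.IO;
    // Reads one line including its terminator; returns null at end of file.
    // Assumes a two-character Environment.NewLine ("\r\n", as on Windows).
    static byte[] ReadNextLine(FileStream fs)
    {
        byte[] nl = new byte[] { (byte)Environment.NewLine[0], (byte)Environment.NewLine[1] };
        List<byte> ll = new List<byte>();
        int prev = -1;
        while (true)
        {
            // Keep the result as an int: casting to byte first would turn the
            // end-of-file marker (-1) into 255 and the loop would never end.
            int b = fs.ReadByte();
            if (b == -1) break;
            ll.Add((byte)b);
            if (prev == nl[0] && b == nl[1]) break; // full terminator seen
            prev = b;
        }
        return ll.Count == 0 ? null : ll.ToArray();
    }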
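    // Requires: using System; using System.IO; using System.Linq;
    static void Main(string[] args)
    {
        using (FileStream fs = new FileStream(@"c:\70-528\junk.txt", FileMode.Open, FileAccess.ReadWrite))
        {
            int replaceLine = 1231;
            byte[] b = null;
            int lineCount = 1;
            // Skip the lines before the target line.
            while (lineCount < replaceLine && (b = ReadNextLine(fs)) != null) lineCount++;

            long seekPos = fs.Position; // start of the target line
            b = ReadNextLine(fs);
            if (b == null) throw new InvalidOperationException("The file has fewer lines than expected.");

            // Each byte is treated as one character, so single-byte text round-trips unchanged.
            string line = new string(b.Select(x => (char)x).ToArray());
            // The replacement must be the same length, or the following lines get corrupted.
            line = line.Replace("Text1", "Text2");
            b = line.ToCharArray().Select(x => (byte)x).ToArray();
            fs.Seek(seekPos, SeekOrigin.Begin);
            fs.Write(b, 0, b.Length);
        }
    }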
josephj1989