I want to be able to read and write a large file in parallel, or if not in parallel, at least in blocks so that I don't use up so much memory.

This is my current code:

        // Define memory stream which will be used to hold encrypted data.
        MemoryStream memoryStream = new MemoryStream();

        // Define cryptographic stream (always use Write mode for encryption).
        CryptoStream cryptoStream = new CryptoStream(memoryStream,
                                                     encryptor,
                                                     CryptoStreamMode.Write);

        //start encrypting
        using (BinaryReader reader = new BinaryReader(File.Open(fileIn, FileMode.Open))) {
            byte[] buffer = new byte[1024 * 1024];
            int read = 0;

            do {
                read = reader.Read(buffer, 0, buffer.Length);
                cryptoStream.Write(buffer, 0, read);
            } while (read == buffer.Length);

        }
        // Finish encrypting.
        cryptoStream.FlushFinalBlock();

        // Convert our encrypted data from a memory stream into a byte array.
        //byte[] cipherTextBytes = memoryStream.ToArray();

        //write our memory stream to a file
        memoryStream.Position = 0;
        using (BinaryWriter writer = new BinaryWriter(File.Open(fileOut, FileMode.Create))) {
            byte[] buffer = new byte[1024 * 1024];
            int read = 0;

            do {
                read = memoryStream.Read(buffer, 0, buffer.Length);
                writer.Write(buffer, 0, read);
            } while (read == buffer.Length);
        }


        // Close both streams.
        memoryStream.Close();
        cryptoStream.Close();

As you can see, it reads the entire file into memory, encrypts it, and then writes it out. If I happen to be encrypting very large files (2 GB+), it tends not to work, or at the very least consumes ~97% of my memory.

How could I do it in a more effective manner?

+1  A: 

Instead of hooking up the CryptoStream to a MemoryStream, have it write to the output FileStream. You shouldn't need a MemoryStream at all.

Update: It is more efficient to process files sequentially rather than in parallel, so I don't recommend a parallel read/write setup; just get rid of the MemoryStream.
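
A minimal sketch of that approach, using only the `encryptor`, `fileIn`, and `fileOut` from your code; the CryptoStream wraps the output FileStream directly, so only a single 1 MB buffer is ever held in memory:

        using (FileStream inStream = File.OpenRead(fileIn))
        using (FileStream outStream = File.Create(fileOut))
        using (CryptoStream cryptoStream = new CryptoStream(outStream, encryptor, CryptoStreamMode.Write))
        {
            byte[] buffer = new byte[1024 * 1024];
            int read;

            // Copy the input through the CryptoStream in 1 MB blocks.
            while ((read = inStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                cryptoStream.Write(buffer, 0, read);
            }

            // Disposing the CryptoStream flushes the final padded block to the file.
        }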

Stephen Cleary
+1  A: 

The simple, obvious solution is to have the CryptoStream write to a temporary file, then rename the temp file over the original when you're done. This gets rid of your memory problem and trades it for a transient disk-space problem :), but that's something you can probably work around more easily.
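
A sketch of that temp-file variant, assuming the same `encryptor` and that you want to replace the original file in place; `Stream.CopyTo` needs .NET 4, otherwise use a manual read/write loop like the one above:

        string tempFile = Path.GetTempFileName();

        using (FileStream inStream = File.OpenRead(fileIn))
        using (FileStream tempStream = File.Create(tempFile))
        using (CryptoStream cryptoStream = new CryptoStream(tempStream, encryptor, CryptoStreamMode.Write))
        {
            // Stream the plaintext into the CryptoStream in 1 MB blocks.
            inStream.CopyTo(cryptoStream, 1024 * 1024);
        }

        // Swap the encrypted temp file in place of the original.
        File.Delete(fileIn);
        File.Move(tempFile, fileIn);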

JSBangs
A: 

Although it requires some tricky orchestration, you can create two separate FileStream operations that run in parallel: one reading and one writing. Another alternative is to create a memory-mapped file and do the same. Each stream can be optimized for its particular needs (e.g. the reader could seek, while the writer could be forward-only).
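
One way to sketch that orchestration, assuming .NET 4's BlockingCollection (System.Collections.Concurrent) and Task (System.Threading.Tasks) are available: a reader task fills a bounded queue of blocks while the current thread encrypts and writes them.

        using (FileStream inStream = File.OpenRead(fileIn))
        using (FileStream outStream = File.Create(fileOut))
        using (CryptoStream cryptoStream = new CryptoStream(outStream, encryptor, CryptoStreamMode.Write))
        {
            // Bounded queue: the reader can run at most a few blocks ahead of the
            // writer, so memory use stays capped no matter how large the file is.
            BlockingCollection<byte[]> blocks = new BlockingCollection<byte[]>(4);

            Task readerTask = Task.Factory.StartNew(() =>
            {
                try
                {
                    byte[] buffer = new byte[1024 * 1024];
                    int read;
                    while ((read = inStream.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        byte[] block = new byte[read];
                        Array.Copy(buffer, block, read);
                        blocks.Add(block);
                    }
                }
                finally
                {
                    // Always signal completion so the consumer loop can exit.
                    blocks.CompleteAdding();
                }
            });

            // Consumer: encrypt and write each block as it arrives.
            foreach (byte[] block in blocks.GetConsumingEnumerable())
            {
                cryptoStream.Write(block, 0, block.Length);
            }

            readerTask.Wait();
        }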

JoeGeeky