views:

431

answers:

2

I have to transfer large files between computers on via unreliable connections using WCF.

Because I want to be able to resume the file and I don't want to be limited in my filesize by WCF, I am chunking the files into 1MB pieces. These "chunk" are transported as stream. Which works quite nice, so far.

My steps are:

  1. open filestream
  2. read chunk from file into byet[] and create memorystream
  3. transfer chunk
  4. back to 2. until the whole file is sent

My problem is in step 2. I assume that when I create a memory stream from a byte array, it will end up on the LOH and ultimately cause an outofmemory exception. I could not actually create this error, maybe I am wrong in my assumption.

Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the max allowed array size and/or the size of my chunk, but I hope there is another solution.

My actual question(s):

  • Will my current solution create objects on the LOH and will that cause me problem?
  • Is there a better way to solve this?

Btw.: On the receiving side I simple read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays involved.

Edit:

current solution:

for (int i = resumeChunk; i < chunks; i++)
{
 byte[] buffer = new byte[chunkSize];
 fileStream.Position = i * chunkSize;
 int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
 Array.Resize(ref buffer, actualLength);
 using (MemoryStream stream = new MemoryStream(buffer)) 
 {
  UploadFile(stream);
 }
}
+1  A: 

I'm not so sure about the first part of your question but as for a better way - have you considered BITS? It allows background downloading of files over http. You can provide it a http:// or file:// URI. It is resumable from the point that it was interrupted and downloads in chunks of bytes using the RANGE method in the http HEADER. It is used by Windows Update.You can subscribe to events that give information on progress and completion.

Peter Kelly
Thanks for your suggestion, but I don't want to install IIS on every machine.
Flo
No problem, just a thought. Just to point out you only need IIS on every machine if they are uploading. If the clients are only downloading using BITS then they do not require IIS.
Peter Kelly
+8  A: 

Hi, I hope this is okay. It's my first answer on StackOverflow.

Yes absolutely if your chunksize is over 85000 bytes then the array will get allocated on the large object heap. You will probably not run out of memory very quickly as you are allocating and deallocating contiguous areas of memory that are all the same size so when memory fills up the runtime can fit a new chunk into an old, reclaimed memory area.

I would be a little worried about the Array.Resize call as that will create another array (see http://msdn.microsoft.com/en-us/library/1ffy6686(VS.80).aspx). This is an unecessary step if actualLength==Chunksize as it will be for all but the last chunk. So I would as a minimum suggest:

if (actualLength != chunkSize) Array.Resize(ref buffer, actualLength);

This should remove a lot of allocations. If the actualSize is not the same as the chunkSize but is still > 85000 then the new array will also be allocated on the Large object heap potentially causing it to fragment and possibly causing apparent memory leaks. It would I believe still take a long time to actually run out of memory as the leak would be quite slow.

I think a better implementation would be to use some kind of Buffer Pool to provide the arrays. You could roll your own (it would be too complicated) but WCF does provide one for you. I have rewritten your code slightly to take advatage of that:

BufferManager bm = BufferManager.CreateBufferManager(chunkSize * 10, chunkSize);

for (int i = resumeChunk; i < chunks; i++)
{
    byte[] buffer = bm.TakeBuffer(chunkSize);
    try
    {
        fileStream.Position = i * chunkSize;
        int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
        if (actualLength == 0) break;
        //Array.Resize(ref buffer, actualLength);
        using (MemoryStream stream = new MemoryStream(buffer))
        {
            UploadFile(stream, actualLength);
        }
    }
    finally
    {
        bm.ReturnBuffer(buffer);
    }
}

this assumes that the implementation of UploadFile Can be rewritten to take an int for the no. of bytes to write.

I hope this helps

joe

Joe90
Thanks a lot!This is seriously a brillian answer. Thanks for pointing out the Array.Resize issue. I also never heard of the BufferManager, sounds like this will help me lot in other areas, too.This is a lot more than I expected so I thought about starting a small bounty and giving it to you but I have to wait 23h after starting a bounty... So you have to wait, too :)
Flo
Thanks for that. I'm really glad I can be of help. Let me know if there's anything else. Looking back at it, it might be worth pointing out that the optimum implementation would share a single instance of the BufferManager across the whole service. I don't know how practical that would be for you.
Joe90