views: 214
answers: 6

Hello,

I'm trying to implement a file copy method that can match the performance of a copy done with Windows Explorer.

For example, a copy (with Windows Explorer) from our NAS to my computer runs at above 100 MB/sec.

My current implementation does the same copy at about 55 MB/sec, which is already better than System.IO.File.Copy(), which runs at 29 MB/sec.

using System;
using System.IO;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        String src = @"";
        String dst = @"";

        // 1 MB double buffers; the FileStreams' own buffers are kept tiny (8 bytes)
        // because all buffering is done manually below.
        Int32 buffersize = 1024 * 1024;
        FileStream input = new FileStream(src, FileMode.Open, FileAccess.Read, FileShare.None, 8, FileOptions.Asynchronous | FileOptions.SequentialScan);
        FileStream output = new FileStream(dst, FileMode.CreateNew, FileAccess.Write, FileShare.None, 8, FileOptions.Asynchronous | FileOptions.SequentialScan);

        Int32 readsize;
        Byte[] readbuffer = new Byte[buffersize];
        IAsyncResult asyncread;
        Byte[] writebuffer = new Byte[buffersize];
        IAsyncResult asyncwrite;

        DateTime Start = DateTime.Now;

        // Pre-allocate the destination file to its final size.
        output.SetLength(input.Length);

        // Prime the pipeline: read the first block, then swap it into the write buffer.
        readsize = input.Read(readbuffer, 0, readbuffer.Length);
        readbuffer = Interlocked.Exchange(ref writebuffer, readbuffer);

        // Double buffering: write the previous block while reading the next one.
        while (readsize > 0)
        {
            asyncwrite = output.BeginWrite(writebuffer, 0, readsize, null, null);
            asyncread = input.BeginRead(readbuffer, 0, readbuffer.Length, null, null);

            output.EndWrite(asyncwrite);
            readsize = input.EndRead(asyncread);
            readbuffer = Interlocked.Exchange(ref writebuffer, readbuffer);
        }

        DateTime Stop = DateTime.Now;

        TimeSpan Duration = Stop - Start;
        Double speed = input.Length / Duration.TotalSeconds; // bytes/s

        Console.WriteLine("MY Speed : " + (speed / 1024 / 1024).ToString() + " MB/sec");

        input.Close();
        output.Close();
        File.Delete(dst);
    }
}

Any idea how to improve the performance?

EDIT :

The file is read from a Linux-based NAS with a 10 Gigabit Ethernet interface and a 60-drive SAN behind it (don't worry about its performance, it works very well), and written to a local RAID 0 array which can write data at about 140 MB/sec.

The bottleneck is the destination's gigabit network interface, which I'm unable to saturate with my current code.

Also, removing the write does not make the read any faster, so I can't get past this 55 MB/sec read limit.

EDIT 2 :

The speed issue is related to the source file being stored on a network share. Reading from my local drive with the same code gives me 112 MB/sec.

EDIT 3 :

Samba doesn't seem to be the issue. I replaced the CIFS share (Samba) with an NFS share on my Linux NAS and got worse results than with Samba on my Windows 7 client.

With NFS, my copy method and Windows Explorer had the same performance, around 42 MB/sec.

I'm out of ideas...

EDIT 4 :

Just to be sure Windows was the issue, I installed Debian Lenny, mounted my NAS through NFS, and got 79 MB/sec with the same code under Mono.

+3  A: 

Try changing the buffer size to equal the sector size on the hard disk - likely 4 KB. Also use the System.Diagnostics.Stopwatch class for timing.

I also wouldn't bother using the async methods in a tight loop - they incur some overhead going away and allocating a thread from the pool to do the work.

Also, make use of the using statement to manage the disposal of your streams. Note, however, that this will skew your timing, as you currently dispose the objects after stopping the timer.
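
A minimal sketch of the last two suggestions combined - Stopwatch timing plus using blocks (src and dst are the paths from the question; the copy loop itself is elided):

    using System;
    using System.Diagnostics;
    using System.IO;

    // Sketch only: Stopwatch has far better resolution than DateTime.Now,
    // and the using blocks dispose the streams even if the copy throws.
    var timer = Stopwatch.StartNew();
    using (var input = new FileStream(src, FileMode.Open, FileAccess.Read))
    using (var output = new FileStream(dst, FileMode.CreateNew, FileAccess.Write))
    {
        // ... copy loop as in the question ...
    }
    timer.Stop();
    Console.WriteLine("{0:F1} MB/sec",
        new FileInfo(src).Length / (1024.0 * 1024.0) / timer.Elapsed.TotalSeconds);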

Adam
Small multiples of sector sizes also work well... Testing anything from 1x to 16x might show a significant gain in one of them.
codekaizen
Doing only a synchronous read didn't provide any better performance.
Altar
I'm not sure reading from the stream and writing to another stream asynchronously at the same time is any faster either - you'd be lumbering the I/O with two tasks at the same time. For smaller files, have you tried reading the lot into RAM and then writing? Also, you can write in larger chunks or all at once, so buffering isn't strictly required.
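
For small files, that whole-file approach is a two-liner (a sketch; it holds the entire file in memory, so it's only sensible when the file comfortably fits in RAM):

    // Sketch: whole-file copy via System.IO.File (only for files that fit in RAM).
    byte[] data = File.ReadAllBytes(src);
    File.WriteAllBytes(dst, data);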
Adam
+1  A: 

There are the usual suspects for increasing speed over a network:

  • Have multiple download threads (a rough sketch follows below)
  • Pre-allocate the block on the disk where the file will reside

Other than that, you're at the mercy of your hardware limitations.
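
A rough sketch of the multi-threaded idea, reading two halves of the file on separate threads (the two-way split and the ReadRange helper are illustrative assumptions, not a tested implementation; a real copy would also write each chunk to the matching offset in the destination):

    using System;
    using System.IO;
    using System.Threading;

    class ParallelRead
    {
        // Reads `count` bytes starting at `offset` from the file at `path`.
        static void ReadRange(string path, long offset, long count)
        {
            using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                fs.Seek(offset, SeekOrigin.Begin);
                var buf = new byte[256 * 1024];
                while (count > 0)
                {
                    int read = fs.Read(buf, 0, (int)Math.Min(buf.Length, count));
                    if (read == 0) break;
                    count -= read;
                    // a real copy would write `buf[0..read]` at the same offset in the destination
                }
            }
        }

        static void Main()
        {
            string src = @"";  // source path as in the question
            long length = new FileInfo(src).Length;
            var t1 = new Thread(() => ReadRange(src, 0, length / 2));
            var t2 = new Thread(() => ReadRange(src, length / 2, length - length / 2));
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();
        }
    }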

codekaizen
Multiple download threads would only work if the bottleneck were not the I/O, which it likely is.
Adam
@Adam - it could be the network, in which case multiple threads could increase perf slightly. As with anything perf-related, though, it's about testing and measuring the various techniques.
codekaizen
+3  A: 

Did you try smaller buffer sizes? A buffer size of 1 MB is awfully large, and buffer sizes of 4-64 KB normally give the best performance.

Also, this may be related to your question: http://stackoverflow.com/questions/955911/how-to-write-super-fast-file-streaming-code-in-c

And maybe you can improve performance using memory mapped files: http://weblogs.asp.net/gunnarpeipman/archive/2009/06/21/net-framework-4-0-using-memory-mapped-files.aspx
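
A minimal sketch of the memory-mapped approach (assumes .NET 4.0, where MemoryMappedFile and Stream.CopyTo were introduced; src and dst as in the question):

    using System.IO;
    using System.IO.MemoryMappedFiles;

    // Map the source file and stream the whole view into the destination.
    using (var mmf = MemoryMappedFile.CreateFromFile(src, FileMode.Open))
    using (var view = mmf.CreateViewStream())
    using (var output = new FileStream(dst, FileMode.CreateNew, FileAccess.Write))
    {
        view.CopyTo(output);
    }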

inflagranti
Using a 64 KB buffer size makes the copy go slower, at about 46 MB/sec.
Altar
So does upping it to 2 MB, 4 MB, or more increase performance further? Does performance increase from 64 KB to 128 KB? Maybe there is a sweet spot between 64 KB and 1 MB where performance is optimal?
inflagranti
After 256 KB there doesn't seem to be any increase in performance; it's always around 60 MB/sec.
Altar
+1  A: 

File.Copy() simply calls the CopyFile() API; you could try p/invoking SHFileOperation(), which is what the shell uses - it often seems faster.
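
A hedged sketch of that p/invoke (the FOF_* flags chosen here are illustrative; note that paths must be double-null-terminated, and on 32-bit Windows the native SHFILEOPSTRUCT is byte-packed, so a production declaration needs Pack = 1 there):

    using System;
    using System.Runtime.InteropServices;

    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct SHFILEOPSTRUCT
    {
        public IntPtr hwnd;
        public uint wFunc;
        public string pFrom;
        public string pTo;
        public ushort fFlags;
        public bool fAnyOperationsAborted;
        public IntPtr hNameMappings;
        public string lpszProgressTitle;
    }

    [DllImport("shell32.dll", CharSet = CharSet.Unicode)]
    static extern int SHFileOperation(ref SHFILEOPSTRUCT lpFileOp);

    const uint FO_COPY = 0x0002;
    const ushort FOF_NOCONFIRMATION = 0x0010;
    const ushort FOF_SILENT = 0x0004;

    static void ShellCopy(string src, string dst)
    {
        var op = new SHFILEOPSTRUCT
        {
            wFunc = FO_COPY,
            pFrom = src + "\0\0",   // double-null-terminated file list
            pTo = dst + "\0\0",
            fFlags = (ushort)(FOF_NOCONFIRMATION | FOF_SILENT)
        };
        if (SHFileOperation(ref op) != 0)
            throw new InvalidOperationException("SHFileOperation failed");
    }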

Alex K.
This could be an option if I were only copying a file, but I must also be able to write the output to multiple destinations.
Altar
+1  A: 

For a deeper understanding of the design options and the tradeoffs involved in file copying, with and without network shares, I'd suggest you take a look at Mark Russinovich's blog post from a couple of years ago. There are plenty more wrinkles involved than just the hard disk sector size, for example:

  • The packet size of the SMB protocol
  • How much memory you are willing to use for caching (which may slow down other processes)
  • Whether you can sacrifice reliability for speed
  • Whether you want to increase the perceived speed or the actual time-until-finished
  • Where and how much you want to cache at several possible levels
  • Whether you're interested in providing reliable feedback and time estimates
Pontus Gagge
A: 

Probably the only faster option is unbuffered I/O: the CreateFile, ReadFile, and WriteFile functions with the FILE_FLAG_NO_BUFFERING flag and a 2-6 MB buffer.

Note that this way you have to align the buffer size with the file system's sector size, etc.

It can be significantly faster - especially on Windows XP.

BTW, I have achieved ~400 MB/sec bandwidth on a striped RAID 0 array this way (using a 4 MB buffer).
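
A minimal p/invoke sketch of opening the source with FILE_FLAG_NO_BUFFERING and wrapping the handle in a FileStream (an assumption-laden sketch: reads must be issued in sector-size multiples, and a robust version would also sector-align the buffer's memory address, e.g. with a native allocation, rather than relying on a managed array):

    using System;
    using System.ComponentModel;
    using System.IO;
    using System.Runtime.InteropServices;
    using Microsoft.Win32.SafeHandles;

    class UnbufferedRead
    {
        [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
        static extern SafeFileHandle CreateFile(string lpFileName, uint dwDesiredAccess,
            uint dwShareMode, IntPtr lpSecurityAttributes, uint dwCreationDisposition,
            uint dwFlagsAndAttributes, IntPtr hTemplateFile);

        const uint GENERIC_READ = 0x80000000;
        const uint OPEN_EXISTING = 3;
        const uint FILE_FLAG_NO_BUFFERING = 0x20000000;
        const uint FILE_FLAG_SEQUENTIAL_SCAN = 0x08000000;

        static void Main()
        {
            string src = @"";  // source path as in the question
            SafeFileHandle handle = CreateFile(src, GENERIC_READ, 0, IntPtr.Zero,
                OPEN_EXISTING, FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN, IntPtr.Zero);
            if (handle.IsInvalid)
                throw new Win32Exception(Marshal.GetLastWin32Error());

            using (var input = new FileStream(handle, FileAccess.Read, 4 * 1024 * 1024))
            {
                // Each read must request a multiple of the sector size (commonly 4096).
                var buffer = new byte[4 * 1024 * 1024];
                while (input.Read(buffer, 0, buffer.Length) > 0) { /* consume the data */ }
            }
        }
    }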

Jaroslav Jandek
How would unbuffered I/O help with NAS storage? Surely network and hard drive access costs would dominate. Eliminating buffering could very well slow down performance instead, when multiple bottlenecks are present.
Pontus Gagge
I hadn't noticed the NAS part. You only eliminate the system's automatic buffering; you still buffer the data, but you choose the buffer size. For slow disks it does not matter much, but for fast RAID arrays and fast networks, unbuffered I/O performance is very noticeable - not to mention it is easy on the CPU. I do not use it if I do not have to, though; there's too much WinAPI handling. Anyway, you can use unbuffered I/O with a NAS.
Jaroslav Jandek