I am writing a simple web service using .NET; one method is used to send a chunk of a file from the client to the server, and the server opens a temp file and appends the chunk. The files are quite large, around 80 MB. The network I/O seems fine, but the append write to the local file slows down progressively as the file gets larger.

The following is the code that slows down, running on the server, where aFile is a string and aData is a byte[] -

       using (StreamWriter lStream = new StreamWriter(aFile, true))
       {
            BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
            lWriter.Write(aData);
       }

Debugging this process I can see that exiting the using statement is slower and slower.

If I run this code in a simple standalone test application the writes take the same time every run, about 3 ms. Note the buffer (aData) is always the same size, about 0.5 MB.

I have tried all sorts of experiments with different writers and with system copies to append scratch files; all slow down when running under the web service.

Why is this happening? I suspect the web service is trying to cache access to local file system objects, how can I turn this off for specific files?

More information -

If I hard-code the path, the speed is fine, like so -

       using (StreamWriter lStream = new StreamWriter("c:\\test.dat", true))
       {
            BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
            lWriter.Write(aData);
       }

But then it is slow copying this scratch file to the final file destination later on -

       File.Copy("c:\\test.dat", aFile);

If I use any variable in the path it gets slow again, so for example -

       using (StreamWriter lStream = new StreamWriter("c:\\test" + someVariable, true))
       {
            BinaryWriter lWriter = new BinaryWriter(lStream.BaseStream);
            lWriter.Write(aData);
       }

It has been suggested that I should not use StreamWriter. Note that I tried many ways to open the file using FileStream, none of which made any difference when the code runs under the web service; I tried WriteThrough, etc.
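For example, one FileStream variant I tried looked roughly like this (a sketch, not the exact code; the 4096 buffer size is an arbitrary choice, and WriteThrough asks the OS to push each write to disk rather than caching it):

```csharp
using System;
using System.IO;

class WriteThroughDemo
{
    // Append a chunk with FileOptions.WriteThrough, bypassing the OS
    // write cache; both the stream and the writer are disposed.
    public static void AppendChunk(string aFile, byte[] aData)
    {
        using (FileStream lStream = new FileStream(aFile, FileMode.Append,
            FileAccess.Write, FileShare.None, 4096, FileOptions.WriteThrough))
        using (BinaryWriter lWriter = new BinaryWriter(lStream))
        {
            lWriter.Write(aData);
        }
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "writethrough_demo.dat");
        if (File.Exists(path)) File.Delete(path);
        AppendChunk(path, new byte[1024 * 512]); // 0.5 MB, as in the question
        Console.WriteLine(new FileInfo(path).Length);
        File.Delete(path);
    }
}
```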

It's the strangest thing; I even tried this -

Write the data to file a.dat
Spawn system "cmd" "copy /b b.dat + a.dat b.dat" 
Delete a.dat

This slows down the same way????

Makes me think the web server is running in some protected file I/O environment, catching all file operations in this process and its child processes. I could understand this if I were generating a file that might later be served to a client, but I am not: what I am doing is storing large binary blobs on disk, with an index/pointer to them stored in a database. If I comment out the write to the file, the whole process flies; no performance issues at all.

I started reading about web server caching strategies, which makes me wonder: is there a web.config setting to mark a folder as uncached? Or am I completely barking up the wrong tree?

A: 

A long shot: is it possible that you need to close some resources when you have finished?

djna
I don't think so; to me it looks like the web server process is reading the entire file into cache when the file is closed.
titanae
A: 

If the file is binary, then why are you using a StreamWriter, which is derived from TextWriter? Just use a FileStream.

Also, BinaryWriter implements IDisposable; you need to put it into a using block.
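A minimal sketch of that shape (aFile and aData as in the question):

```csharp
using System.IO;

class AppendFix
{
    // Suggested shape: a FileStream opened for append (no text-encoding
    // layer), with the BinaryWriter in its own using block so it is
    // disposed as well.
    public static void AppendChunk(string aFile, byte[] aData)
    {
        using (FileStream lStream = new FileStream(aFile, FileMode.Append))
        using (BinaryWriter lWriter = new BinaryWriter(lStream))
        {
            lWriter.Write(aData);
        }
    }
}
```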

John Saunders
Yes and no: because StreamWriter opens the file, it also closes it. I tried explicitly calling Close, and using FileStream to open the file; neither had any effect. When the StreamWriter gets disposed is when I see the slowdown: 10 ms for the first append, 20 ms for the next, and so on. Once the file gets beyond 10 MB it gets really slow, over 2 seconds.
titanae
A: 

Update... I replicated the basic code (no database, kept simple) and it seems to work fine, so I suspect there is another reason. I will sleep on it over the weekend...

Here is the replicated server code -

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    using System.Web.Services;
    using System.IO;

    namespace TestWS
    {
        /// <summary>
        /// Summary description for Service1
        /// </summary>
        [WebService(Namespace = "http://tempuri.org/")]
        [WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
        [System.ComponentModel.ToolboxItem(false)]
        // To allow this Web Service to be called from script, using ASP.NET AJAX, uncomment the following line.
        // [System.Web.Script.Services.ScriptService]
        public class Service1 : System.Web.Services.WebService
        {
            private string GetFileName()
            {
                if (File.Exists("index.dat"))
                {
                    using (StreamReader lReader = new StreamReader("index.dat"))
                    {
                        return lReader.ReadLine();
                    }
                }
                else
                {
                    using (StreamWriter lWriter = new StreamWriter("index.dat"))
                    {
                        string lFileName = Path.GetRandomFileName();
                        lWriter.Write(lFileName);
                        return lFileName;
                    }
                }
            }

            [WebMethod]
            public string WriteChunk(byte[] aData)
            {
                Directory.SetCurrentDirectory(Server.MapPath("Data"));
                DateTime lStart = DateTime.Now;
                using (FileStream lStream = new FileStream(GetFileName(), FileMode.Append))
                {
                    BinaryWriter lWriter = new BinaryWriter(lStream);
                    lWriter.Write(aData);
                }
                DateTime lEnd = DateTime.Now;
                return lEnd.Subtract(lStart).TotalMilliseconds.ToString();
            }
        }
    }

And the replicated client code -

    static void Main(string[] args)
    {
        Service1 s = new Service1();
        byte[] b = new byte[1024 * 512];
        for ( int i = 0 ; i < 160 ; i ++ )
        {
            Console.WriteLine(s.WriteChunk(b));
        }
    }
titanae
I think it might be Visual Studio: running the client in the debugger with the web service detached seems fast; running it the opposite way round (debugging the web service with the client detached) is slow! This sort of makes sense, Visual Studio trying to intercept everything... not convinced yet.
titanae
A: 

Based on your code, it appears you're using StreamWriter's default file handling, which means synchronous writes and an exclusive lock on the file.

Based on your comments, it seems the issue you really want to solve is the return time from the web service -- not necessarily the write time for the file. While the write time is the current gating factor as you've discovered, you might be able to get around your issue by going to an asynchronous-write mode.

Alternatively, I prefer completely decoupled asynchronous operations. In that scenario, the inbound byte[] of data would be saved to its own file (or some other structure), then appended to the master file by a secondary process. More complex to operate, but also less prone to failure.
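A minimal sketch of that decoupled scheme (the ChunkStore class, directory layout, and file naming here are all assumptions for illustration):

```csharp
using System;
using System.IO;
using System.Linq;

class ChunkStore
{
    // Each inbound chunk is written to its own small file and the web
    // method returns immediately; zero-padded sequence numbers let the
    // later merge order the chunks correctly.
    public static void SaveChunk(string dir, int seq, byte[] data)
    {
        File.WriteAllBytes(Path.Combine(dir, seq.ToString("D6") + ".chunk"), data);
    }

    // Run by a secondary process once all chunks have arrived: append
    // the chunk files to the master file in sequence order, then delete them.
    public static void Reassemble(string dir, string masterFile)
    {
        using (FileStream master = new FileStream(masterFile, FileMode.Append))
        {
            foreach (string chunk in Directory.GetFiles(dir, "*.chunk")
                                              .OrderBy(f => f, StringComparer.Ordinal))
            {
                byte[] data = File.ReadAllBytes(chunk);
                master.Write(data, 0, data.Length);
                File.Delete(chunk);
            }
        }
    }
}
```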

jro
I am starting to think it might be the GC at work, since while the file is uploading, multiple other clients are firing requests at the server that involve data extraction. I think I might need to use the using statement more liberally? I will find out next week.
titanae
A: 

I don't have enough points to vote up an answer, but jro has the right idea. We do something similar in our service; each chunk is saved to a single temp file, then as soon as all chunks are received they're reassembled into a single file.

I'm not certain of the underlying process for appending data to a file using StreamWriter, but I would assume it would have to at least read to the end of the current file before attempting to write whatever is in the buffer. So as the file gets larger, it would have to read more and more of the existing file before writing the next chunk.

Jacob Ewald
Possibly, but the scenario is: the client uploads data that must be stored in a file; once complete, the client asks the server to act on the data, which the server does by launching an external process to process the file. I could thread the write to the file, but eventually the client has to sync with the server: client completes the upload, server acknowledges receipt of the file, client instructs the server to process the file. There may be some delay in the latter part, since the client could upload multiple files.
titanae
A: 

Well, I found the root cause: "Microsoft Forefront Security". Group policy has it running real-time scanning, and I could see the process go to 30% CPU usage when I close the file. After killing this process, everything works at the same speed, outside and inside the web service!

Next task: find a way to add an exclusion to MFS!

titanae