I have a program that opens a large binary file, appends a small amount of data to it, and closes the file.

FileStream fs = File.Open( "\\\\s1\\temp\\test.tmp", FileMode.Append, FileAccess.Write, FileShare.None );
fs.Write( data, 0, data.Length );
fs.Close();

If test.tmp is 5MB before this program is run and the data array is 100 bytes, this program will cause over 5MB of data to be transmitted across the network. I would have expected that the data already in the file would not be transmitted across the network since I'm not reading it or writing it. Is there any way to avoid this behavior? This makes it agonizingly slow to append to very large files.

A: 

You could cache your data in a local buffer and append to the large file periodically (much less often than now). That would save a lot of network transfers, but it would also increase the risk of losing that cache (and your data) if your app crashes.
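
A minimal sketch of that idea, assuming an arbitrary 64 KB flush threshold and the path from the question (remember to call Flush() on shutdown so the tail of the buffer isn't lost):

using System.IO;

class BufferedAppender
{
    private readonly MemoryStream buffer = new MemoryStream();
    private const int FlushThreshold = 64 * 1024;   // arbitrary; tune to taste
    private const string FilePath = "\\\\s1\\temp\\test.tmp";

    public void Append(byte[] data)
    {
        buffer.Write(data, 0, data.Length);
        if (buffer.Length >= FlushThreshold)
            Flush();
    }

    public void Flush()
    {
        if (buffer.Length == 0) return;
        using (FileStream fs = File.Open(FilePath, FileMode.Append, FileAccess.Write, FileShare.None))
        {
            buffer.WriteTo(fs);   // one larger write instead of many tiny ones
        }
        buffer.SetLength(0);
    }
}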

Logging (if that's what it is) of this type is often stored in a db. Using a decent RDBMS would allow you to post that 100 bytes of data very frequently with minimal overhead. The caveat there is the maintenance of an RDBMS.
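
A minimal sketch of that route, assuming SQL Server and a hypothetical LogEntries table (Payload varbinary, LoggedAt datetime):

using System.Data.SqlClient;

class LogWriter
{
    public static void LogToDatabase(string connectionString, byte[] data)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand(
            "INSERT INTO LogEntries (Payload, LoggedAt) VALUES (@payload, GETUTCDATE())", conn))
        {
            cmd.Parameters.AddWithValue("@payload", data);   // the 100-byte record
            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}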

Paul Sasik
A: 

I did some googling, looking more at how to read excessively large files quickly, and found this link: http://www.4guysfromrolla.com/webtech/010401-1.shtml

The most interesting part there is the section about byte reading: Besides the more commonly used ReadAll and ReadLine methods, the TextStream object also supports a Read(n) method, where n is the number of bytes in the file/textstream in question. By instantiating an additional object (a file object), we can obtain the size of the file to be read, and then use the Read(n) method to race through our file. As it turns out, the "read bytes" method is extremely fast by comparison:

const ForReading = 1
const TristateFalse = 0
dim strSearchThis
dim objFS
dim objFile
dim objTS
set objFS = Server.CreateObject("Scripting.FileSystemObject")
set objFile = objFS.GetFile(Server.MapPath("myfile.txt"))
set objTS = objFile.OpenAsTextStream(ForReading, TristateFalse)

strSearchThis = objTS.Read(objFile.Size)

if instr(strSearchThis, "keyword") > 0 then
    Response.Write "Found it!"
end if

You could then use this method to go to the end of the file and append to it manually, instead of loading the entire file in append mode with a FileStream.

Gustav Syrén
-1: Did you notice this was an article from 2001, for VBScript, not for .NET?
John Saunders
A: 

If you have system access, or perhaps a friendly admin, on the machine actually hosting the file, you could make a small listener program that sits on the other end.

You make a call to it, passing just the data to be written, and it does the write locally, avoiding the extra network traffic.
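
A rough sketch of such a listener, with a hypothetical port and local path; the client side would just open a TcpClient to the host and send the bytes it wants appended:

using System.IO;
using System.Net;
using System.Net.Sockets;

class AppendListener
{
    static void Main()
    {
        TcpListener listener = new TcpListener(IPAddress.Any, 9000);   // hypothetical port
        listener.Start();
        while (true)
        {
            using (TcpClient client = listener.AcceptTcpClient())
            using (NetworkStream ns = client.GetStream())
            using (FileStream fs = File.Open(@"D:\temp\test.tmp", FileMode.Append,
                                             FileAccess.Write, FileShare.None))
            {
                // Append whatever the client sends, locally on the hosting machine.
                byte[] buf = new byte[4096];
                int read;
                while ((read = ns.Read(buf, 0, buf.Length)) > 0)
                    fs.Write(buf, 0, read);
            }
        }
    }
}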

ManiacZX
A: 

The File class in .NET has quite a few static methods to handle this type of thing. I would suggest trying:

File file = File.AppendAllText("FilePath", "What to append", Encoding.UTF8);

When you decompile this method with Reflector, it turns out that it's using:

using (StreamWriter writer = new StreamWriter(path, true, encoding))
{
    writer.Write(contents);
}

This StreamWriter method should allow you to simply append something to the end (at least this is the method I've seen used in every instance of logging that I've encountered so far).

highphilosopher
Have you tested the performance of this?
John Saunders
AppendAllText returns void, not File :) http://msdn.microsoft.com/en-us/library/system.io.file.appendalltext.aspx
James Manning
+1  A: 

I found this on MSDN (CreateFile is called internally):

When an application creates a file across a network, it is better to use GENERIC_READ | GENERIC_WRITE for dwDesiredAccess than to use GENERIC_WRITE alone. The resulting code is faster, because the redirector can use the cache manager and send fewer SMBs with more data. This combination also avoids an issue where writing to a file across a network can occasionally return ERROR_ACCESS_DENIED.

Using Reflector, you can see that FileAccess maps to dwDesiredAccess, so this would seem to suggest using FileAccess.ReadWrite instead of just FileAccess.Write.

I have no idea if this will help :)
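
For anyone who wants to try it, a sketch of the change: FileMode.Append only allows FileAccess.Write, so this opens the file with FileMode.OpenOrCreate, asks for ReadWrite access, and seeks to the end before writing (same path and data as in the question):

FileStream fs = File.Open( "\\\\s1\\temp\\test.tmp", FileMode.OpenOrCreate,
                           FileAccess.ReadWrite, FileShare.None );
fs.Seek( 0, SeekOrigin.End );   // append manually, since Append mode rejects ReadWrite
fs.Write( data, 0, data.Length );
fs.Close();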

Porges
A: 

Write the data to separate files, then join them (do it on the hosting machine if possible) only when necessary.
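
A sketch of that approach with hypothetical paths: each append becomes its own small file, and a join step (ideally run on the hosting machine) folds the chunks onto the big file in order:

using System;
using System.IO;

class ChunkedAppend
{
    // Each append writes a tiny new file instead of touching the big one.
    public static void AppendAsChunk(string chunkDir, byte[] data)
    {
        string name = DateTime.UtcNow.Ticks.ToString("D19") + "-" + Guid.NewGuid().ToString("N") + ".chunk";
        File.WriteAllBytes(Path.Combine(chunkDir, name), data);
    }

    // Run on the hosting machine: concatenate the chunks onto the big file, oldest first.
    public static void JoinChunks(string chunkDir, string targetPath)
    {
        string[] chunks = Directory.GetFiles(chunkDir, "*.chunk");
        Array.Sort(chunks);   // tick-prefixed names sort chronologically
        using (FileStream fs = File.Open(targetPath, FileMode.Append, FileAccess.Write, FileShare.None))
        {
            foreach (string chunkPath in chunks)
            {
                byte[] chunk = File.ReadAllBytes(chunkPath);
                fs.Write(chunk, 0, chunk.Length);
            }
        }
        foreach (string chunkPath in chunks)
            File.Delete(chunkPath);
    }
}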

rwong
I say this because the "long answer" is really long (http://en.wikipedia.org/wiki/Server_Message_Block) and there is no way around it. It's a limitation of the Windows file sharing system (SMB).
rwong
+1  A: 

0xA3 provided the answer in a comment above. The poor performance was due to an on-access virus scan. Each time my program opened the file, the virus scanner read the entire contents of the file to check for viruses, even though my program didn't read any of the existing content. Disabling the on-access virus scan eliminated the excessive network I/O and the poor performance.

Thanks to everyone for your suggestions.

BradVoy