ansaurus

Question

Performance of System.IO.ReadAllxxx / WriteAllxxx methods

Answer 1

+3 A:

The File.ReadAllText and similar methods use StreamReader/Writers internally, so performance should be comparable to whatever you do yourself.

I'd say go with the File.XXX methods whenever possible. They make your code a) easier to read and b) less likely to contain bugs (compared with any implementation you write yourself).

Fredrik Kalseth 2008-10-03 11:52:14

Thank you very much for your answer. I was thinking along the same lines, but got confused when I saw the MSDN page I mentioned in my question.

Vijesh VP 2008-10-03 13:02:57

Answer 2

A:

@Fredrik Kalseth is right. The File.ReadXXX methods are just convenient wrappers around the StreamReader class.

For example, here is an implementation of File.ReadAllText:

public static string ReadAllText(string path, Encoding encoding)
{
    using (StreamReader reader = new StreamReader(path, encoding))
    {
        return reader.ReadToEnd();
    }
}

aku 2008-10-03 12:05:27

Answer 3

+5 A:

You probably don't want to use File.ReadAllxxx / WriteAllxxx if you have any intention to support loading / saving of really large files.

In other words, for an editor that you intend to remain usable when editing gigabyte-sized files, you want a design based on StreamReader/StreamWriter and seeking, so that you load only the part of the file that is visible.

For anything without these (rare) requirements, I'd say take the easy route and use File.ReadAllxxx / WriteAllxxx. They just use the same StreamReader/Writer pattern internally as you'd code by hand anyway, as aku shows.
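As a rough illustration of the "load only the visible part" idea, here is a minimal sketch. The FileWindow class and ReadWindow method are hypothetical names, and the sketch assumes a single-byte encoding (ASCII) so that a window can never start or end in the middle of a character. A real editor would need to handle multi-byte encodings and line boundaries.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileWindow
{
    // Hypothetical helper: reads up to `count` bytes starting at byte `offset`
    // and decodes them. Assumes ASCII, so every byte is one character.
    public static string ReadWindow(string path, long offset, int count)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Jump straight to the requested position instead of reading from the start.
            stream.Seek(offset, SeekOrigin.Begin);

            byte[] buffer = new byte[count];
            int total = 0;

            // Read may return fewer bytes than requested, so loop until the
            // buffer is full or the end of the file is reached.
            while (total < count)
            {
                int read = stream.Read(buffer, total, count - total);
                if (read == 0)
                {
                    break;
                }
                total += read;
            }

            return Encoding.ASCII.GetString(buffer, 0, total);
        }
    }
}
```

Memory use here depends on the window size rather than the file size, which is the property File.ReadAllText cannot give you.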

Tobi 2008-10-03 12:07:12

Answer 4

+1 A:

Unless you are doing something such as applying a multiline regular expression to a text file, you generally want to avoid ReadAll/WriteAll. Doing things in smaller, more manageable chunks will almost always result in better performance.

For example, reading a table from a database and sending it to a client's web browser should be done in small sets that take advantage of small network messages and reduce memory usage on the processing computer. There's no reason to buffer 10,000 records in memory on the web server and dump them all at once. The same goes for file systems. If you are concerned with the write performance of many small amounts of data - such as what goes on in the underlying file system when allocating space, and what the overhead is - you may find these articles enlightening:

Windows File Cache Usage

File Read Benchmarks

Clarification: if you are doing a ReadAll followed by a String.Split('\r') to get an array of all the lines in the file, and then using a for loop to process each line, that code will generally perform worse than reading the file line by line and processing each line as you go. This isn't a hard rule - if you have some processing that takes a large chunk of time, it's often better to release system resources (the file handle) sooner rather than later. However, when it comes to writing files, it's almost always better to write out the results of any transformative process (such as invoking ToString() on a large list of items) item by item than to buffer them in memory.
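The two reading styles contrasted above, plus per-item writing, might look roughly like this. This is only a sketch: the LineProcessing class and its method names are made up for illustration, and counting non-empty lines stands in for whatever per-line processing you actually do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineProcessing
{
    // Buffered style: the whole text and the array of lines are both
    // held in memory before any line is processed.
    public static int CountNonEmptyBuffered(string path)
    {
        string[] lines = File.ReadAllText(path).Split('\n');
        int count = 0;
        foreach (string line in lines)
        {
            if (line.Trim().Length > 0)
            {
                count++;
            }
        }
        return count;
    }

    // Streamed style: only the current line is held in memory.
    public static int CountNonEmptyStreamed(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Trim().Length > 0)
                {
                    count++;
                }
            }
        }
        return count;
    }

    // Per-item writing: each item is converted and written as it is
    // enumerated, instead of building one large string first.
    public static void WriteItems<T>(string path, IEnumerable<T> items)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            foreach (T item in items)
            {
                writer.WriteLine(item);
            }
        }
    }
}
```

Both counting methods return the same result; the difference is in peak memory use, which only starts to matter as files grow.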

cfeduke 2008-10-03 12:13:58

Answer 5

A:

The others have explained the performance, so I won't add to it. However, I will add that the MSDN code sample was likely written before .NET 2.0, when the helper methods were not available.

Richard Szalay 2008-10-03 12:38:02

@Richard I was also thinking that. I just wanted to confirm I'm not overlooking anything here. Thanks for your answer.

Vijesh VP 2008-10-03 13:04:59

Answer 6

A:

This link has benchmarks for reading 50K+ lines, and indicates that a StreamReader is about 40% faster.

http://dotnetperls.com/Content/File-Handling.aspx
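Since results like these depend heavily on the machine, file size and framework version, it may be worth measuring on your own data. Below is a minimal timing sketch, not the benchmark from the linked page. ReadTiming and its methods are hypothetical names, and the sketch ignores warm-up and file-cache effects, so treat any numbers it gives as rough.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadTiming
{
    // Reads every line into an array at once and returns the line count.
    public static int CountLinesReadAll(string path)
    {
        return File.ReadAllLines(path).Length;
    }

    // Reads one line at a time with a StreamReader and returns the line count.
    public static int CountLinesStreamReader(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }

    // Runs the given reader `iterations` times and returns the total elapsed time.
    public static TimeSpan Time(Func<string, int> read, string path, int iterations)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            read(path);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Calling ReadTiming.Time(ReadTiming.CountLinesReadAll, path, 100) and then the same with ReadTiming.CountLinesStreamReader gives two totals you can compare for a given file.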

torial 2008-10-04 03:44:05

Answer 7

+1 A:

This MSR (Microsoft Research) paper is a good start. They also document a number of point tools, like IOSpeed, FragDisk, etc., which you can use and test in your environment.

There is also an updated report/presentation you can read about how to maximise sequential IO. Very interesting stuff, as they debunk the "moving the HD head is the most time-consuming operation" myth. They also fully document their test environments and associated configurations, down to the motherboard, RAID controller and virtually any other relevant information you would need to replicate their work. Some of the highlights are how an Opteron and a Xeon matched up, but they then also compared them to an insane/hyped NEC Itanium (32 or 64 procs or something) for good measure. From the second link here you can find a lot more resources on how to test and evaluate high-throughput scenarios and needs.

Some of the other MSR papers on this same research topic include guidance about where to maximise your spending (e.g. RAM, CPU, disk spindles, etc.) to accommodate your usage patterns... all very neat.

Some of it is dated, but usually the older APIs are the faster/low-level ones anyhow ;)

I currently push hundreds of thousands of TPS on a purpose-built app server, using a mix of C#, C++/CLI, native code and bitmap caching (rtl*bitmap).

Take care;

RandomNickName42 2009-05-15 06:33:32
