views:

915

answers:

7

I want to pose a seemingly simple question to which I can't find an answer anywhere. Is there a FAST, modern algorithm for file input and/or output that can be compiled with all standards-compliant C++ compilers and works on all operating systems, without requiring external libraries?

  1. I have found that the fastest way is to use memory-mapped files, but that won't do because we want the same piece of code working on all platforms
  2. We can't use APIs like the Win32 API because that would make the code platform-specific
  3. I don't want to use C; the algorithm should be pure C++ with the STL if feasible, not some ugly C intermixed with asm hacks/tricks
  4. Frameworks or external libraries that aren't part of standard C++ shouldn't be used, such as wxWidgets, Qt, MFC, etc.
  5. The big emphasis of this whole question is that the algorithm should be as FAST as possible, something along the lines of the speed you get with memory-mapped files; even faster would be great, but I know that isn't possible

Has anyone else ever researched something this crazy? Is such an algorithm even possible?

Thanks for any recommendations

+10  A: 

With the following restrictions:

can be compiled with all standards-compliant C++ compilers and works on all operating systems, without requiring external libraries?

You've pretty much restricted yourself to the standard library file IO functions. Maybe also POSIX functions (depending on what subset of "all standards-compliant C++ compilers" you're considering).

If they aren't fast enough for you, you'll have to start dropping some restrictions.
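
For illustration, a minimal sketch of standard-library-only I/O that reads in large chunks rather than byte by byte (the file name and chunk size are placeholders, not part of the original answer):

#include <fstream>
#include <vector>

int main() {
    std::ifstream in("input.bin", std::ios::binary);   // hypothetical file name
    std::vector<char> buf(1 << 16);                    // 64 KiB per read

    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();             // bytes actually read this pass
        // ... consume buf[0 .. got) here ...
        (void)got;
    }
}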

Michael Burr
Agreed. The only standard C++ file I/O is that from the standard library, i.e. header file `<iostream>` or `<cstdio>`. If you want fast file I/O, the most important thing is **not** to read/write files one byte at a time, but to read/write large chunks in one go.
stakx
Can the entire file be read all at once with iostream? Would that offer a performance advantage? I don't think it would. Isn't cstdio a C header?
Reading the entire file at once is still slower than mapping it into memory.
Anon.
`stdio.h` is the C header; `cstdio` is a C++ header declaring the same functionality. Ignoring it while demanding blazing speed is, IMO, misguided. But that is your business.
dmckee
+9  A: 

This has nothing to do with an "algorithm".

When it comes to writing data to a file, you're at the mercy of the operating system - memory-mapped files are "fast" because you're just writing to memory, and the OS syncs it back out on its own time. If the OS doesn't support it, you're out of luck in that regard - unless you want to implement your own memory-mapping layer.

Incidentally, POSIX has mmap, so if you're limiting yourself to POSIX-compliant systems, you're fine.
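
For reference, a minimal sketch of reading a file through POSIX mmap (the file name and the thin error handling are just illustrative):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main() {
    int fd = open("input.bin", O_RDONLY);              // hypothetical file name
    if (fd < 0) return 1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) return 1;

    // Map the whole file read-only; the kernel pages the data in on demand.
    void* p = mmap(nullptr, static_cast<size_t>(st.st_size), PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;

    const char* data = static_cast<const char*>(p);
    // ... scan data[0 .. st.st_size) here ...
    (void)data;

    munmap(p, static_cast<size_t>(st.st_size));
    close(fd);
}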

Anon.
+2  A: 

Some points:

  • This has nothing to do with algorithms.
  • Wanting to target ALL operating systems is not a very productive goal (and it's impossible - your code doesn't work on a particular platform until you've tested it). Instead, I'd focus on some set of operating systems that are feasible - say POSIX + Win32.
  • In that case, you can do memory mapping, for instance by implementing mmap() for Windows (on top of MapViewOfFile() etc. -- the git source code has a mmap implementation for Windows if you need some inspiration); a rough sketch is included after this list.
  • If you can't use memory mapping, I'd recommend using the normal C file API instead of C++'s file streams if performance is a big deal. Even though C++'s streams have the potential for higher performance in some operations, in practice they tend to be quite a bit slower.
  • But, to get good performance, it can often be "good enough" just to make sure you're treating your data in a sane way. Read data sequentially, don't re-read, etc. Perfect is the enemy of good ;)
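
As mentioned above, a rough sketch of an mmap()-style read-only mapping on top of the Win32 calls (names and error handling are illustrative only, and the caller would still need to call UnmapViewOfFile()):

#include <windows.h>
#include <cstddef>

// Hypothetical helper: map a file read-only and report its size.
const void* map_file_readonly(const char* path, std::size_t* size_out) {
    HANDLE file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return nullptr;

    LARGE_INTEGER size;
    if (!GetFileSizeEx(file, &size) || size.QuadPart == 0) { CloseHandle(file); return nullptr; }

    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    CloseHandle(file);                        // the mapping keeps the file open
    if (!mapping) return nullptr;

    const void* view = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(mapping);                     // the view keeps the mapping alive
    *size_out = static_cast<std::size_t>(size.QuadPart);
    return view;                              // release later with UnmapViewOfFile()
}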
kusma
@kusma: I'd be interested to see performance comparisons between C's `stdio` and C++'s `iostream` functionalities. Can you by any chance point me to such resources? 'cause I once looked at the C++ `iostream` library source code (more specifically, the implementation that comes with MinGW) and had the impression that it is actually only a very thin (template-based) wrapper around the C file I/O functions. Therefore I wouldn't expect to see any significant difference in performance.
stakx
The enemy of god? Yikes... :)
Drew Hall
stakx: Unfortunately, no. The results are long lost. I did some profiling on MSVC back around 2001, after a friend reported significant slow-downs on GCC/Linux from using std::istream (compared to the C APIs). IIRC, we both found that the performance drop was about the same, around 30%. It might be that memory serves me wrong about the figures, though. Drew: Sorry, I meant "the enemy of Satan", of course ;)
kusma
A: 

The other posters are correct in that performance is always at odds with genericity (cross-platform).

However, in general, you'll get the best results by "buffering" your input -- using fread() to read in relatively large chunks of data, and processing those.

I know that's a pretty basic and general answer, but that's about as specific as you can get without either being more platform-specific, or knowing more about the specific input you're processing.
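
As a minimal sketch of that buffered approach (the file name, chunk size, and the trivial process() stand-in are placeholders):

#include <cstddef>
#include <cstdio>
#include <vector>

// Trivial stand-in for whatever per-chunk processing you actually do.
static void process(const char* /*data*/, std::size_t n) { static std::size_t total = 0; total += n; }

int main() {
    std::FILE* f = std::fopen("input.bin", "rb");       // hypothetical file name
    if (!f) return 1;

    std::vector<char> buf(1 << 20);                      // read 1 MiB at a time
    std::size_t got;
    while ((got = std::fread(buf.data(), 1, buf.size(), f)) > 0)
        process(buf.data(), got);                         // handle one chunk, then refill

    std::fclose(f);
}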

+1  A: 

Reading sequentially in blocks that are multiples (or powers of 2) of the file system block size may help. Then pick apart your data once the block is in memory. There is a white paper somewhere where they tested performance for various block sizes. I wish I could find it again.

You might also try having a dedicated thread for reading blocks from the file, and another that does the data manipulations in memory (with the proper synchronizations, of course). This allows you to use the CPU to process data while you're blocking on your file read calls.
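
If you want to experiment with that, here is a minimal double-buffered sketch using std::thread (a C++11 facility, so not available everywhere; the file name and the trivial process() stand-in are placeholders):

#include <cstddef>
#include <cstdio>
#include <thread>
#include <vector>

// Trivial stand-in for the in-memory data manipulation.
static void process(const char* /*data*/, std::size_t n) { static std::size_t total = 0; total += n; }

int main() {
    std::FILE* f = std::fopen("input.bin", "rb");        // hypothetical file name
    if (!f) return 1;

    const std::size_t chunk = 1 << 20;                   // 1 MiB blocks
    std::vector<char> current(chunk), next(chunk);

    std::size_t got = std::fread(current.data(), 1, chunk, f);
    while (got > 0) {
        std::size_t next_got = 0;
        // Fetch the next block on a worker thread while this thread
        // processes the block that is already in memory.
        std::thread reader([&] { next_got = std::fread(next.data(), 1, chunk, f); });
        process(current.data(), got);
        reader.join();

        current.swap(next);
        got = next_got;
    }
    std::fclose(f);
}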

Anyways, if you give these ideas a try, please let us know if you notice a difference. Actual results from your benchmark tests would be interesting.

Emile Cormier
Most hard drives use block sizes that are a multiple of 512 bytes. With the larger hard drives, it may be 1024 or larger.
Thomas Matthews
The buffer sizes used by the layers above/below may be different than the filesystem's. Experiment with different powers of 2 (don't limit yourself to 512/1024, go all the way up to, say, 256kiB). I remember 4kiB being a magic number used in one of those layers in Linux, but I don't remember where. Sorry I can't be more specific.
Emile Cormier
4k is the default size of a filesystem block.
Anon.
A: 

Fast IO generally boils down to two things:

  1. Minimize data copying
  2. Minimize kernel/user context switching

Almost all IO techniques attempt to address one or the other. The fastest cross-platform code for IO that I know of is the Perl IO system. I'd suggest taking a look at the source. Perl hackers have spent decades getting their IO as fast as possible on as many platforms as possible.

brianegge
+2  A: 

To put another perspective on "mercy of the OS", most of the overhead in copying files lies with the operating system. A fragmented file will take more time to read than a defragmented file. There are no generic or standard C++ functions for detecting fragmented files.

The fastest method in C++ (the file names below are just placeholders):

#include <fstream>

std::ifstream in_file("input.bin", std::ios::binary);   // hypothetical input file
std::ofstream out_file("output.bin", std::ios::binary); // hypothetical output file

out_file << in_file.rdbuf();   // hand the whole copy over to the stream buffers

You can find more details by searching the web with the keywords "copy file rdbuf". The above fragment leaves the copying up to the OS, but is portable across all platforms. By reading into the C++ i/o streams, you can set the size of the read buffer, or have it use your own buffer.
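
For example, a user-supplied buffer can be attached like this (whether pubsetbuf() is honoured is implementation-defined, and it generally must be called before the file is opened; the file names and buffer size are placeholders):

#include <fstream>
#include <vector>

int main() {
    std::vector<char> buf(1 << 20);                      // 1 MiB read buffer
    std::ifstream in;
    // pubsetbuf() must be called before open() to have any effect,
    // and even then honouring it is implementation-defined.
    in.rdbuf()->pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));
    in.open("input.bin", std::ios::binary);              // hypothetical file names
    std::ofstream out("output.bin", std::ios::binary);

    out << in.rdbuf();                                   // copy as in the fragment above
}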

Faster file copying requires platform-specific functionality, such as DMA transfers. Using threads and multiple buffering could speed this up, but standard C++ has no support for threads (there is a de facto standard, POSIX, which does support threads). One thread would read into buffers while another thread writes from the buffers.

Thomas Matthews