The situation: a file contains 14 294 508 unsigned integers and 13 994 397 floating-point numbers (which must be read as doubles). The total file size is ~250 MB.

Reading it with std::istream takes ~30 seconds. Reading the data from the file into memory (just copying bytes, without formatted input) is much faster. Is there any way to improve the reading speed without changing the file format?

+3  A: 

Do you need to use STL-style I/O? You must check out this excellent piece of work from one of the experts. It's a specialized iostream by Dietmar Kuhl.

I hate to suggest this, but take a look at the C formatted I/O routines. Also, are you reading in the whole file in one go?
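A minimal sketch of the C formatted I/O route, assuming the file is whitespace-separated decimal text with the integers first and the reals after them (the question does not state the layout). The helper names `read_unsigneds` and `read_doubles` are made up for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Assumed layout: `count` whitespace-separated unsigned integers.
// Returns true only if all `count` values were read.
bool read_unsigneds(std::FILE* f, std::size_t count, std::vector<unsigned>& out) {
    out.clear();
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        unsigned v = 0;
        if (std::fscanf(f, "%u", &v) != 1) return false;  // parse error or EOF
        out.push_back(v);
    }
    return true;
}

// Same idea for the reals; "%lf" stores into a double.
bool read_doubles(std::FILE* f, std::size_t count, std::vector<double>& out) {
    out.clear();
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        double v = 0.0;
        if (std::fscanf(f, "%lf", &v) != 1) return false;  // parse error or EOF
        out.push_back(v);
    }
    return true;
}
```

Whether this beats std::istream depends on the standard library implementation, so it is worth timing both on the real file.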

dirkgently
Syntax and approach don't matter :) And yes, I'm reading the whole file.
goodrone
Have you tried fscanf and friends? I'd say give these a shot, and measure.
dirkgently
+1  A: 

You might also want to look at Matthew Wilson's FastFormat library:

I haven't used it, but he makes some pretty impressive claims and I've found a lot of his other work to be worth studying and using (and stealing on occasion).

Michael Burr
Does it support formatted input?
goodrone
Crap - you're right... It's output formatting only.
Michael Burr
Maybe the techniques can be applied to input
dcw
+1  A: 

You haven't specified the format. It's possible that you could memory map it, or could read in very large chunks and process in a batch algorithm.

Also, you haven't said whether you know for sure that the file and the process that reads it will be on the same platform. If a big-endian process writes it and a little-endian process reads it, or vice versa, it won't work.

dcw
+1  A: 

Parsing the input yourself (with atoi and atof) usually makes reading at least twice as fast as the "universal" read methods.
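A sketch of hand-rolled parsing over a buffer that is already in memory. It swaps in `strtoul` and `strtod` for `atoi` and `atof`, because they report where each number ended and so can walk the buffer without a separate tokenizing step. The layout (all integers first, then all reals, whitespace-separated) and the names `Parsed` and `parse_buffer` are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Parsed {
    std::vector<unsigned long> ints;
    std::vector<double> reals;
    bool ok = false;  // true only if every expected value was found
};

// `text` must be null-terminated.
Parsed parse_buffer(const char* text, std::size_t n_ints, std::size_t n_reals) {
    Parsed r;
    r.ints.reserve(n_ints);
    r.reals.reserve(n_reals);
    const char* p = text;
    char* end = nullptr;
    for (std::size_t i = 0; i < n_ints; ++i) {
        const unsigned long v = std::strtoul(p, &end, 10);
        if (end == p) return r;  // nothing was converted: malformed or short input
        r.ints.push_back(v);
        p = end;
    }
    for (std::size_t i = 0; i < n_reals; ++i) {
        const double v = std::strtod(p, &end);
        if (end == p) return r;  // nothing was converted: malformed or short input
        r.reals.push_back(v);
        p = end;
    }
    r.ok = true;
    return r;
}
```

Both functions skip leading whitespace themselves, which is why the loops only need to advance `p` to `end`.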

n0rd
A: 

Something quick and dirty is to just dump the file into a standard C++ string, and then use a stringstream on it:

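#include <sstream>
#include <string>
// Load the whole file into std::string file_string first
std::istringstream s( file_string );
unsigned int x; double y;  // the question needs unsigned integers and doubles
s >> x >> y;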

This may not give you much of a performance improvement (you will get a larger speed-up by avoiding iostreams altogether), but it's very easy to try, and it may be fast enough.
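The "load file into string" step is left as a comment in the snippet above. One possible way to fill it in, with `load_file` being a made-up helper name:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Returns the whole file as a string. An unreadable or empty file
// yields an empty string; real code may want to tell the two apart.
std::string load_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    std::ostringstream contents;
    contents << in.rdbuf();  // bulk copy, no formatted extraction
    return contents.str();
}
```

The result can then be handed to the string stream, for example `std::istringstream s(load_file(path));`.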

AHelps