views:

874

answers:

7

I have a large vector of chars (10^9 elements), and I was wondering what the fastest way to write it to a file is. So far I've been using the following code:

vector<char> vs;
// ... Fill vector with data
ofstream outfile("nanocube.txt", ios::out | ios::binary);
ostream_iterator<char> oi(outfile, '\0');
copy(vs.begin(), vs.end(), oi);

With this code, it takes approximately two minutes to write all the data to the file. The actual question is: can I make it faster using the STL, and if so, how?

+3  A: 

There is a slight conceptual error in the second argument to ostream_iterator's constructor. It should be a null pointer if you don't want a delimiter (luckily for you, '\0' will implicitly be treated as one), or the second argument should be omitted.

Either way, after writing each character, ostream_iterator has to check the pointer designating the delimiter, which might be somewhat inefficient.

I think, if you want to go with iterators, perhaps you could try ostreambuf_iterator.
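For reference, a minimal sketch of what the ostreambuf_iterator variant might look like. The function name write_with_streambuf_iterator and the path parameter are made up for illustration; the idea is that the iterator pushes characters straight into the stream buffer instead of going through operator<< for each one.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical helper: copies the vector into the file's stream buffer,
// skipping the formatted-output layer that ostream_iterator goes through.
bool write_with_streambuf_iterator(const char* path, const std::vector<char>& vs)
{
    std::ofstream outfile(path, std::ios::out | std::ios::binary);
    if (!outfile) return false;

    std::ostreambuf_iterator<char> out(outfile);
    out = std::copy(vs.begin(), vs.end(), out);

    // failed() reports whether a write into the stream buffer hit an error.
    return !out.failed();
}
```

Whether this beats a plain write() call probably depends on the standard library implementation, so it is worth timing both.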

Other options might include using the write() method (if it can handle output this large, or perhaps output it in chunks), and perhaps OS-specific output functions.

UncleBens
I just read the section in Meyers' "Effective STL" that mentions the `[io]streambuf_iterator` classes. Perfect for this!
Tom
Thanks for the correction. I copy-pasted it from somewhere without deeper insight.
ljubak
I forgot to say that I'm trying to keep things platform-independent, so OS-specific functions are out of the question, but thanks again.
ljubak
A: 

Use the write method on it; it is in RAM after all, and you have contiguous memory. Want the fastest approach while keeping flexibility for later? Lose the built-in buffering, hint sequential I/O, lose the hidden costs of iterators and utilities, avoid streambuf when you can, but do get dirty with boost::asio.
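A rough sketch of the "lose the built-in buffering" part, staying within standard iostreams. The helper name write_unbuffered is made up for illustration, and whether the unbuffered request is honoured (and whether it actually helps) depends on the standard library implementation.

```cpp
#include <fstream>
#include <vector>

// Hypothetical helper: requests an unbuffered file stream, then hands the
// whole vector to a single write() call.
bool write_unbuffered(const char* path, const std::vector<char>& vs)
{
    std::ofstream outfile;
    // Ask for no internal buffer. This is done before open() because some
    // implementations ignore the request once the file is already open.
    outfile.rdbuf()->pubsetbuf(nullptr, 0);
    outfile.open(path, std::ios::out | std::ios::binary);
    if (!outfile) return false;

    if (!vs.empty())
        outfile.write(&vs[0], static_cast<std::streamsize>(vs.size()));
    return outfile.good();
}
```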

rama-jka toti
+7  A: 

With such a large amount of data to be written (~1GB), you should write to the output stream directly, rather than using an output iterator. Since the data in a vector is stored contiguously, this will work and should be much faster.

ofstream outfile("nanocube.txt", ios::out | ios::binary);
outfile.write(&vs[0], vs.size());
Charles Salvia
bk1e
Right - post edited.
Charles Salvia
+1  A: 

Since your data is contiguous in memory (as Charles said), you can use low level I/O. On Unix or Linux, you can do your write to a file descriptor. On Windows XP, use file handles. (It's a little trickier on XP, but well documented in MSDN.)

XP is a little funny about buffering. If you write a 1GB block to a handle, it will be slower than if you break the write up into smaller transfers (in a loop). I've found that 256KB writes are the most efficient. Once you've written the loop, you can play around with the transfer size and see which is fastest.
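To play around with the transfer size, something like the following sketch could be used. It uses portable iostreams rather than OS handles, and the name timed_block_write and its parameters are made up for illustration.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: writes vs to path in block_size chunks and returns the
// elapsed wall-clock time in seconds, or a negative value on failure.
double timed_block_write(const char* path, const std::vector<char>& vs,
                         std::size_t block_size)
{
    if (block_size == 0) return -1.0;
    const auto start = std::chrono::steady_clock::now();
    {
        std::ofstream outfile(path, std::ios::out | std::ios::binary);
        for (std::size_t pos = 0; pos < vs.size(); pos += block_size) {
            // The last chunk may be shorter than block_size.
            const std::size_t count = std::min(block_size, vs.size() - pos);
            outfile.write(&vs[pos], static_cast<std::streamsize>(count));
        }
        outfile.flush();
        if (!outfile) return -1.0;
    } // the stream is closed here, so closing is part of the measured time
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling it with, say, 64KB, 256KB and 1MB blocks on the same data gives a rough comparison. The OS file cache can skew the numbers, so repeat each run a few times.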

Rob deFriesse
+1  A: 

OK, I implemented the write method with a for loop that writes a 256KB block of data at each iteration (as Rob suggested), and the result is 16 seconds, so problem solved. This is my humble implementation, so feel free to comment:

 void writeCubeToFile(const vector<char> &vs)
 {
     const size_t blocksize = 262144; // 256KB per write

     ofstream outfile("nanocube.txt", ios::out | ios::binary);

     // Write the vector in blocksize chunks; the last chunk may be shorter.
     // The loop never runs past the end, so end() is never dereferenced.
     for(size_t position = 0; position < vs.size(); position += blocksize)
     {
         const size_t count = min(blocksize, vs.size() - position);
         outfile.write(&vs[position], count);
     }

     outfile.write("\0", 1); // append a terminating null byte

     outfile.close();
 }

Thanks to all of you.

ljubak
A: 

If your vector holds another structure, this method is still valid, as long as the elements are plain data with no pointers or other indirection.

For example:

typedef std::pair<int,int> STL_Edge;
vector<STL_Edge> v;

void write_file(const char * path){
   ofstream outfile(path, ios::out | ios::binary);
   outfile.write((const char *)&v.front(), v.size()*sizeof(STL_Edge));
}

void read_file(const char * path,int reserveSpaceForEntries){
   ifstream infile(path, ios::in | ios::binary);
   v.resize(reserveSpaceForEntries);
   infile.read((char *)&v.front(), v.size()*sizeof(STL_Edge));
}
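A variant of the same idea as a self-contained sketch. The Edge struct, the function names, and the choice to store the element count at the start of the file are all assumptions made for illustration, not part of the code above. Storing the count means the reader does not have to know it in advance.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <type_traits>
#include <vector>

// Hypothetical plain-data element type used only for this sketch.
struct Edge { int from; int to; };
static_assert(std::is_trivially_copyable<Edge>::value,
              "raw byte I/O is only safe for trivially copyable types");

// Writes the element count followed by the raw bytes of the elements.
bool write_edges(const char* path, const std::vector<Edge>& v)
{
    std::ofstream out(path, std::ios::out | std::ios::binary);
    const std::uint64_t n = v.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof n);
    if (n != 0)
        out.write(reinterpret_cast<const char*>(&v[0]),
                  static_cast<std::streamsize>(n * sizeof(Edge)));
    return out.good();
}

// Reads the count back first, then the elements.
bool read_edges(const char* path, std::vector<Edge>& v)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::uint64_t n = 0;
    if (!in.read(reinterpret_cast<char*>(&n), sizeof n)) return false;
    v.resize(static_cast<std::size_t>(n));
    if (n != 0)
        in.read(reinterpret_cast<char*>(&v[0]),
                static_cast<std::streamsize>(n * sizeof(Edge)));
    return in.good();
}
```

Note that a file written this way is only readable on machines with the same int size, byte order and struct layout.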
tomekpe
A: 

Instead of writing via the file I/O methods, you could try creating a memory-mapped file and then copying the vector into it using memcpy.
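A minimal sketch of that approach, assuming a POSIX system (so it is not platform-independent; Windows would need CreateFileMapping and MapViewOfFile instead). The function name write_via_mmap is made up for illustration.

```cpp
#include <cstring>
#include <vector>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): sizes the file, maps it into memory, and
// copies the vector into the mapping with memcpy.
bool write_via_mmap(const char* path, const std::vector<char>& vs)
{
    const int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return false;

    bool ok = true;
    if (!vs.empty()) {  // a zero-length mapping is an error, so skip it
        // The file must already have its final size before it is mapped.
        ok = ftruncate(fd, static_cast<off_t>(vs.size())) == 0;
        if (ok) {
            void* dst = mmap(nullptr, vs.size(), PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
            ok = dst != MAP_FAILED;
            if (ok) {
                std::memcpy(dst, &vs[0], vs.size());
                ok = munmap(dst, vs.size()) == 0;
            }
        }
    }
    // close() is always called; the result combines both outcomes.
    return close(fd) == 0 && ok;
}
```

On a 32-bit process, mapping 1GB in one piece may fail for lack of address space, in which case the file would have to be mapped and copied in smaller windows.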

Patrick