The usual way to read a whole file in C++ is this:
std::ifstream file("file.txt", std::ios::binary | std::ios::ate);
std::vector<char> data(file.tellg());
file.seekg(0, std::ios::beg);
file.read(data.data(), data.size());
Reading a 1.6 MB file is almost instant.
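For reference, here is the same approach wrapped in a self-contained function with basic error checks. This is only a sketch: the name `read_whole_file` and the exception handling are my own additions for illustration, not part of the snippet above.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads the whole file at `path` into a vector.
std::vector<char> read_whole_file(const std::string& path) {
    // Open positioned at the end so tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }

    const std::streamoff size = file.tellg();
    if (size < 0) {
        throw std::runtime_error("cannot determine size of " + path);
    }

    // Allocate once, rewind, then read everything with a single call.
    std::vector<char> data(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    if (!file.read(data.data(), static_cast<std::streamsize>(data.size()))) {
        throw std::runtime_error("short read on " + path);
    }
    return data;
}
```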
But recently I discovered std::istream_iterator and wanted to use it to write an elegant one-liner that reads the content of a file. Like this:
std::vector<char> data(std::istream_iterator<char>(std::ifstream("file.txt", std::ios::binary)), std::istream_iterator<char>());
The code is nice, but very slow. It takes about 2/3 seconds to read the same 1.6 MB file. I understand that it may not be the best way to read a file, but why is it so slow?
Reading a file the classical way goes like this (I'm talking only about the read function):
- the istream contains a filebuf, which holds a block of data from the file
- the read function calls sgetn on the filebuf, which copies the chars one by one (no memcpy) from the internal buffer to "data"'s buffer
- once the block inside the filebuf has been fully consumed, the filebuf reads the next block from the file
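To make the steps above concrete, here is a small sketch that skips read and calls sgetn on the stream's buffer directly. The helper name `read_via_filebuf` and the `count` parameter are hypothetical. How sgetn copies internally (char by char or in bulk) is up to the standard library implementation, so this only shows the call path, not the copy strategy.

```cpp
#include <cstddef>
#include <fstream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical helper: pulls up to `count` bytes straight from the filebuf.
std::vector<char> read_via_filebuf(const std::string& path, std::size_t count) {
    std::ifstream file(path, std::ios::binary);
    std::vector<char> data(count);

    // rdbuf() exposes the filebuf that sits between the stream and the file.
    std::streambuf* buf = file.rdbuf();

    // sgetn copies from the buffer into `data`, refilling the buffer from
    // the file as needed, and returns how many chars it delivered.
    const std::streamsize got =
        buf->sgetn(data.data(), static_cast<std::streamsize>(count));

    data.resize(static_cast<std::size_t>(got));
    return data;
}
```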
When you read a file using istream_iterator, it goes like this:
- the vector calls *iterator to get the next char (this simply reads a variable), appends it and increases its own size
- if the vector's allocated space is full (which does not happen often), a reallocation is performed
- then it calls ++iterator, which reads the next char from the stream (operator >> with a char parameter, which presumably just calls the filebuf's sbumpc function)
- finally it compares the iterator with the end iterator, which is done by comparing two pointers
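Written out by hand, the steps above look roughly like the loop below. This is a sketch of my understanding, not the actual library code: the function name is hypothetical, and a real vector constructor may be implemented differently. Two details differ from the one-liner on purpose: the stream is a named variable, because istream_iterator takes a non-const reference, and I added std::noskipws, because operator >> skips whitespace by default and would otherwise drop those bytes.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: spells out what the iterator-range read does.
std::vector<char> read_with_istream_iterator(const std::string& path) {
    std::ifstream file(path, std::ios::binary);

    // Assumption, not in the original one-liner: keep whitespace chars.
    file >> std::noskipws;

    std::istream_iterator<char> it(file);   // constructing it reads the first char
    const std::istream_iterator<char> end;  // default-constructed end-of-stream iterator

    std::vector<char> data;
    while (it != end) {        // compare against the end iterator
        data.push_back(*it);   // *it returns the char cached in the iterator
        ++it;                  // ++it extracts the next char with operator >>
    }
    return data;
}
```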
I must admit that the second way is not very efficient, but it's at least 200 times slower than the first way. How is that possible?
I thought that the performance killer was the reallocations or the insertions, but I tried preallocating the whole vector and calling std::copy, and it's just as slow.
// also very slow:
std::vector<char> data2(1730608);
std::copy(std::istream_iterator<char>(std::ifstream("file.txt", std::ios::binary)), std::istream_iterator<char>(), data2.begin());
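In case someone wants to reproduce the measurement, here is a rough timing sketch. Everything in it is my own scaffolding: `elapsed_ms` and `copy_into_preallocated` are hypothetical names, the copy is bounded by `capacity` so it cannot run past the end of the vector, and std::noskipws is added so that whitespace bytes are kept. Timings will of course depend on the compiler, the build mode (debug builds with checked iterators can be much slower) and the machine.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: runs f() once and returns the wall time in milliseconds.
template <typename F>
double elapsed_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical helper: istream_iterator copy into an already sized vector,
// so no reallocation can occur during the loop.
std::vector<char> copy_into_preallocated(const std::string& path,
                                         std::size_t capacity) {
    std::ifstream file(path, std::ios::binary);
    file >> std::noskipws;  // assumption: keep whitespace chars

    std::vector<char> data(capacity);
    std::istream_iterator<char> it(file);
    const std::istream_iterator<char> end;

    std::size_t n = 0;
    for (; it != end && n < capacity; ++it) {
        data[n++] = *it;
    }
    data.resize(n);  // trim if the file was shorter than `capacity`
    return data;
}
```

A call such as `elapsed_ms([&] { data2 = copy_into_preallocated("file.txt", 1730608); })` can then be compared with the same wrapper around the seekg/read version.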