I am busy writing something to test the read speeds for disk IO on Linux.
At the moment I have something like this to read the files:
Edited to change code to this:
const int segsize = 1048576;
char buffer[segsize];
ifstream file;
file.open(sFile.c_str());
while(file.readsome(buffer,segsize)) {}
For foo.dat, which is 150GB, the first time I read it in, it takes around 2 minutes. However if I run it within 60 seconds of the first run, it will then take around 3 seconds to run. How is that possible? Surely the only place that could be read from that fast is the buffer cache in RAM, and the file is too big to fit in RAM.
The machine has 50GB of ram, and the drive is a NFS mount with all the default settings. Please let me know where I could look to confirm that this file is actually being read at this speed? Is my code wrong? It appears to take a correct amount of time the first time the file is read.
Edited to Add: Found out that my files were only reading up to a random point. I've managed to fix this by changing segsize down to 1024 from 1048576. I have no idea why changing this allows the ifstream to read the whole file instead of stopping at a random point.
Thanks for the answers.