If you really want to optimize this you probably want to drop the C++ fstream stuff, or at least turn off buffering for it. fstream does a lot of memory allocation and deallocation, and its buffering can read in more data than you actually need. The OS will likely need to read an entire page to get the few bytes you need, but fstream will probably copy at least that much (and maybe more, requiring additional reads) into its own buffers, which takes time.
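If you do keep fstream around, here's one way to ask for an unbuffered stream. Whether `pubsetbuf(nullptr, 0)` is actually honored is implementation-defined, though it works on common implementations such as libstdc++; the file name here is just a placeholder:

```cpp
#include <fstream>

int main() {
    std::ifstream in;
    // Ask for an unbuffered filebuf. This must be done before open(),
    // and the standard leaves the effect implementation-defined.
    in.rdbuf()->pubsetbuf(nullptr, 0);
    in.open("data.bin", std::ios::binary);  // placeholder file name
    // ... seek and read as before ...
}
```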
Now, we can move on to bigger wins. You probably want to use the OS's IO routines directly. If you are using a POSIX system (like Linux) then `open`, `lseek`, `read`, and `close` are a good first go at this, and may be required if you don't have the next system calls.
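A minimal sketch of that pattern (the file name, offset, and read size are placeholders):

```cpp
#include <fcntl.h>   // open
#include <unistd.h>  // lseek, read, close
#include <cstdio>    // perror, printf

int main() {
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    // Seek to the bytes we care about, then read only those bytes.
    if (lseek(fd, 4096, SEEK_SET) == (off_t)-1) { perror("lseek"); return 1; }

    char buf[16];
    ssize_t n = read(fd, buf, sizeof buf);
    if (n < 0) { perror("read"); return 1; }

    printf("read %zd bytes\n", n);
    close(fd);
}
```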
If all of the files that you are trying to read from live in one directory (folder), or under one, then you may find that opening the directory with `opendir` or `open("directory_name", O_DIRECTORY)` (depending on whether you need to read the directory entries yourself) and then calling `openat`, which takes a directory file descriptor as one of its arguments, will speed up opening each file. The OS won't have to work as hard to look up the file you're trying to open each time (that data will probably be in the OS's file system cache, but the lookup still takes time and involves lots of checks).
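For example, something along these lines (directory and file names are placeholders):

```cpp
#include <fcntl.h>   // open, openat, O_DIRECTORY
#include <unistd.h>  // close
#include <cstdio>

int main() {
    // Open the directory itself once.
    int dirfd = open("data_dir", O_RDONLY | O_DIRECTORY);
    if (dirfd < 0) { perror("open directory"); return 1; }

    // openat resolves the name relative to dirfd, so the OS skips
    // re-walking the directory's own path components on every open.
    int fd = openat(dirfd, "file1.bin", O_RDONLY);
    if (fd < 0) { perror("openat"); return 1; }

    // ... read from fd ...
    close(fd);
    close(dirfd);
}
```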
Then you may be able to read in your data by using the `pread` system call, without having to do any seeking to the location of the data you need. `pread` takes an offset rather than using the OS's idea of the current seek point. This will save you one system call per read at the very least.
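Replacing the `lseek` + `read` pair above, that looks something like this (again with a placeholder file name and offset):

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = open("data.bin", O_RDONLY);  // placeholder file name
    if (fd < 0) { perror("open"); return 1; }

    // One call instead of lseek + read: pread takes the offset
    // directly and doesn't move the descriptor's seek position.
    char buf[16];
    ssize_t n = pread(fd, buf, sizeof buf, 4096);
    if (n < 0) { perror("pread"); return 1; }

    printf("read %zd bytes at offset 4096\n", n);
    close(fd);
}
```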
**edit:**
If your system supports asynchronous IO, this should speed things up, since you can tell the OS up front what you want to read before you actually go retrieve it (this lets the OS schedule the disk reads better, especially for rotating disks), but it can get complicated. It would likely save you a lot of time, though.
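To give a flavor of it, here is a minimal POSIX AIO sketch using `aio_read`; the file name, offset, and buffer size are placeholders, and on older glibc you may need to link with `-lrt`:

```cpp
#include <aio.h>     // aio_read, aio_suspend, aio_error, aio_return
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstring>
#include <cstdio>

int main() {
    int fd = open("data.bin", O_RDONLY);  // placeholder file name
    if (fd < 0) { perror("open"); return 1; }

    char buf[64];
    struct aiocb cb;
    std::memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof buf;
    cb.aio_offset = 4096;  // placeholder offset

    // Queue the read and let the OS schedule it.
    if (aio_read(&cb) < 0) { perror("aio_read"); return 1; }

    // ... do other useful work here while the read is in flight ...

    // Wait for completion and collect the result.
    const struct aiocb *list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(list, 1, nullptr);

    ssize_t n = aio_return(&cb);
    if (n < 0) { perror("aio_return"); return 1; }
    printf("read %zd bytes asynchronously\n", n);

    close(fd);
}
```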