I am using the mmap call to read from a very big file using simple pointer arithmetic in C++. The problem is that when I read small chunks of data (on the order of KBs) multiple times, each read takes the same amount of time as the previous one. How can I tell whether the disk is being accessed to fulfill my request, or whether the request is being fulfilled from main memory (the page cache) on calls after the first one?
You will get the best cache performance if you exploit locality of reference. That is, if you access variables that are close together in memory (e.g. stepping through them one at a time in increasing order) and you perform these accesses close together in time (i.e. without many other memory accesses in between), you will get the best cache performance. If each read takes roughly the same amount of time, the data is very likely being served from cache; when it is not, you usually see several fast reads (cache hits) followed by a spike (a cache miss) followed by more fast reads. On almost all systems, a cache miss causes the whole chunk in which the data resides to be loaded into the cache, so if you then access nearby variables (which are in the same chunk) they will be found in the cache.
The issue turned out to be the following: both reads were being served from cache. I guess caching starts when the file is opened or mmapped, before the data is actually requested. To verify this, I issued:
echo 3 > /proc/sys/vm/drop_caches
which flushes the page cache. Then, if I run two iterations fetching the same data, the first run is (in my case) 10 times slower than the second.