ansaurus

Question

Why is fseeko() faster with giant files than small ones?

Answer 1

+13 A:

You're not measuring disk performance; you're measuring how long it takes for fseek to set a pointer and return.

I recommend you do a file read from the location you're seeking to, if you want to test real IO.

Carl Smotricz 2010-07-16 17:18:00
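
To make that advice concrete, here is a minimal sketch of a timing helper that can run either seek-only or seek-plus-read. It is not the asker's benchmark: the function name time_seeks, its parameters, and the use of rand() for offsets are all assumptions made for illustration.

```c
#define _FILE_OFFSET_BITS 64      /* 64-bit off_t on 32-bit glibc systems */
#define _POSIX_C_SOURCE 200809L   /* expose fseeko() and clock_gettime() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: perform `iterations` seeks to pseudo-random offsets
 * in [0, size). If do_read is nonzero, read one byte after every seek so
 * that the position actually has to be used.
 * Returns elapsed wall-clock seconds, or -1.0 on error. */
double time_seeks(FILE *fp, off_t size, int iterations, int do_read)
{
    assert(fp != NULL && size > 0 && iterations > 0);

    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (int i = 0; i < iterations; i++) {
        /* Scale rand() into [0, size); clamp in case of rounding. */
        double frac = (double)rand() / ((double)RAND_MAX + 1.0);
        off_t target = (off_t)(frac * (double)size);
        if (target >= size)
            target = size - 1;

        if (fseeko(fp, target, SEEK_SET) != 0)
            return -1.0;
        if (do_read && getc(fp) == EOF)
            return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling it once with do_read set to 0 and once with 1, on both the small and the large file, separates the cost of updating the file position from the cost of real I/O. Repeated runs will still be affected by the OS page cache, so the numbers are only a rough guide.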

Wow... Ok, I added a getc() call after the seek to read a single character. Now, seeking in the large file is just slightly more expensive than seeking in the small file. Is there some optimization where multiple subsequent seeks are summed and actually done before the next IO? Wow...

dicroce 2010-07-16 17:27:47

A seek() is just a hint to the operating system that you plan to read from somewhere next. The OS has a complicated scheduling mechanism that moves the disk heads so as to minimize total travel time for all users. Since your reads get interleaved with everybody else's, it makes no sense to move the heads until the last moment, when the OS (not your program, the OS!) is about to do the reading. So the OS keeps your seek position in the back of its mind but doesn't act on it until it physically reads the data.

Carl Smotricz 2010-07-16 17:38:38
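
One way to see that a seek only records a position is to use the lower-level lseek() call and seek far past the end of a file. The sketch below is an illustration, not something from the thread: the function name size_after_far_seek and the temporary file path are assumptions. On a POSIX system the seek succeeds, yet the file size does not change, because nothing is read or written until a later I/O call uses the offset.

```c
#define _POSIX_C_SOURCE 200809L   /* expose mkstemp() */
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write one byte to a temporary file, seek to
 * far_offset (well past the end), then ask the kernel for the file size.
 * Returns the reported size, or -1 on error. */
off_t size_after_far_seek(off_t far_offset)
{
    assert(far_offset >= 0);

    char path[] = "/tmp/seekdemoXXXXXX";   /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file vanishes once closed */

    off_t result = -1;
    struct stat st;
    if (write(fd, "x", 1) == 1 &&
        lseek(fd, far_offset, SEEK_SET) == far_offset &&  /* only sets the offset */
        fstat(fd, &st) == 0) {
        result = st.st_size;               /* still the single byte written */
    }

    close(fd);
    return result;
}
```

For example, size_after_far_seek(1000000) should report a size of 1: the offset moved, but no data was touched.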

Answer 2

A:

I would assume that it has to do with the implementation of fseeko.

The man page of fseek indicates that it merely "sets the file position indicator for the indicated stream." Since setting an integer should be independent of the file size, perhaps there is an "optimization" that will perform an automatic read (and cache the resulting information) after an fseek for small files and not large files.

advait 2010-07-16 18:01:38
