I need to keep as much as I can of a large file in the operating system's block cache, even though it's bigger than I can fit in RAM, while I'm continuously reading another very, very large file. At the moment, a large chunk of the important file gets evicted from the system cache whenever I stream-read from the other file.

+1  A: 

Some operating systems have ramdisks, which let you set aside a segment of RAM for storage and mount it as a file system.

What I don't understand, though, is why you want to keep the operating system from caching the file. Your full question doesn't really make sense to me.

William Hutchen
Yeah... I blame the late hour for my unclear statements. But I've changed the question.
Erik Johansson
+3  A: 

On Linux, you can mount a filesystem of type tmpfs, which lives in memory and uses available swap as backing store if needed. You should be able to create a filesystem larger than your RAM, and the contents of that filesystem will be prioritized in the system cache.

mount -t tmpfs none /mnt/point

See: http://lxr.linux.no/linux/Documentation/filesystems/tmpfs.txt

You may also benefit from the swappiness and drop_caches files in /proc/sys/vm
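
To see what the current setting is before changing anything, the tunables can be read like any other file. The following is only a minimal sketch: read_vm_tunable is a hypothetical helper name, not an existing API, and it assumes a Linux system with /proc mounted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read an integer tunable from /proc/sys/vm,
 * for example "swappiness".
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_vm_tunable(const char *name)
{
    assert(name != NULL);

    char path[256];
    long value = -1;

    /* Refuse names that would not fit in the path buffer. */
    if (snprintf(path, sizeof path, "/proc/sys/vm/%s", name) >= (int)sizeof path)
        return -1;

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fscanf(f, "%ld", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Changing swappiness, or writing to drop_caches to discard clean cached pages, requires root.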

Sufian
A: 

Buy more RAM (it's relatively cheap!) or let the OS do its thing. I think you'll find that circumventing the OS is more trouble than it's worth. The OS will cache as much of the file as it can until your application or any other needs the memory.

I guess you could minimize the number of processes, but it's probably quicker to buy more memory.

basszero
A: 

Sufian's answer is good, though I'll have to test the swappiness setting to see if it works. (If I have enough RAM to fit the file, using tmpfs without swap increases performance a lot.)

@basszero: The thing is, this happens a lot for me: I stream a very large file and do calculations on it. I don't want the OS to cache that file, since it will never fit in RAM and I'm only doing streaming reads, never going back to read anything again.

Erik Johansson
A: 

mlock() and mlockall() respectively lock part or all of the calling process’s virtual address space into RAM, preventing that memory from being paged to the swap area.

(copied from the MLOCK(2) Linux man page)
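
To apply this to a file rather than to heap memory, one option is to mmap() the file and mlock() the mapping. The following is a rough sketch under that assumption; map_and_lock is a hypothetical helper name, and whether the lock succeeds depends on RLIMIT_MEMLOCK and on how much RAM is free.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a file read-only and try to lock its pages in RAM.
 * Returns the mapping and stores its length in *len_out, or returns NULL if
 * the file cannot be opened, is empty, or cannot be mapped.
 * If mlock() fails (for example because of RLIMIT_MEMLOCK), the mapping is
 * still returned, only without the lock; *locked_out says which case applies. */
void *map_and_lock(const char *path, size_t *len_out, int *locked_out)
{
    assert(path != NULL && len_out != NULL && locked_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)st.st_size;
    *locked_out = (mlock(p, *len_out) == 0);
    return p;
}
```

A file larger than RAM cannot be locked in full, so in that case you would lock only the range you touch most often. Release the mapping with munmap() when done.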

florin
+3  A: 

If you're using Windows, consider opening the file you're scanning through with the flag

FILE_FLAG_SEQUENTIAL_SCAN

You could also use

FILE_FLAG_NO_BUFFERING

for that file, but it imposes some restrictions on your read size and buffer alignment.

Don Neufeld
+2  A: 

In a POSIX system like Linux or Solaris, try using posix_fadvise.

On the streaming file, do something like this:

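off_t current_pos = 0;
ssize_t bytes;
posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
while ((bytes = pread(fd, buffer, 64 * 1024, current_pos)) > 0) {
  /* ... process the bytes just read ... */
  current_pos += bytes;  /* advance by what was actually read */
  /* Drop everything consumed so far: offset 0, length current_pos. */
  posix_fadvise(fd, 0, current_pos, POSIX_FADV_DONTNEED);
}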

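And you can apply POSIX_FADV_WILLNEED to your other file, which asks the kernel to read it into the page cache ahead of time. It is only a hint and does not pin the pages, so they can still be evicted under memory pressure.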
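
Putting both hints together might look like the sketch below. It assumes Linux or another system that provides posix_fadvise; prefetch_file and stream_and_drop are hypothetical helper names, and how strictly the kernel honors the hints is up to the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: hint that the whole file will be needed soon.
 * A length of 0 means "through the end of the file".
 * Returns 0 on success or an error number, as posix_fadvise() does. */
int prefetch_file(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
}

/* Hypothetical helper: read fd from start to end in reads of at most
 * `chunk` bytes, asking the kernel after each read to drop everything
 * consumed so far from the page cache.
 * Returns the number of bytes read, or -1 on a read error. */
long long stream_and_drop(int fd, char *buf, size_t chunk)
{
    assert(buf != NULL && chunk > 0);

    off_t pos = 0;
    ssize_t n;

    posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    while ((n = pread(fd, buf, chunk, pos)) > 0) {
        /* ... process buf[0 .. n) here ... */
        pos += n;
        /* Offset 0, length pos: the range already consumed. */
        posix_fadvise(fd, 0, pos, POSIX_FADV_DONTNEED);
    }
    return n < 0 ? -1 : (long long)pos;
}
```

You would call prefetch_file once on the file you want to keep cached, and stream_and_drop on the file you only read once.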

Now, I know that Windows Vista and Server 2008 can also do nifty tricks with memory priorities. Probably older versions like XP can do more basic tricks as well. But I don't know the functions off the top of my head and don't have time to look them up.

Zan Lynx