I'm looking at memory-mapped files in Java. Let's say I have the heap size set to 2 GB, and I memory-map a file that is 50 GB - far more than the physical memory on the machine. The OS will cache parts of that 50 GB file in the OS file cache, and the Java process will have 2 GB of heap space. What I'm curious about is: how does the OS decide how much of the 50 GB file to cache?

For instance, if I have another Java process, also with a 2 GB heap, will that 2 GB be swapped out to allow the OS to cache parts of the memory-mapped file? Will parts of the heap space of the first process be swapped out to allow the OS to cache?

Is there any way to tell the OS not to swap heap space for OS caching? If the OS doesn't swap out main processes, how does it determine how big its file cache should be?

A: 

Linux doesn't really distinguish between anonymous and memory-mapped pages. They all get demand-loaded anyway via page faults.

You can think of anonymous memory as if it was a private memory mapping of /dev/zero.

So you can map as much of anything as you want (address space permitting, but I assume you are on a 64-bit box here). Linux only loads pages in when the process touches them, via page faults.
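To make the demand-loading point concrete from the Java side, here is a minimal sketch. It assumes a 4 KB page size and a filesystem that supports sparse files; the class name `MmapSketch` and the method `mapAndTouch` are made up for illustration. Note that a single `FileChannel.map` call is limited to `Integer.MAX_VALUE` bytes, so a 50 GB file would have to be mapped in several windows.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapSketch {
    // Assumed page size; the real value is platform dependent.
    static final int PAGE = 4096;

    /**
     * Maps a temporary file of `size` bytes (at most Integer.MAX_VALUE) and
     * reads one byte from each of the first `pagesToTouch` pages.
     * Returns the number of pages actually touched.
     */
    static int mapAndTouch(long size, int pagesToTouch) {
        try {
            Path file = Files.createTempFile("mmap-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping reserves address space only; the file is extended
                // to `size`, but no page is loaded until it is accessed.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
                int touched = 0;
                for (int i = 0; i < pagesToTouch && (long) i * PAGE < size; i++) {
                    buf.get(i * PAGE); // first access to a page triggers a page fault
                    touched++;
                }
                return touched;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Map 256 MB but touch only 16 pages of it.
        int touched = mapAndTouch(256L * 1024 * 1024, 16);
        System.out.println("pages touched: " + touched);
    }
}
```

Only the touched pages should end up resident; on Linux you could check this by watching the process's resident set size while it runs.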

Likewise, it keeps some records of how recently pages have been used, so the least recently used ones get prioritised for being thrown away when memory is needed.

If your file mapping is a MAP_SHARED one, the only difference is that pages which are discarded to get more space for other things don't have to be written into a swap area: modified pages are written back to the original file, and any page can just be read back in from that file later.
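In Java terms, and as far as I know on Linux with OpenJDK, `FileChannel.MapMode.READ_WRITE` gives you a shared mapping and `FileChannel.MapMode.PRIVATE` a private copy-on-write one. The sketch below is only an illustration (the class `MapModes` and method `writeThenReadFile` are hypothetical names): it writes a byte through a mapping and then reads the file through the channel to see whether the write reached it. The Java API does not strictly guarantee coherence between a mapping and channel reads, so treat the result as what you would typically observe on Linux.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModes {
    /**
     * Writes the byte 42 at offset 0 through a mapping in the given mode,
     * then returns the byte the file itself holds at offset 0.
     */
    static byte writeThenReadFile(FileChannel.MapMode mode) {
        try {
            Path file = Files.createTempFile("map-modes", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[4096]); // one zero-filled page
            try (FileChannel ch = FileChannel.open(
                    file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer buf = ch.map(mode, 0, 4096);
                buf.put(0, (byte) 42);
                // Flushes a READ_WRITE mapping; has no effect for PRIVATE.
                buf.force();
                ByteBuffer check = ByteBuffer.allocate(1);
                ch.read(check, 0); // read the file contents, not the mapping
                return check.get(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("READ_WRITE: " + writeThenReadFile(FileChannel.MapMode.READ_WRITE));
        System.out.println("PRIVATE:    " + writeThenReadFile(FileChannel.MapMode.PRIVATE));
    }
}
```

With the shared mode the file backs the modified page, which is why such pages never need swap space; with the private mode the modified page has no file behind it and behaves like anonymous memory.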

So in answer to your question, no, mapping a large file will not take virtual memory away from anyone, provided you don't read or write the pages.

MarkR