I have several questions about the mmap implementation on Linux systems that don't seem to be well documented:

When mapping a file into memory with mmap, how would you prefetch the data in that file?

I.e., what happens when you read data from the mmapped region? Is that data moved to the L1/L2 caches? Is it read directly from the disk cache? Do prefetchnta and similar assembly instructions work on mmapped regions?

What's the overhead of the actual mmap call? Is it proportional to the amount of mapped data, or constant?

Hope somebody has some insight into this. Thanks in advance.

+12  A: 

mmap is basically programmatic access to the virtual memory subsystem.

When you have, say, a 1G file and you mmap it, you get a pointer to "the entire" file as if it were in memory.

However, at this stage nothing has happened save the actual mapping operation of reserving pages for the file in the VM. (The larger the file, the longer the mapping operation, of course.)

In order to start reading data from the file, you simply access it through the pointer you were returned in the mmap call.

If you wish to "preload" parts of the file, just visit the area you'd like to preload. Make sure you visit ALL of the pages you want to load, since the VM will only load the pages you access. For example, say within your 1G file, you have a 10MB "index" area that you'd like to map in. The simplest way would be to just "walk your index", or whatever data structure you have, letting the VM page in data as necessary. Or, if you "know" that it's the "first 10MB" of the file, and that your page size for your VM is, say, 4K, then you can just cast the mmap pointer to a char pointer, and just iterate through the pages.

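void load_mmap(char *mmapPtr) {
    // Touch one byte in each 4K page of the first 10MB of the mapping
    for (int offset = 0; offset < 10 * 1024 * 1024; offset += 4 * 1024) {
        char *p = mmapPtr + offset;
        // Dereference the pointer to force the page to be loaded; storing
        // into a volatile keeps the compiler from optimizing the read away
        volatile char c = *p;
        (void)c;
    }
}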

As for the L1 and L2 caches, mmap has nothing to do with them; that's all about how you access the data.

Since you're using the underlying VM system, anything that addresses data within the mmap'd block will work (even from assembly).

If you don't change any of the mmap'd data, the VM will automatically discard old pages as new pages are needed. If you actually do change them, then the VM will write those pages back for you.

Will Hartung
Wouldn't "char c = *p" be optimized away? Should c be declared volatile?
Laurynas Biveinis
+1  A: 

It has nothing to do with the CPU caches; mmap maps the file into the virtual address space, and when the mapping is subsequently accessed, or locked with mlock(), the data is brought physically into memory. Which CPU caches it is or isn't in is nothing you really have control over (at least, not via mmap).

Normally you have to touch the pages to get them paged in, but an mlock or mlockall would have the same effect (these are usually privileged).
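
A minimal sketch of the mlock route, assuming a Linux system. The helper name map_and_lock and its out-parameters are made up for illustration. Because mlock can be refused for unprivileged processes, the sketch reports whether the lock succeeded instead of treating a refusal as fatal.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and try to lock it into RAM.
 * Returns the mapping (NULL on failure) and stores its length in *len.
 * *locked is set to 1 if mlock succeeded and 0 if it was refused. */
void *map_and_lock(const char *path, size_t *len, int *locked)
{
    assert(path != NULL && len != NULL && locked != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    /* On success every page of the range is resident when mlock returns. */
    *locked = (mlock(p, *len) == 0);
    return p;
}
```

If *locked comes back 0, the mapping is still usable; the pages are simply faulted in on first access as usual.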

As far as the overhead is concerned, I don't really know; you'd have to measure it. My guess is that an mmap() which doesn't load pages in is more or less a constant-time operation, but bringing the pages in will take longer with more pages.

Recent versions of Linux also support a flag, MAP_POPULATE, which instructs mmap to load the pages in immediately (presumably only if possible).
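
A rough sketch of what using MAP_POPULATE might look like, assuming Linux with glibc. The helper name map_prefaulted is invented for illustration. Since the populate step is best effort, a successful return should not be read as a guarantee that every page is resident.

```c
#define _GNU_SOURCE                 /* exposes MAP_POPULATE on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and ask the kernel to
 * pre-fault the pages during the mmap call itself.
 * Returns the mapping (NULL on failure) and stores its length in *len. */
void *map_prefaulted(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                   MAP_PRIVATE | MAP_POPULATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

With this flag the cost of reading the file is paid inside mmap rather than on first access, so the call itself is expected to take longer for larger files.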

MarkR
A: 

One more question regarding mmap(): I would like to share a memory-mapped file between two different processes. How do I do that?

Just pass the same file descriptor to mmap and you have a shared memory block.
Jiri
+2  A: 

Answering Mr. Ravi Phulsundar's question:

Multiple processes can map the same file as long as the permissions are set correctly. Looking at the mmap man page, just pass the MAP_SHARED flag (if you need to map a really large file on a 32-bit system, look at mmap2 instead):

mmap

MAP_SHARED

Share this mapping with all other processes that map this object. Storing to the region is equivalent to writing to the file. The file may not actually be updated until msync(2) or munmap(2) are called.
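
To make that concrete, here is a small sketch in which a parent and a forked child each map the same file with MAP_SHARED, assuming a POSIX system. The helper names map_shared_int and shared_roundtrip are invented for illustration, and fork is used only to get two processes into one example. Two unrelated programs that open the same path would behave the same way.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) `path`, size it to hold
 * one int, and map it MAP_SHARED. Returns NULL on failure. */
static int *map_shared_int(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(int)) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : (int *)p;
}

/* Parent and child map the file independently. The child stores `value`;
 * the parent returns what it reads afterwards, or -1 on error. */
int shared_roundtrip(const char *path, int value)
{
    assert(path != NULL);

    int *mine = map_shared_int(path);
    if (mine == NULL)
        return -1;
    *mine = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mine, sizeof(int));
        return -1;
    }
    if (pid == 0) {                 /* child: its own, separate mapping */
        int *theirs = map_shared_int(path);
        if (theirs == NULL)
            _exit(1);
        *theirs = value;
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    int seen = *mine;               /* the child's store is visible here */
    munmap(mine, sizeof(int));
    unlink(path);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? seen : -1;
}
```

With MAP_PRIVATE in place of MAP_SHARED, the child's store would go to a private copy-on-write page and the parent would still read 0.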

Robert S. Barnes