views: 1405
answers: 8

I'm trying to optimize handling of large datasets using mmap. A dataset is in the gigabyte range. The idea was to mmap the whole file into memory, allowing multiple processes to work on the dataset concurrently (read-only). It isn't working as expected though.

As a simple test I mmap the file (using Perl's Sys::Mmap module, specifically the "mmap" sub, which I believe maps directly to the underlying C function) and have the process sleep. When doing this, the code spends more than a minute before returning from the mmap call, despite the test doing nothing - not even a read - from the mmap'ed file.

Guessing, I thought maybe Linux required the whole file to be read when first mmap'ed, so after the file had been mapped in the first process (while it was sleeping), I invoked a simple test in another process which tried to read the first few megabytes of the file.

Surprisingly, it seems the second process also spends a lot of time before returning from the mmap call, about the same time as mmap'ing the file the first time.

I've made sure that MAP_SHARED is being used and that the process that mapped the file the first time is still active (that it has not terminated, and that the mmap hasn't been unmapped).

I expected an mmap'ed file would allow me to give multiple worker processes effective random access to the large file, but if every mmap call requires reading the whole file first, it's a bit harder. I haven't tested whether access is fast after the initial delay in long-running processes, but I expected that using MAP_SHARED from another, separate process would be sufficient.

My theory was that mmap would return more or less immediately, and that Linux would load the blocks more or less on demand, but the behaviour I am seeing is the opposite, indicating that it requires reading through the whole file on each call to mmap.

Any idea what I'm doing wrong, or if I've completely misunderstood how mmap is supposed to work?

A: 

That does sound surprising. Why not try a pure C version?

Or try your code on a different OS/perl version.

Rhythmic Fistman
I've looked at the Perl OS interface, and it calls the C version more or less directly, but unless I figure it out I will probably test a C version as well. As for OS/Perl version, I've tested on two systems, both x86_64. One is Ubuntu 8.04.2 (Linux 2.6.24-22, Perl 5.8.8) and the other Ubuntu 9.04 (Linux 2.6.28-13, Perl 5.10.0). Same behaviour. The second system was a laptop, and I can definitely confirm that there is serious disk I/O involved when mmap is called from my tests.
Marius Kjeldahl
+7  A: 

If you have a relatively recent version of Perl, you shouldn't be using Sys::Mmap. You should be using PerlIO's mmap layer.

Can you post the code you are using?
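For example, a minimal sketch of the :mmap layer (assuming a perl where that layer is available; the file name is illustrative):

#!/usr/bin/perl
use strict; use warnings;

# Open through the PerlIO mmap layer; reads then go through normal
# file operations, with the data paged in via mmap behind the scenes.
open (my $fh, "<:mmap", "test.bin")
    || die "open: $!";

read ($fh, my $buf, 1024)
    || die "read: $!";
print length ($buf), " bytes read\n";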

Chas. Owens
Agreed, the PerlIO mmap layer is probably preferable, as it would also allow the same code to run with or without mmap'ing by simply adding or removing the mmap attribute. Regardless, I found the problem, posted the code, problem solved.
Marius Kjeldahl
Make that problem solved up to 2 GB. For larger files Perl still has problems; see my other answer related to this.
Marius Kjeldahl
+10  A: 

Ok, found the problem. As suspected, neither Linux nor Perl was to blame. To open and access the file I do something like this:

#!/usr/bin/perl
# Create 1 GB file if you do not have one:
# dd if=/dev/urandom of=test.bin bs=1048576 count=1000
use strict; use warnings;
use Sys::Mmap;

open (my $fh, "<test.bin")
    || die "open: $!";

my $t = time;
print STDERR "mmapping.. ";
mmap (my $mh, 0, PROT_READ, MAP_SHARED, $fh)
    || die "mmap: $!";
my $str = unpack ("A1024", substr ($mh, 0, 1024));
print STDERR " ", time-$t, " seconds\nsleeping..";

sleep (60*60);

If you test that code, there are no delays like those I found in my original code, and after creating the minimal sample (always do that, right?) the reason suddenly became obvious.

The error was that, in my code, I treated the $mh scalar as a handle, something which is lightweight and can be moved around easily (read: passed by value). Turns out it's actually a gigabyte-long string, definitely not something you want to move around without creating an explicit reference (Perl lingo for a "pointer"/handle value). So if you need to store it in a hash or similar, make sure you store \$mh, and dereference it when you need to use it, like ${$hash->{mh}}, typically as the first parameter to substr or similar.
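
For illustration, a minimal sketch of keeping a reference to the mapped scalar instead of copying it (the hash key is hypothetical):

# Store a reference to the mapped scalar, not the scalar itself;
# assigning $mh anywhere would copy the whole gigabyte-sized string.
my $worker = { mh => \$mh };

# Dereference it when reading, e.g. the first kilobyte:
my $str = unpack ("A1024", substr (${$worker->{mh}}, 0, 1024));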

Marius Kjeldahl
+1 for following up with a detailed explanation.
RichieHindle
Use the 3-arg form of open().
Brad Gilbert
+2  A: 

On 32-bit systems the address space available for mmap()s is rather limited (and varies from OS to OS). Be aware of that if you're using multi-gigabyte files and you are only testing on a 64-bit system. (I would have preferred to write this in a comment, but I don't have enough reputation points yet.)

knweiss
+1. Looks like a valid answer which addresses the asked question to me, so thank you for not posting it as a comment.
Dave Sherohman
As I've posted in my other answer, even on 64-bit systems there are still problems for larger files (>2 GB). Your answer is correct though. I'm already 64-bit on all my machines, even the laptop, so it's not an issue for me.
Marius Kjeldahl
A: 

See Wide Finder for Perl performance with mmap. But there is one big pitfall: if your dataset is on a classical HD and you read from multiple processes, you can easily fall into random access and your I/O can drop to unacceptable levels (20-40 times slower).

Hynek -Pichi- Vychodil
What I am trying to do is random access by design, from multiple processes, making sure only the parts of the file most often accessed remain in memory at all times. What pattern would you suggest if random access from multiple processes and a huge file is required?
Marius Kjeldahl
If you *really* need random access to a huge file, there is no better solution.
Hynek -Pichi- Vychodil
+1  A: 

One thing that can help performance is the use of madvise(2), probably most easily done via Inline::C. madvise lets you tell the kernel what your access pattern will be like (e.g. sequential, random, etc.).
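
For illustration, a rough sketch of one way to do this with Inline::C (the helper name is made up, and it assumes the scalar's string buffer points at the page-aligned start of the Sys::Mmap mapping):

use strict; use warnings;
use Sys::Mmap;
use Inline C => <<'END_C';
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel the buffer backing the scalar
   (assumed to be the mmap'ed region) will be accessed randomly. */
int advise_random(SV *sv) {
    STRLEN len;
    char *addr = SvPV(sv, len);
    return madvise(addr, len, MADV_RANDOM);
}
END_C

open (my $fh, "<", "test.bin") || die "open: $!";
mmap (my $mh, 0, PROT_READ, MAP_SHARED, $fh) || die "mmap: $!";
advise_random ($mh) == 0 || warn "madvise failed: $!";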

A: 

Ok, here's another update. Using Sys::Mmap or PerlIO's ":mmap" attribute both work fine in Perl, but only up to 2 GB files (the magic 32-bit limit). Once the file is more than 2 GB, the following problems appear:

Using Sys::Mmap and substr for accessing the file, it seems that substr only accepts a 32-bit int for the position parameter, even on systems where Perl supports 64-bit integers. There's at least one bug posted about it:

#62646: Maximum string length with substr

Using open(my $fh, "<:mmap", "bigfile.bin"), once the file is larger than 2 GB, it seems Perl will either hang or insist on reading the whole file on the first read (not sure which; I never ran it long enough to see if it completed), leading to dead slow performance.

I haven't found any workaround to either of these, and I'm currently stuck with slow (non-mmap'ed) file operations for working on these files. Unless I find a workaround, I may have to implement the processing in C or another language with better support for mmap'ing huge files.

Marius Kjeldahl
Try using mmap from Sys::Mmap directly to create a sliding window in the scalar.
Chas. Owens
Thanks, that's certainly a workaround. It would necessitate keeping track of the pointer into the file and map/unmapping when necessary, which probably affects performance. But it's probably still faster than straight file IO.
Marius Kjeldahl
Did some benchmarking, and it confirms that with dynamic mapping/unmapping using a segment size of 2 GB, and assuming segment switches are fairly infrequent, mmap with unmapping/remapping is some 30-40% faster than straight file I/O on a 3 GB file. On a 2 GB file the differences are smaller, but I suspect that's because my laptop cached most of the file during the random accesses anyway. So at least I have a solution that works, although not as cleanly as I would have hoped. No need for further optimization at this stage though.
Marius Kjeldahl
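For reference, a rough sketch of the sliding-window approach suggested above (the window size, names and usage are illustrative, it assumes a 64-bit perl, and requests straddling a window edge or the end of the file would need extra handling):

use strict; use warnings;
use Sys::Mmap;

# A 1 GB window keeps every substr offset well below the 32-bit limit.
use constant WINDOW => 1024 * 1024 * 1024;

my ($win_start, $win) = (-1, undef);

# Hypothetical helper: return $len bytes starting at absolute $pos,
# remapping the window only when $pos falls outside the current one.
sub read_at {
    my ($fh, $pos, $len) = @_;
    my $start = $pos - ($pos % WINDOW);   # window-aligned, so page-aligned
    if ($start != $win_start) {
        munmap ($win) if defined $win;    # drop the old mapping
        mmap ($win, WINDOW, PROT_READ, MAP_SHARED, $fh, $start)
            || die "mmap: $!";
        $win_start = $start;
    }
    return substr ($win, $pos - $win_start, $len);
}

open (my $fh, "<", "bigfile.bin") || die "open: $!";
my $chunk = read_at ($fh, 2_500_000_000, 1024);   # read past the 2 GB mark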
A: 

If I may plug my own module: I'd advise using File::Map instead of Sys::Mmap. It's much easier to use, and is less crash-prone than Sys::Mmap.
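
For reference, a minimal sketch of the File::Map interface as I understand its documented map_file/advise functions (the file name is illustrative):

use strict; use warnings;
use File::Map qw(map_file advise);

# Map the whole file read-only; $map then behaves like a read-only
# string backed by the mapping.
map_file my $map, "bigfile.bin";
advise $map, "random";   # hint the kernel about the access pattern

my $chunk = substr ($map, 0, 1024);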

Leon Timmermans
Here's a suggestion for a new, very useful feature, based on my observations of Perl described in this thread (memory-mapped files only working up to 2 GB): if the user maps a file larger than 2 GB, use a segmented approach with a "custom" read function that automatically unmaps/remaps as necessary. At least until the 2 GB Perl "bug" is fixed.
Marius Kjeldahl