Linux uses COW to keep memory usage low after a fork, but the way perl stores Perl 5 variables seems to defeat this optimization. For instance, for the variable:

my $s = "1";

perl is really storing:

SV = PV(0x100801068) at 0x1008272e8
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x100201d50 "1"\0
  CUR = 1
  LEN = 16

When you use that string in a numeric context, perl modifies the C struct representing the data:

SV = PVIV(0x100821610) at 0x1008272e8
  REFCNT = 1
  FLAGS = (IOK,POK,pIOK,pPOK)
  IV = 1
  PV = 0x100201d50 "1"\0
  CUR = 1
  LEN = 16

The string pointer itself did not change (it is still 0x100201d50), but it now lives in a different C struct (a PVIV instead of a PV). I did not modify the value at all, yet I am suddenly paying a COW cost. Is there any way to lock the perl representation of a Perl 5 variable so that this time-saving hack (perl doesn't have to convert "0" to 0 a second time) doesn't hurt my memory usage?

Note, the representations above were generated from this code:

perl -MDevel::Peek -e '$s = "1"; Dump $s; $s + 0; Dump $s'
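
The same upgrade can be observed programmatically rather than by reading `Dump` output. The following is a minimal sketch, assuming the core `B` module is available; the helper name `sv_class` is made up for illustration. On the perl used above it would be expected to report the change from PV to PVIV, though the exact classes can vary between perl versions.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: report the internal SV class (PV, PVIV, ...)
# of the scalar behind a reference, without the "B::" prefix.
sub sv_class {
    my ($ref) = @_;
    return B::class(B::svref_2object($ref));
}

my $s = "1";
my $before = sv_class(\$s);

my $n = $s + 0;    # numeric context; assigned so there is no void-context warning

my $after = sv_class(\$s);

print "before: $before, after: $after\n";
```

If the two classes differ, perl has written to the scalar's structure even though the Perl-level value is unchanged, and that write is what dirties the page.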
+3  A: 

The only solution I have found so far is to force perl to do all of the conversions I expect in the parent process. And as you can see from the results below, even that only helps a little.

Results:

Useless use of addition (+) in void context at z.pl line 34.
Useless use of addition (+) in void context at z.pl line 45.
Useless use of addition (+) in void context at z.pl line 51.
before eating memory
used memory: 71
after eating memory
used memory: 119
after 100 forks that don't reference variable
used memory: 144
after children are reaped
used memory: 93
after 100 forks that touch the variables metadata
used memory: 707
after children are reaped
used memory: 93
after parent has updated the metadata
used memory: 109
after 100 forks that touch the variables metadata
used memory: 443
after children are reaped
used memory: 109

Code:

#!/usr/bin/perl

use strict;
use warnings;

use Parallel::ForkManager;

sub print_mem {
    print @_, "used memory: ", `free -m` =~ m{cache:\s+([0-9]+)}s, "\n";
}

print_mem("before eating memory\n");

my @big = ("1") x (1_024 * 1024);

my $pm = Parallel::ForkManager->new(100);

print_mem("after eating memory\n");

for (1 .. 100) {
    next if $pm->start;
    sleep 2;
    $pm->finish;
}

print_mem("after 100 forks that don't reference variable\n");

$pm->wait_all_children;

print_mem("after children are reaped\n");

for (1 .. 100) {
    next if $pm->start;
    $_ + 0 for @big; #force an update to the metadata
    sleep 2;
    $pm->finish;
}

print_mem("after 100 forks that touch the variables metadata\n");

$pm->wait_all_children;

print_mem("after children are reaped\n");

$_ + 0 for @big; #force an update to the metadata

print_mem("after parent has updated the metadata\n");

for (1 .. 100) {
    next if $pm->start;
    $_ + 0 for @big; #force an update to the metadata
    sleep 2;
    $pm->finish;
}

print_mem("after 100 forks that touch the variables metadata\n");

$pm->wait_all_children;

print_mem("after children are reaped\n");
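
The "convert in the parent" step can also be wrapped in a small helper. This is only a sketch of the idea, not part of the benchmark above: `prewarm_numeric` is a hypothetical name, and it assumes every value will later be used as a number. Assigning the result to a lexical avoids the "Useless use of addition" warnings seen in the results.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: numify every element once, in the parent,
# so that the same numeric use in a child finds the cached number.
sub prewarm_numeric {
    my ($values) = @_;
    no warnings 'numeric';    # tolerate non-numeric strings
    for (@$values) {
        my $n = $_ + 0;       # $_ aliases the element, so the element itself is upgraded
    }
    return scalar @$values;
}

my @config = ("1", "42", "7");
prewarm_numeric(\@config);

# Record the internal class of each element after pre-warming.
my @classes = map { B::class(B::svref_2object(\$_)) } @config;
print "classes after pre-warming: @classes\n";
```

As the numbers above show, this reduces the copying but does not remove it.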
Chas. Owens
Nice, but do you know what will happen when you allocate a few hundred MB of data, fork a few children, and those children exit? The GC will kill you anyway. It's a sad story, but Perl is just the wrong tool for this sort of job. We partially solved it using the END { kill 9, $$ } approach, but at this point you should look for a better tool ;-)
Hynek -Pichi- Vychodil
The GC doesn't bother me; the real code is `mod_perl` based and each child is reused many times. The problem is that the config data that is loaded in the parent is being copied into each of the possibly hundreds of children even though the children never modify it (from Perl 5's perspective; `perl` is mucking about with the metadata). The other solution I have considered is moving the config data out into a separate process and letting the children talk to it over domain sockets.
Chas. Owens
You can also use shared memory which will be faster.
Hynek -Pichi- Vychodil
@Hynek If the global destruction phase is bothering you, you're far better off calling `exec '/bin/true'` or using `POSIX::_exit()`. Both are documented ways to skip object destruction.
Ven'Tatsu
@Ven'Tatsu where is it documented that those skip the GC phase?
Chas. Owens
@Chas in perlfunc under exec: "Note that exec will not call your END blocks, nor will it call any DESTROY methods in your objects." In POSIX: "It exits the program immediately which means among other things buffered I/O is not flushed." And in perlfunc under exit: "The exit() function does not always exit immediately. It calls any defined END routines first, but these END routines may not themselves abort the exit. Likewise any object destructors that need to be called are called before the real exit. If this is a problem, you can call POSIX::_exit($status) to avoid END and destructor processing"
Ven'Tatsu
@Ven'Tatsu thanks.
Chas. Owens
+2  A: 

Even if you avoid COW at startup and during the run, you should not forget the END phase of the process lifetime. Shutdown has two GC phases, and the first one updates ref counts, so it can still kill you. You can solve it in an ugly way:

END { kill 9, $$ }
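
A gentler variant of the same idea, based on the `POSIX::_exit()` suggestion from the comments on the first answer, is sketched below. The wrapper name `run_in_child` is hypothetical. Note that `POSIX::_exit()` does not flush buffered I/O, so the child must flush anything it wants to keep.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical wrapper: run $work in a forked child, then leave the child
# with POSIX::_exit so END blocks and destructors are skipped.
sub run_in_child {
    my ($work) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        my $ok = eval { $work->(); 1 };
        POSIX::_exit($ok ? 0 : 1);    # no END blocks, no DESTROY, no I/O flush
    }

    waitpid($pid, 0);
    return $? >> 8;                   # exit status of the child
}

my $status = run_in_child(sub {
    my $sum = 0;
    $sum += $_ for 1 .. 10;
});
print "child exit status: $status\n";
```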
Hynek -Pichi- Vychodil
Your answer does not make much sense.
Ether
@Ether at the end of the program all variables will get their `REFCNT` fields decremented by the GC. This will cause a sudden spike in memory usage as all of the variables are pulled into the child process. The code he is suggesting will cause the child to die before the garbage collection phase starts.
Chas. Owens
+2  A: 

This goes without saying, but COW doesn't happen on a per-struct basis, but on a memory page basis. So it's enough that one thing in an entire memory page be modified like this for you to pay the copying cost.

On Linux you can query the page size like this:

getconf PAGESIZE

On my system that's 4096 bytes. You can fit a lot of Perl scalar structs in that space. If any one of them gets modified, Linux will have to copy the entire page.

This is why using memory arenas is a good idea in general. You should separate your mutable and immutable data so that you won't have to pay COW costs for immutable data just because it happened to reside in the same memory page as mutable data.
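
To get a rough feel for the page arithmetic from inside Perl, here is a small sketch. The 24 bytes per scalar head is purely an assumed figure for illustration (the real size depends on the perl build), and `pages_spanned` is a hypothetical helper.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_PAGESIZE);

# ASSUMPTION: illustrative size of one scalar head; not a measured value.
my $assumed_head_bytes = 24;

# Hypothetical helper: number of pages needed to hold $count items
# of $bytes_each bytes, rounded up to whole pages.
sub pages_spanned {
    my ($count, $bytes_each, $page_bytes) = @_;
    return int(($count * $bytes_each + $page_bytes - 1) / $page_bytes);
}

my $page_bytes     = sysconf(_SC_PAGESIZE);
my $heads_per_page = int($page_bytes / $assumed_head_bytes);
my $pages          = pages_spanned(1_024 * 1024, $assumed_head_bytes, $page_bytes);

print "page size: $page_bytes bytes\n";
print "scalar heads per page (assumed size): $heads_per_page\n";
print "pages spanned by 1_024 * 1024 heads: $pages\n";
```

Under these assumptions, touching a single scalar per page is enough to make the child copy every one of those pages.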

Ævar Arnfjörð Bjarmason
The problem is that `perl` updates the `struct` even though I am not changing any of the data I care about, so I can't separate my immutable data (because there is no immutable data).
Chas. Owens
Also, there isn't an easy way to separate mutable from immutable data (even if there were any).
Ether