Is there a way to make the OOM killer work and prevent Linux from freezing? I've been running Java and C# applications, where any memory allocated is usually actually used, and (if I'm understanding it right) overcommit is causing the machine to freeze. Right now, as a temporary solution, I added

vm.overcommit_memory = 2
vm.overcommit_ratio = 10

to /etc/sysctl.conf.
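For reference: `overcommit_memory=2` disables overcommit entirely, and `overcommit_ratio` caps the commit limit at swap plus that percentage of RAM, so 10 is very restrictive. A sketch of inspecting and applying these settings without a reboot (the commented `sysctl -w` lines need root):

```shell
# Inspect the current overcommit policy (readable by any user):
cat /proc/sys/vm/overcommit_memory   # 0 = heuristic, 1 = always, 2 = never
cat /proc/sys/vm/overcommit_ratio    # percent of RAM counted toward the commit limit
# Apply the settings above without a reboot (root only):
# sysctl -w vm.overcommit_memory=2
# sysctl -w vm.overcommit_ratio=10
# Or re-read /etc/sysctl.conf after editing it:
# sysctl -p
```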

Kudos to anyone who can explain why the existing OOM killer can't be guaranteed to work correctly, killing processes whenever the kernel runs out of "real" memory.

EDIT -- many responses are along the lines of Michael's "if you are experiencing OOM killer related problems, then you probably need to fix whatever is causing you to run out of memory". I don't think this is the correct solution. There will always be apps with bugs, and I'd like to adjust the kernel so my entire system doesn't freeze. Given my current technical understanding, that doesn't seem like it should be impossible.

A: 

If your process's oom_adj is set to -17, it won't be considered for killing, although I doubt that's the issue here.

cat /proc/<pid>/oom_adj

will tell you the oom_adj value for your process(es).
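A sketch of reading and setting it (`$PID` is a placeholder for your process id; note that oom_adj is the legacy interface, and newer kernels prefer oom_score_adj):

```shell
# Read the OOM adjustment of the current shell (any readable pid works):
cat /proc/self/oom_adj        # legacy range: -17 (never kill) to 15
cat /proc/self/oom_score_adj  # modern range: -1000 to 1000
# Exempt a process from the OOM killer (root only; $PID is a placeholder):
# echo -17 > /proc/$PID/oom_adj
```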

Webinator
+1  A: 

I'd have to say the best way of preventing OOM freezes is to not run out of virtual memory. If you are regularly running out of virtual memory, or getting close, then you have bigger problems.

Most tasks don't handle failed memory allocations very well so tend to crash or lose data. Running out of virtual memory (with or without overcommit) will cause some allocations to fail. This is usually bad.

Moreover, before your OS runs out of virtual memory, it will start doing bad things like discarding pages from commonly used shared libraries; those pages then have to be pulled back in constantly, which makes performance suck.

My suggestions:

  • Get more RAM
  • Run fewer processes
  • Make the processes you do run use less memory (this may include fixing memory leaks in them)

And possibly also

  • Set up more swap space

if that is helpful in your use case.
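A sketch of adding a swap file, in case that route helps (the commented steps all require root; the size is a placeholder):

```shell
# Add a 1 GiB swap file (root only):
# dd if=/dev/zero of=/swapfile bs=1M count=1024
# chmod 600 /swapfile
# mkswap /swapfile
# swapon /swapfile
# echo '/swapfile none swap sw 0 0' >> /etc/fstab   # persist across reboots
# Check which swap areas are currently active:
cat /proc/swaps
```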

Most multi-process servers run a configurable (maximum) number of processes, so you can typically tune it downwards. Multithreaded servers typically allow you to configure how much memory to use for their buffers etc internally.
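As one example of capping a multi-process server's footprint, Apache's prefork MPM lets you bound the worker count (directive names as in Apache 2.x; the numbers are placeholders you'd tune for your RAM):

```apache
# In apache2.conf / httpd.conf, prefork MPM:
<IfModule mpm_prefork_module>
    StartServers          2
    MinSpareServers       2
    MaxSpareServers       5
    MaxClients           50   # hard cap on worker processes
    MaxRequestsPerChild 1000  # recycle workers to bound leak growth
</IfModule>
```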

MarkR
+2  A: 

Hi there,

Below is a really basic Perl script I wrote. With a bit of tweaking it could be useful: just change the paths I've used to the paths of whatever Java or C# processes you run, and you could change the kill commands to restart commands as well. To avoid typing perl memusage.pl manually, you could put it in your crontab file to run automatically, and use perl memusage.pl > log.txt to save its output to a log file. Sorry if it doesn't really help, but I was bored while drinking a cup of coffee. :-D Cheers

#!/usr/bin/perl -w
# Checks free memory and converts it to MB.
# If free memory falls below the minimum level specified, the
# script attempts to shut the troublesome processes down
# cleanly; if it can't, it issues a -9 KILL signal.
#
# Uses external commands (cat and pidof)
#
# Cheers, insertable

our $memmin = 50;   # minimum free memory, in MB
our @procs  = qw(/usr/bin/firefox /usr/local/sbin/apache2);

sub killProcs
{
    my @pids = ();
    foreach my $proc (@procs)
    {
        # Strip the directory part to get the bare executable name
        my $filename = substr($proc, rindex($proc, "/") + 1);
        my $pid = `pidof $filename`;
        chomp($pid);
        my @pid = split(/ /, $pid);
        push @pids, $pid[0];
    }
    foreach my $pid (@pids)
    {
        # Try to terminate the process gracefully first
        system("kill -15 " . $pid);
        print "Killing " . $pid . "\n";
        sleep 1;
        if (-e "/proc/$pid")
        {
            print $pid . " is still alive! Issuing a -9 KILL...\n";
            system("kill -9 " . $pid);
            print "Done.\n";
        } else {
            print "Looks like " . $pid . " is dead\n";
        }
    }
    print "Successfully finished destroying memory-hogging processes!\n";
    exit(0);
}

sub checkMem
{
    my ($free) = $_[0];
    if ($free > $memmin)
    {
        print "Memory usage is OK\n";
        exit(0);
    } else {
        killProcs();
    }
}

sub main
{
    my $meminfo = `cat /proc/meminfo`;
    chomp($meminfo);
    my @meminfo = split(/\n/, $meminfo);
    foreach my $line (@meminfo)
    {
        # Note: MemFree excludes memory the kernel could reclaim
        # (buffers/cache), so this is a conservative measure.
        if ($line =~ /^MemFree:\s+(.+)\skB$/)
        {
            my $free = ($1 / 1024);
            checkMem($free);
        }
    }
}

main();
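The crontab approach mentioned above could look like this (paths are placeholders; this runs every minute and appends output to a log):

```shell
# crontab -e entry: check memory every minute
# * * * * * /usr/bin/perl /path/to/memusage.pl >> /var/log/memusage.log 2>&1
```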
Not bad, but maybe not so reliable. Maybe hard ulimits would work? I can't seem to get them to, though...
gatoatigrado
Sorry, but what did you want to do with hard ulimits? Keep in mind you can only set hard limits as root. There's extra configuration in /etc/security/limits.conf, I believe.
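To make the ulimit idea concrete, a sketch (values are in KiB; an unprivileged shell can lower a hard limit but never raise it back, and /etc/security/limits.conf makes the limit persistent per user):

```shell
# Cap virtual memory for this shell and everything it launches:
ulimit -H -v 2097152   # hard cap: 2 GiB of address space
ulimit -S -v 2097152   # soft limit; must not exceed the hard cap
ulimit -v              # prints the effective soft limit in KiB
# Launch the leaky app under the cap (placeholder command):
# java -jar myapp.jar
# Persistent per-user equivalent in /etc/security/limits.conf:
# someuser  hard  as  2097152
```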
A: 

First off, how can you be sure the freezes are OOM killer related? I've got a network of systems in the field and I get not infrequent freezes, which don't seem to be OOM related (our app is pretty stable in memory usage). Could it be something else? Is there any interesting hardware involved? Any unstable drivers? High performance video?

Even if the OOM killer is involved, and worked, you'd still have problems, because stuff you thought was running is now dead, and who knows what sort of mess it's left behind.

Really, if you are experiencing OOM killer related problems, then you probably need to fix whatever is causing you to run out of memory.

Michael Kohne
Once or twice, I've been able to pull up the system monitor before everything froze.
gatoatigrado