Hello, I wanted to test whether, by using multiple processes, I can use more than 4GB of RAM on a 32-bit OS (mine: Ubuntu with 1GB of RAM).

So I wrote a small program that mallocs slightly less than 1GB and does some work on that array, and I ran 5 instances of this program via fork.

The thing is, I suspect that the OS killed 4 of them, and only one survived and displayed its "PID: I've finished" message.

(I tried it with small arrays and got all 5 printouts. Also, when I look at the running processes with top, I see only one instance.)

The weird thing is this: I received return code 0 (success?) from ALL of the instances, including the ones that were allegedly killed by the OS.

I didn't get any message stating that processes were killed.

Is this return code normal for this situation?

(If so, it reduces my trust in 'return codes'...)

thanks.

Edit: Some of the answers suggested possible errors in the small program, so here it is. The program that forks and saves the return codes is larger, and I have trouble uploading it here, but I think (and hope) it's fine.

Also, I've noticed that if, instead of running it with my forking program, I run it from the terminal using './a.out & ./a.out & ./a.out & ./a.out &' (where ./a.out is the binary of the small program attached), I do see some 'Killed' messages.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>
#define SMALL_SIZE 10000
#define BIG_SIZE 1000000000
#define SIZE BIG_SIZE
#define REAPETS 1

int
main(void)
{
    pid_t my_pid = getpid();

    char * x = malloc(SIZE*sizeof(char));

    if (x == NULL)
    {
            printf("Malloc failed!\n");
            return(EXIT_FAILURE);
    }

    int x2=0;
    for(x2=0;x2<REAPETS;++x2)
    {
            int y;
            for(y=0;y<SIZE;++y)
                    x[y] = (y+my_pid)%256;
    }
    printf("%d: I'm over.\n",my_pid);
    return(EXIT_SUCCESS);
}
+1  A: 

What signal was used to kill the processes?

Exit codes between 0 and 127, inclusive, can be used freely. By shell convention (the value of $?), codes above 128 indicate that the process was terminated by a signal, where the exit code is

128 + the number of the signal used

Mihai Limbășan
Hasturkun
... wish I could upvote comments :)
Mihai Limbășan
+5  A: 

Well, if your process is unable to malloc() the 1GB of memory, the OS will not kill the process. All that happens is that malloc() returns NULL. So depending on how you wrote your code, it's possible that the process could return 0 anyway - if you wanted it to return an error code when a memory allocation fails (which is generally good practice), you'd have to program that behavior into it.

David Zaslavsky
Not entirely true. Linux will let processes over-allocate memory, then kill processes when too much memory is actually in use.
Andrew Medico
Interesting, this is the first I'm hearing that...
David Zaslavsky
But that is still speculation, the only way to know is to use one of the wait system calls like Hasturkun mentioned and check the status as well as the return code.
lothar
Thanks, but I do check for NULL, and print a nice message, which isn't printed.
Liran Orevi
Uploaded the small program if it's of interest.
Liran Orevi
+4  A: 

A process' return status (as returned by wait, waitpid and system) contains more or less the following:

  • Exit code, only applies if the process terminated normally
  • Whether normal/abnormal termination occurred
  • Termination signal, only applies if the process was terminated by a signal

The exit code is utterly meaningless if your process was killed by the OOM killer (which will apparently send you a SIGKILL signal).

For more information, see the wait(2) man page.

Hasturkun
good answer, you may want to add a code sample to make it even better :-)
lothar
+1  A: 

Have you checked the return value from fork()? There's a good chance that if fork() can't allocate enough memory for the new process' address space, then it will return an error (-1). A typical way to call fork() is:

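#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

pid_t pid;
switch(pid = fork())
{
case 0:
    // I'm the child process
    break;
case -1:
    // Error -- check errno
    fprintf(stderr, "fork: %s\n", strerror(errno));
    break;
default:
    // I'm the parent process; pid holds the child's PID
    break;
}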
Adam Rosenfield
Thanks, in my case however, I do check for it.
Liran Orevi
+1  A: 

The exit code is only "valid" when the WIFEXITED macro evaluates to true. See the waitpid(2) man page.

You can use the WIFSIGNALED macro to check whether your program was terminated by a signal.

phjr
+1  A: 

This code shows how to get the termination status of a child:

#include <stdio.h>
#include <stdlib.h>

#include <unistd.h>
#include <sys/wait.h>

int
main (void)
{ 
  pid_t pid = fork();

  if (pid == -1)
  {
    perror("fork()");
  }
  /* parent */
  else if (pid > 0)
  {
    int status;

    printf("Child has pid %ld\n", (long)pid);

    if (wait(&status) == -1)
    {
      perror("wait()");
    }
    else
    {
      /* did the child terminate normally? */
      if(WIFEXITED(status))
      {
        printf("%ld exited with return code %d\n",
               (long)pid, WEXITSTATUS(status));
      }
      /* was the child terminated by a signal? */
      else if (WIFSIGNALED(status))
      {
        printf("%ld terminated because it didn't catch signal number %d\n",
               (long)pid, WTERMSIG(status));
      }
    }
  }
  /* child */
  else
  {
    sleep(10);
    exit(0);
  }

  return 0;
}
Bastien Léonard