I have a bunch of mini-server processes running. They're in the same process group as a FastCGI server I need to stop. The FastCGI server will kill everything in its process group, but I need those mini-servers to keep running.

Can I change the process group of a running, non-child process (they're children of PID 1)? setpgid() fails with "No such process", though I'm positive the process is there.

This is on Fedora Core 10.

Note: the processes are already running. New servers call setsid(). These are servers spawned by older code that did not.

+2  A: 

One thing you could try is to do setsid() in the miniservers. That will make them session and process group leaders.

Also, keep in mind that you can't change the process group id to one from another session, and that you have to do the call to change the process group either from within the process that you want to change the group of, or from the parent of the process.
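As a minimal sketch of the "from the parent" case described above: the parent forks a child and moves it into a new process group whose ID is the child's PID. The name spawn_in_own_group is hypothetical, and pause() merely stands in for real server work. This only illustrates the permitted case; it does not help with processes that are already reparented to PID 1.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and put it in its own process group.
   Returns the child's PID, or -1 on error with errno set. */
pid_t spawn_in_own_group(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: also move ourselves, in case we run before the parent does.
           Doing it on both sides is harmless. */
        setpgid(0, 0);
        pause();                /* stand-in for real server work */
        _exit(0);
    }

    /* Parent: permitted because the target is our own child,
       is in our session, and has not called exec. */
    if (setpgid(child, child) < 0) {
        int saved = errno;
        kill(child, SIGKILL);
        waitpid(child, NULL, 0);
        errno = saved;
        return -1;
    }
    return child;
}
```

After a successful call, getpgid() on the returned PID should equal that PID, so a signal sent to the parent's old process group no longer reaches the child.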

I recently wrote some test code that periodically changes the process group of a set of processes for a very similar task. You don't need to change the group ID periodically; I did that only to evade a script that periodically checked for groups running longer than a certain amount of time. It may also help you track down the error you get from setpgid():

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>

void err(const char *msg);
void prn(const char *msg);
void mydaemon();

int main(int argc, char *argv[]) {

    mydaemon();
    if (setsid() < 0)
        err("setsid");

    int secs = 5*60;

    /* creating a pipe for the group leader to send changed
       group ids to the child */
    int pidx[2];
    if (pipe(pidx))
        err("pipe");

    fcntl(pidx[0], F_SETFL, O_NONBLOCK);
    fcntl(pidx[1], F_SETFL, O_NONBLOCK);

    prn("begin");

    /* here the child forks, it's a stand in for the set of
       processes that need to have their group ids changed */
    int child = fork();
    switch (child) {
    case -1: err("fork3");
    case  0:
        close(pidx[1]);

        while(1) {
            sleep(7);
            secs -= 7;
            if (secs <= 0) { prn("end child"); exit(0); }

            int pid;

            /* read new pid if available */
            if (read(pidx[0], &pid, sizeof pid) != sizeof pid) continue;

            /* set new process group id */
            if (setpgid(getpid(), pid)) err("setpgid2");

            prn("child group changed");
        }
    default: break;
    }

    close(pidx[0]);

    /* here the group leader is forked every 20 seconds so that
       a new process group can be sent to the child via the pipe */
    while (1) {
        sleep(20);

        secs -= 20;

        int pid = fork();
        switch (pid) {
        case -1: err("fork2");
        case  0:
            pid = getpid();

            /* set process group leader for this process */
            if (setpgid(pid, pid)) err("setpgid1");

            /* inform child of change */
            if (write(pidx[1], &pid, sizeof pid) != sizeof pid) err("write");

            prn("group leader changed");
            break;
        default:
            close(pidx[1]);
            _exit(0);
        }

        if (secs <= 0) { prn("end leader"); exit(0); }
    }
}

void prn(const char *msg) {
    char buf[256];
    strcpy(buf, msg);
    strcat(buf, "\n");
    write(2, buf, strlen(buf));
}

void err(const char *msg) {
    char buf[256];
    strcpy(buf, msg);
    strcat(buf, ": ");
    strcat(buf, strerror(errno));
    prn(buf);
    exit(1);
}

void mydaemon() {
    int pid = fork();
    switch (pid) {
      case -1: err("fork");
      case  0: break;
      default: _exit(0);
    }

    close(0);
    close(1);
    /* close(2); let's keep stderr */
}
Inshallah
A: 

It sounds like you actually want to daemonise the process rather than move process groups. (Note: one can move process groups, but I believe you need to be in the same session and the target needs to already be a process group.)

But first, see if daemonising works for you:

#include <unistd.h>
#include <stdio.h>

int main() {
  if (fork() == 0) {
    setsid();
    if (fork() == 0) {
      printf("I'm still running! pid:%d\n", (int)getpid());
      fflush(stdout);  /* _exit() below does not flush stdio buffers */
      sleep(10);
    }
    _exit(0);
  }

  return 0;
}

Obviously you should actually check for errors and such in real code, but the above should work.

The inner process will continue running even when the main process exits. Looking at the status of the inner process from /proc we find that it is, indeed, a child of init:

Name:   a.out
State:  S (sleeping)
Tgid:   21513
Pid:    21513
PPid:   1
TracerPid:      0
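
The same check can be done from code. This is a rough, Linux-only sketch that reads the PPid field from /proc/<pid>/status; read_ppid is a hypothetical helper name, and passing 0 is my own convention for "the calling process".

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the parent PID recorded in
   /proc/<pid>/status, or -1 if the file or field cannot be read.
   A pid of 0 means the calling process (an assumption of this sketch). */
long read_ppid(pid_t pid)
{
    char path[64];
    char line[256];
    long ppid = -1;
    FILE *f;

    if (pid == 0)
        pid = getpid();

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    /* Scan for the line that starts with "PPid:"; the "Pid:" line
       does not match because its second character differs. */
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "PPid: %ld", &ppid) == 1)
            break;
    }

    fclose(f);
    return ppid;
}
```

For the daemonised inner process above, read_ppid(0) would be expected to return 1 once the intermediate parent has exited.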
agl
+1  A: 

After some research I figured it out. Inshallah identified the essential problem, "you can't change the process group id to one from another session", which explains why my setpgid() was failing (with a misleading message). However, it seems you can change it from any other process in the group (not necessarily the parent).

These processes were started by a FastCGI server, and that FastCGI server was still running and in the same process group. Hence the problem: I couldn't restart the FastCGI server without killing the servers it had spawned. I wrote a new CGI program that called setpgid() on the running servers, executed it through a web request, and the problem was solved!
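
For anyone debugging a similar failure, a quick pre-check of the session restriction can be sketched with getsid(). The helper name in_my_session is hypothetical. It only tests the session condition; it does not by itself guarantee that setpgid() will succeed.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `target` is in the caller's session,
   0 if it is in a different session, and -1 if getsid() fails
   (for example, because no process with that PID exists). */
int in_my_session(pid_t target)
{
    pid_t mine = getsid(0);         /* session ID of the caller */
    pid_t theirs = getsid(target);  /* session ID of the target */

    if (mine < 0 || theirs < 0)
        return -1;
    return mine == theirs;
}
```

If this returns 0 for a server, a setpgid() call that tries to move it into one of the caller's process groups cannot work, whatever error string comes back.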

Schwern
Error values in setpgid(2): `ESRCH ... For setpgid([pid, pgid]): pid is not the calling process and not a child of the calling process.` Interesting to learn that you can set it from within the group anyway.
Inshallah
Yeah, I'm getting ESRCH which is poorly stringified in this instance as "No such process".
Schwern