Someone edited the Wikipedia "ptrace" article to claim that, on Linux, a ptraced process cannot itself ptrace another process. I'm trying to determine whether that's the case (and, if so, why). Below is a simple program I contrived to test this. My program fails (the sub-sub-process doesn't run properly), but I'm pretty convinced it's my error and not something fundamental.

In essence, the initial process A forks process B, which in turn forks C. A ptraces its child B, and B ptraces its child C. Once they're set up, all three processes just print A, B, or C to stdout once every second.

In practice what happens is that A and B work fine, but C prints only once and then gets stuck. Checking with ps -eo pid,cmd,wchan shows C stuck in kernel function ptrace_stop while the rest are in hrtimer_nanosleep where I'd expect all three to be.

Very occasionally all three do work (so the program prints Cs as well as As and Bs), which leads me to believe there's some race condition in the initial setup.

My guesses as to what might be wrong are:

  • A sees a SIGCHLD caused by B, which itself received a SIGCHLD caused by a signal to C, and wait(2) reports both as coming from B (but a hacky call of PTRACE_CONT to both pids doesn't fix things)?
  • C should be ptraced by B. Has C inherited the ptrace by A instead (with B's call to ptrace neither erroring nor overriding it)?
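The second guess can be checked directly: on Linux, /proc/<pid>/status has a "TracerPid:" field holding the PID of the tracing process (0 if there is none). The following is a minimal sketch, not part of the original program. The helper names parse_tracer_pid_line and tracer_pid_of are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Parse one line of /proc/<pid>/status. Returns 1 and stores the tracer's
   PID in *out if the line is the "TracerPid:" field; otherwise returns 0
   and leaves *out untouched. (Hypothetical helper name.) */
int parse_tracer_pid_line(const char *line, pid_t *out)
{
    static const char key[] = "TracerPid:";
    long value;

    assert(line != NULL && out != NULL);
    if (strncmp(line, key, sizeof key - 1) != 0)
        return 0;
    if (sscanf(line + sizeof key - 1, "%ld", &value) != 1)
        return 0;
    *out = (pid_t)value;
    return 1;
}

/* Return the PID of the process tracing `pid`, 0 if it is not traced,
   or -1 if the status file cannot be read or lacks the field.
   (Hypothetical helper name; assumes /proc is mounted.) */
pid_t tracer_pid_of(pid_t pid)
{
    char path[64], line[256];
    pid_t tracer = -1;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_tracer_pid_line(line, &tracer))
            break;
    }
    fclose(f);
    return tracer;
}
```

In the test program, B could print tracer_pid_of(mychild_pid) after its PTRACE_ATTACH call. If the result is B's PID rather than A's, C is being traced by B as intended.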

Can anyone figure out what I'm doing wrong? Thanks.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

static void a(){
  while(1){
    printf ("A\n");
    fflush(stdout);
    sleep(1);
  }
}

static void b(){
  while(1){
    printf ("B\n");
    fflush(stdout);
    sleep(1);
  }
}

static void c(){
  while(1){
    printf ("C\n");
    fflush(stdout);
    sleep(1);
  }
}

static void sigchld_handler(int sig){
  int result;
  pid_t child_pid = wait(NULL); // find who sent us this SIGCHLD

  printf("SIGCHLD on %d\n", child_pid);
  result=ptrace(PTRACE_CONT, child_pid, sig, NULL);
  if(result) {
    perror("continuing after SIGCHLD");
  }
}

int main(int  argc,
         char **argv){

  pid_t mychild_pid;
  int   result;

  printf("pidA = %d\n", getpid());

  signal(SIGCHLD, sigchld_handler);

  mychild_pid = fork();

  if (mychild_pid) {
    printf("pidB = %d\n", mychild_pid);
    result = ptrace(PTRACE_ATTACH, mychild_pid, NULL, NULL);
    if(result==-1){
      perror("outer ptrace");
    }
    a();
  }
  else {
    mychild_pid = fork();

    if (mychild_pid) {
      printf("pidC = %d\n", mychild_pid);

      result = ptrace(PTRACE_ATTACH, mychild_pid, NULL, NULL);
      if(result==-1){
        perror("inner ptrace");
      }
      b();
    }
    else {
      c();
    }
  }

  return 0;
}
A:

You are indeed seeing a race condition. You can cause it to happen repeatably by putting sleep(1); immediately before the second fork() call.

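The race arises because process A is not correctly passing signals on to process B. Your handler calls ptrace(PTRACE_CONT, child_pid, sig, NULL): the signal number lands in the addr argument, which PTRACE_CONT ignores, while the data argument, which selects the signal to deliver, is NULL. B is therefore resumed with the signal that stopped it discarded. That means that if process B starts tracing process C after process A has started tracing process B, process B never gets the SIGCHLD signal indicating that process C has stopped, so it can never continue it.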

To fix the problem, you just need to fix your SIGCHLD handler:

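static void sigchld_handler(int sig){
    int result, status;
    pid_t child_pid = wait(&status); // find who sent us this SIGCHLD

    (void)sig; // always SIGCHLD here; the child's stop signal comes from status
    printf("%d received SIGCHLD on %d\n", getpid(), child_pid);
    if (WIFSTOPPED(status))
    {
        // PTRACE_CONT ignores addr; data is the signal to deliver to the tracee
        result=ptrace(PTRACE_CONT, child_pid, NULL, (void *)(long)WSTOPSIG(status));
        if(result) {
            perror("continuing after SIGCHLD");
        }
    }
}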
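
The key point in the handler above is which ptrace argument carries the signal. As a rough sketch (the helper names signal_to_forward and resume_tracee are made up for illustration), the decision and the call can be separated like this. Like the handler, it forwards every stop signal, including the SIGSTOP generated by PTRACE_ATTACH.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Given a status filled in by wait(), return the signal that should be
   handed back to the tracee: the signal that stopped it, or 0 (deliver
   nothing) if the child is not in a stopped state.
   (Hypothetical helper name.) */
int signal_to_forward(int status)
{
    if (!WIFSTOPPED(status))
        return 0;
    return WSTOPSIG(status);
}

/* Resume a stopped tracee, passing on the signal that stopped it.
   For PTRACE_CONT the third argument (addr) is ignored and the fourth
   (data) is the signal number to deliver, so the signal goes last.
   (Hypothetical helper name.) */
long resume_tracee(pid_t pid, int status)
{
    int sig = signal_to_forward(status);

    assert(pid > 0);
    return ptrace(PTRACE_CONT, pid, NULL, (void *)(long)sig);
}
```

With this split, a handler would call wait(&status) and then resume_tracee(child_pid, status) when WIFSTOPPED(status) is true. A tracer that does not want to re-inject the attach-time SIGSTOP could return 0 for that case in signal_to_forward, at the cost of also swallowing a SIGSTOP sent by another process.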
caf
Yes, that's it exactly. Thanks very much for your lovely clear explanation.
Finlay McWalter