I've been trying to poll a set of named pipes for a little while now, and I keep getting an immediate POLLNVAL response on every named-pipe file descriptor. After finding this blog post about broken polling in OS X, I'm pretty certain that this is a b-u-g bug in OS X.

I'm already planning on switching my code to UDP sockets, but I wanted to ask SO to verify this, a) so that I'm sure it's really broken, and b) for documentation purposes.

Here is a stripped-down version of the code I wrote (although the code in the link above, which I tested, spells it out pretty well):

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static const char* first_fifo_path = "/tmp/fifo1";
static const char* second_fifo_path = "/tmp/fifo2";

int setup_read_fifo(const char* path){
  int fifo_fd = -1;

  // perror() appends ": <reason>" and a newline itself, so the message has no "\n".
  if( mkfifo(path, S_IRWXU | S_IRWXG | S_IRWXO) )
    perror("error calling mkfifo() (already exists?)");

  if((fifo_fd = open(path, O_RDONLY | O_NDELAY)) < 0)
    perror("error calling open()");

  return fifo_fd;
}

void do_poll(int fd1, int fd2){
  char inbuf[1024];
  int num_fds = 2;
  struct pollfd fds[num_fds];
  int timeout_msecs = 500;

  fds[0].fd = fd1;
  fds[1].fd = fd2;
  fds[0].events = POLLIN;
  fds[1].events = POLLIN;

  int ret;
  while(1){
    ret = poll(fds, num_fds, timeout_msecs);
    if(ret < 0){
      // poll() itself failed: report it and leave the loop.
      printf("Error occurred when polling\n");
      printf("ret %d, errno %d\n", ret, errno);
      printf("revents =  %xh : %xh \n\n", fds[0].revents, fds[1].revents);
      break;
    }

    if(ret == 0){
      printf("Timeout Occurred\n");
      continue;
    }

    for(int i = 0; i< num_fds; i++){
      if(int event = fds[i].revents){

        if(event & POLLHUP)
          printf("Pollhup\n");
        if(event & POLLERR)
          printf("POLLERR\n");
        if(event & POLLNVAL)
          printf("POLLNVAL\n");

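        if(event & POLLIN){
          // read() does not NUL-terminate, so leave room and terminate by hand.
          ssize_t n = read(fds[i].fd, inbuf, sizeof(inbuf) - 1);
          if(n > 0){
            inbuf[n] = '\0';
            printf("Received: %s", inbuf);
          }
        }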
      }
    }
  }
}

int main (int argc, char * const argv[]) {
  do_poll(setup_read_fifo(first_fifo_path), setup_read_fifo(second_fifo_path));
  return 0;
}

This outputs:

$ ./executive 
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
POLLNVAL
...

ad nauseam.

Has anybody else run into this? This is a real bug, right?

Thanks.

+3  A: 

This seems to be a genuine bug. It works as expected on Linux and OpenBSD and fails as you describe on OS X.

dwc
+3  A: 

On OS X 10.4.1, I can confirm the behaviour. The same code works fine on Linux (as long as the timeout messages are acceptable). All the evidence, including this changeset (http://www.virtualbox.de/changeset/12347), suggests that there is a real problem.

Mart Oruaas
Timeouts are correct behavior with no writer. I wrote a simple writer program for further testing, and the program in the question reads the FIFO fine on Linux/OpenBSD and fails miserably on OS X (same POLLNVAL problem).
dwc
+1  A: 

Yup, known bug. I think poll has only been broken since 10.4; we had to deal with it in Fink. GLib's configure.in has a test for this, so you can be sure that you're not imagining it. (Well, not precisely this: GLib tests for poll on devices, not FIFOs.)

vasi