I'm playing with the socket layer in my cross-platform framework and I'm trying to get connects to work in a non-blocking fashion. However, after poring over the docs, they just don't seem to behave correctly at all. The heart of the problem is that after initiating a non-blocking connect, the following select fails to notice that the connect has succeeded and continues to time out over and over again.

SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
hostent *h = gethostbyname("www.memecode.com");
if (h)
{
 sockaddr_in addr;
 memset(&addr, 0, sizeof(addr));
 addr.sin_family = AF_INET;
 addr.sin_port = htons(80);

 if (h->h_addr_list && h->h_addr_list[0])
 {
  memcpy(&addr.sin_addr, h->h_addr_list[0], sizeof(in_addr));

  // Set non blocking
  fcntl(s, F_SETFL, O_NONBLOCK);

  int64 start = LgiCurrentTime();
  int status = connect(s, (sockaddr*) &addr, sizeof(sockaddr_in));
  printf("Initial connect = %i\n", status);
  while (status && (LgiCurrentTime()-start) < 15000)
  {
   //  Do select to wait for connect to finish
   fd_set wr;
   FD_ZERO(&wr);
   FD_SET(s, &wr);
   int TimeoutMs = 1000;
   struct timeval t = {TimeoutMs / 1000, (TimeoutMs % 1000) * 1000};
   errno = 0;
   int64 sel_start = LgiCurrentTime();
   int ret = select(0, 0, &wr, 0, &t);
   int64 sel_end = LgiCurrentTime();
   printf("%i = select(0,%i,0) errno=%i time=%i\n",
    ret,
    FD_ISSET(s, &wr)!=0,
    errno,
    (int)(sel_end-sel_start));

   if (ret > 0 && FD_ISSET(s, &wr))
   {
    // ready for connect to finish...
    status = connect(s, (sockaddr*) &addr, sizeof(sockaddr_in));
    printf("2nd connect = %i\n", status);
    if (status)
    {
     if (errno == EISCONN)
     {
      status = 0;
      printf("error = EISCONN so we're good.\n");
     }
    }
   }
   // else still waiting...
  }
 }
 else printf("host addr error.\n");
}
else printf("gethostbyname failed.\n");

When I run this code on the latest Leopard build of Mac OS X, I get this output:

Initial connect = -1
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1001
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000
0 = select(0,1,0) errno=0 time=1000

When I remove the "ret > 0" condition after the select, the second connect returns -1 with errno set to EISCONN, i.e. the socket is already connected. So the connect is working behind the scenes, but the select never picks up on it. As I understand it, select returns the number of sockets that are ready across the fd_set structures, or 0 if it timed out with none of them ready.

What is wrong with my code?

PS: LgiCurrentTime() returns milliseconds since some point, e.g. GetTickCount on Windows. I forget the exact implementation on Mac, but it's not important; it's just timing info.

+1  A: 

I believe the first parameter to select(..) should be the highest file descriptor number plus 1, not 0, making it:

int ret = select(s+1, 0, &wr, 0, &t);
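For what it's worth, here is a minimal sketch of the whole pattern with that fix applied, assuming a POSIX system. The helper names connect_with_timeout and loopback_demo are made up for illustration and are not from the question. It also checks the result with getsockopt(SO_ERROR) rather than calling connect a second time, which is a different technique from the one in the question.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 once connected, -1 on error or timeout. */
static int connect_with_timeout(int fd, const struct sockaddr *addr,
                                socklen_t len, int timeout_ms)
{
    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Add O_NONBLOCK without clobbering the flags already set. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    if (connect(fd, addr, len) == 0)
        return 0;                 /* connected immediately */
    if (errno != EINPROGRESS)
        return -1;                /* real failure */

    fd_set wr;
    FD_ZERO(&wr);
    FD_SET(fd, &wr);
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    /* First argument is the highest descriptor plus one, not 0. */
    int ret = select(fd + 1, NULL, &wr, NULL, &tv);
    if (ret <= 0)
        return -1;                /* 0 means timeout, -1 means error */

    /* Writable only means the attempt finished; SO_ERROR says how. */
    int err = 0;
    socklen_t errlen = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
        return -1;
    if (err != 0) {
        errno = err;
        return -1;
    }
    return 0;
}

/* Hypothetical self-check: connect to a throwaway listener on 127.0.0.1. */
static int loopback_demo(void)
{
    int rc = -1;
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    if (srv < 0)
        return -1;

    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = 0;               /* let the kernel pick a free port */
    socklen_t alen = sizeof(a);

    if (bind(srv, (struct sockaddr *)&a, sizeof(a)) == 0 &&
        listen(srv, 1) == 0 &&
        getsockname(srv, (struct sockaddr *)&a, &alen) == 0) {
        int cli = socket(AF_INET, SOCK_STREAM, 0);
        if (cli >= 0) {
            rc = connect_with_timeout(cli, (struct sockaddr *)&a,
                                      sizeof(a), 1000);
            close(cli);
        }
    }
    close(srv);
    return rc;
}
```

Calling loopback_demo() from main should return 0 on a machine with a working loopback interface. In the question's code the same idea would be connect_with_timeout(s, (sockaddr*) &addr, sizeof(addr), 15000).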
epatel
Huh... ok then... quickest answer ever! I'm a refugee from Windows, and on that platform the first parameter is ignored completely. Thx for the quick reply.
fret
Ha ;) it was fun to see "meme" in the question too. My company is named Memention - http://memention.com
epatel