I'm working on an application that contains several server sockets that each run in a unique thread.
An external utility (script) is called by one of the threads. This script calls a utility (client) that sends a message to one of the server sockets.

Initially, I was using system() to execute this external script, but we couldn't use that because we had to make sure the server sockets were closed in the child that was forked to execute the external script.
I now call fork() and execvp() myself. I fork() and then in the child I close all the server sockets and then call execvp() to execute the script.

Now, all of that works fine. The problem is that at times the script reports errors to the server app. The script sends these errors by calling another application (client) which opens a TCP socket and sends the appropriate data. My issue is that the client app gets a value of 0 returned by the socket() system call.

NOTE: This ONLY occurs when the script/client app is called using my forkExec() function. If the script/client app is called manually the socket() call performs appropriately and things work fine.

Based on that information I suspect it's something in my fork() execvp() code below... Any ideas?

void forkExec()
{    
    int stat;

    stat = fork();
    if (stat < 0)
    {
        printf("Error forking child: %s", strerror(errno));
    }
    else if (stat == 0)
    {
        char *progArgs[3];

        /*
         * First, close the file descriptors that the child 
         * shouldn't keep open
         */
        close(ServerFd);
        close(XMLSocket);
        close(ClientFd);
        close(EventSocket);
        close(monitorSocket);

        /* build the arguments for script */
        progArgs[0] = calloc(1, strlen("/path_to_script")+1);
        strcpy(progArgs[0], "/path_to_script");
        progArgs[1] = calloc(1, strlen(arg)+1);
        strcpy(progArgs[1], arg);
        progArgs[2] = NULL; /* Array of args must be NULL terminated for execvp() */

        /* launch the script; execvp() only returns if it failed */
        execvp(progArgs[0], progArgs);
        fprintf(stderr, "Error executing script: '%s' '%s' : %s\n", progArgs[0], progArgs[1], strerror(errno));
        free(progArgs[0]);
        free(progArgs[1]);
        _exit(127); /* report failure without running the parent's atexit handlers */
    }

    return;
}

Client app code:

static int connectToServer(void)
{
int socketFD = 0;
int status;
struct sockaddr_in address;
struct hostent* hostAddr = gethostbyname("localhost");

socketFD = socket(PF_INET, SOCK_STREAM, 0);

/* The above call returns 0. */

if (socketFD < 0)
{
    fprintf(stderr, "%s-%d: Failed to create socket: %s", 
                                __func__, __LINE__, strerror(errno));
    return (-1);
}

memset(&address, 0, sizeof(address));
address.sin_family = AF_INET;
memcpy(&(address.sin_addr.s_addr), hostAddr->h_addr, hostAddr->h_length);
address.sin_port = htons(POLLING_SERVER_PORT);

status = connect(socketFD, (struct sockaddr *)&address, sizeof(address));
if (status < 0)
{
    if (errno != ECONNREFUSED)
    {
        fprintf(stderr, "%s-%d: Failed to connect to server socket: %s",
                   __func__, __LINE__, strerror(errno));
    }
    else
    {
        fprintf(stderr, "%s-%d: Server not yet available...%s",
                   __func__, __LINE__, strerror(errno));
        close(socketFD);
        socketFD = 0;
    }
}

return socketFD;
}

FYI
OS: Linux
Arch: ARM32
Kernel: 2.6.26

+1  A: 

Don't forget a call to

waitpid()

End of "obvious question mode". I'm assuming a bit here but you're not doing anything with the pid returned by the fork() call. (-:

Rob Wells
Yeah, I already have a SIGCHLD handler registered that calls waitpid() to clean up zombies. Thanks anyway :)
Steve Lazaridis
+4  A: 

socket() returns -1 on error.

A return of 0 means socket() succeeded and gave you file descriptor 0. I suspect that one of the descriptor variables you close holds the value 0. Once fd 0 is closed, the next call that allocates a file descriptor returns 0, because it is the lowest one available.
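
To see this behaviour in isolation, here is a minimal sketch. It assumes Linux, where the kernel hands out the lowest-numbered free descriptor; socket_after_closing_stdin() is a hypothetical name used only for illustration and is not part of the question's code.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: closes stdin (fd 0) and
 * then creates a TCP socket. It returns whatever socket() returned.
 * Because fd 0 has just been freed, a successful call yields 0.
 * Note that this deliberately sacrifices the caller's stdin.
 */
int socket_after_closing_stdin(void)
{
    close(0);   /* fd 0 is now free; an error here just means it was already closed */
    return socket(PF_INET, SOCK_STREAM, 0);
}
```

Calling this once in a fresh process should give back 0 rather than 3, which matches what the client sees when it is started from a child whose fd 0 was closed.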

R Samuel Klatchko
Yeah, that's what I suspected. But no, I haven't done anything like that.
Steve Lazaridis
Is it returning 0 (as your question states), or is it returning a negative value (as your code sample checks for)? To reiterate, a return of 0 is completely legal if fd 0 is available.
R Samuel Klatchko
It's returning zero. I understand that 0 is a valid fd. The issue here must have something to do with my forkExec() function, because if I switch to system(), socket() returns a value > 0. Unfortunately, I cannot use system().
Steve Lazaridis
system uses the shell to run the new program. I'm guessing the shell is opening fd 0 (which is then inherited by your sub-program) so it's no longer available when socket() is called.
R Samuel Klatchko
The system() call runs /bin/sh -c, but neither the system() call nor /bin/sh will close the file descriptors the way you do in forkExec().
nos
Correct! One of the fds that I was closing was set to zero because it hadn't been initialized yet. Thanks!
Steve Lazaridis
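
The cause confirmed above (a descriptor variable that was still 0 when it was passed to close()) can be guarded against by initialising every descriptor variable to -1 and closing through a small helper. A minimal sketch under that assumption; close_if_open() is a hypothetical name, not something from the original application.

```c
#include <unistd.h>

/*
 * Hypothetical helper: closes *fd only if it holds a real descriptor,
 * then resets it to -1 so that a second call is a no-op.
 * Assumes every socket variable is initialised to -1, never 0.
 * Returns 1 if a descriptor was closed, 0 if there was nothing to close.
 */
int close_if_open(int *fd)
{
    if (*fd < 0)
        return 0;   /* never opened: leave stdin (fd 0) alone */
    close(*fd);
    *fd = -1;
    return 1;
}
```

In the forkExec() child this would replace each bare close(ServerFd) with close_if_open(&ServerFd), and likewise for the other sockets, so a socket that has not been created yet can no longer take fd 0 down with it.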
+1  A: 

A socket with value 0 is fine. It means stdin was closed, which makes fd 0 available for reuse, for example by a socket.

Chances are that one of the file descriptors you close in the forkExec() child path (XMLSocket, ServerFd, etc.) was fd 0. That starts the child with fd 0 closed, which won't happen when you run the app from a command line, because there fd 0 is already open, inherited from the shell as stdin.

If you want your socket not to be 0, 1 or 2 (stdin/stdout/stderr), call the following in your forkExec() function after all the close() calls:

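#include <fcntl.h>
#include <unistd.h>

/* Point fds 0, 1 and 2 at /dev/null so later open()/socket() calls get fd >= 3. */
void reserve_tty(void)
{
  int fd;

  for (fd = 0; fd < 3; fd++) {
    int nfd;

    nfd = open("/dev/null", O_RDWR);
    if (nfd < 0) /* We're screwed. */
      continue;

    if (nfd == fd) /* fd was closed; open() just filled it */
      continue;

    dup2(nfd, fd);
    if (nfd > 2)
      close(nfd);
  }
}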

Check for socket() returning -1, which means an error occurred.
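
Note that reserve_tty() above redirects all three descriptors to /dev/null even when they are already open. If the script still needs its inherited stdout/stderr, a variant that only fills the gaps may be preferable. The following is a sketch, not a drop-in replacement: plug_closed_std_fds() is a hypothetical name, and it assumes it runs in the single-threaded child after fork().

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical variant of reserve_tty(): points fds 0-2 at /dev/null
 * only when they are currently closed; open descriptors are untouched.
 * Returns the number of descriptors it had to fill, or -1 on failure.
 */
int plug_closed_std_fds(void)
{
    int filled = 0;
    int fd;

    for (fd = 0; fd < 3; fd++) {
        int nfd;

        /* F_GETFD fails with EBADF only when fd is not open */
        if (fcntl(fd, F_GETFD) != -1 || errno != EBADF)
            continue;

        nfd = open("/dev/null", O_RDWR);
        if (nfd < 0)
            return -1;

        if (nfd != fd) {
            /* not expected given the lowest-free-fd rule, but handle it */
            if (dup2(nfd, fd) < 0) {
                close(nfd);
                return -1;
            }
            close(nfd);
        }
        filled++;
    }
    return filled;
}
```

Called right after the close() calls in the child, this leaves a healthy stdin/stdout/stderr alone and still guarantees that the client's socket() cannot land on 0, 1 or 2.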

nos
But WHY is stdin being closed? Why would one want to do this intentionally?
AJ
It's common (though often they're redirected to /dev/null, as the code above now does) and good practice to close all fds that aren't needed in a background/daemon process, which I'd guess the parent process is.
nos
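
As an alternative to closing each unneeded descriptor by hand in the child, the parent can mark its server sockets close-on-exec, so the kernel closes them when the script is exec'd. A minimal sketch using the standard fcntl() FD_CLOEXEC flag; set_cloexec() is a hypothetical helper name, and whether this fits the original application is an assumption.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: sets FD_CLOEXEC on fd so the kernel closes it
 * automatically in any program started with one of the exec*() calls.
 * Returns 0 on success, -1 on error (errno is set by fcntl()).
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```

The flag is per descriptor, so it would have to be set on each server socket after it is created, and on each socket returned by accept() as well. Descriptors marked this way are also closed across the exec that system() performs.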