views:

68

answers:

3

I was just going through the Networking Guide by Beej and am curious about this part of the code (specifically marked with "From here" and "To here"):

// main loop
    for(;;) {
        read_fds = master; // copy it
        if (select(fdmax+1, &read_fds, NULL, NULL, NULL) == -1) {
            perror("select");
            exit(4);
        }

        // run through the existing connections looking for data to read
        for(i = 0; i <= fdmax; i++) {
            if (FD_ISSET(i, &read_fds)) { // we got one!!
                if (i == listener) {
                    // handle new connections
                    addrlen = sizeof remoteaddr;
                    newfd = accept(listener,
                        (struct sockaddr *)&remoteaddr,
                        &addrlen);

                    if (newfd == -1) {
                        perror("accept");
                    } else {
                        FD_SET(newfd, &master); // add to master set
                        if (newfd > fdmax) {    // keep track of the max
                            fdmax = newfd;
                        }
                        printf("selectserver: new connection from %s on "
                            "socket %d\n",
                            inet_ntop(remoteaddr.ss_family,
                                get_in_addr((struct sockaddr*)&remoteaddr),
                                remoteIP, INET6_ADDRSTRLEN),
                            newfd);
                    }
                } else {
                    // handle data from a client
                    //----------------- FROM HERE --------------------------
                    if ((nbytes = recv(i, buf, sizeof buf, 0)) <= 0) {
                        // got error or connection closed by client
                        if (nbytes == 0) {
                            // connection closed
                            printf("selectserver: socket %d hung up\n", i);
                        } else {
                            perror("recv");
                        }
                        close(i); // bye!
                        FD_CLR(i, &master); // remove from master set
                    //----------------- TO HERE ----------------------------
                    } else {
                        // we got some data from a client
                        for(j = 0; j <= fdmax; j++) {
                            // send to everyone!
                            if (FD_ISSET(j, &master)) {
                                // except the listener and ourselves
                                if (j != listener && j != i) {
                                    if (send(j, buf, nbytes, 0) == -1) {
                                        perror("send");
                                    }
                                }
                            }
                        }
                    }
                } // END handle data from client
            } // END got new incoming connection
        } // END looping through file descriptors
    } // END for(;;)--and you thought it would never end!

    return 0;

Now I know that read doesn't always read "everything" that is to be read on a socket and that it sometimes can return only part of it. In that case, wouldn't this code be incorrect? I mean, after one read, the connection is being closed... Instead, aren't we supposed to have some other mechanism in place? If so, what is the right approach here?

+1  A: 

Yes you would keep reading until you got all the data you expected, obviosuly you need someway of knowing how much to expect - which is why http puts the document size first

Martin Beckett
That was a really smart decision indeed so that pretty much necessitates that we put such a line in any protocol we design is it?
Legend
Or you have fixed size messages, or some obvious end of message symbol - it depends on your application. For http/ftp you know the size of the file you are sending, for video conferencing a different approach is best
Martin Beckett
@Legend:no, not really. In a typical case, you read until you get a return <=0, indicating that you've reached the end of the stream or there has been some other kind of error.
Jerry Coffin
Not really. Think of for instance VoIP. It's impossible to know how long the conversation will continue. You just process incoming audio data as it comes off the socket, and close the connection somehow when done.
Matthew Flaschen
@Jerry: Hmm... So if I understand it correctly, then I need to do something like if((nbytes = read(childfd, ...)) < 0) close(childfd) and not do this in any other case. Is this right?
Legend
@Matthew... Would you mind throwing some pointers on "close the connection somehow"? Do you mean something like a timeout or a protocol specific control message?
Legend
@Legend - you don't always want to close the sock after each message. Http does this so that a web server doesn't need to keep a socket open for each of millions of browsers - but for a different point-point application it migth make sense to keep if open and flag the end of a message
Martin Beckett
That was deliberately vague, because this is a tricky issue. :) TCP/IP has a built-in way to close connections, but Wikipedia explains (http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_termination) why a protocol-specific close message is wise. H.245, for instance, has its own way to close channels (http://docwiki.cisco.com/wiki/Cisco_IOS_Voice_Troubleshooting_and_Monitoring_--_H.323-Related_Standards#H.245_Signaling)
Matthew Flaschen
@Legend: more often, you just do something like: `while ((nbytes=read(childfd, ...) > 0) process(bytes); close(fd);` I.e. read and process bytes in the loop, as long as you read successfully. When you exit the loop, you can check for errors or just close the socket.
Jerry Coffin
+4  A: 

The socket is only going to get closed there if there was an error from recv(), otherwise it'll deal with the data that was read even if it isnt all read. It will then read more out when it loops through again. Pretty sure this is what you're asking?

ThePosey
Oh I was asking about how did the code assume that all the data is read in one-go... What happens if it didn't read the whole thing in which there is a explicit close line to close the connection... Then the data is "lost"...
Legend
That close is only happening if theres no data left to read or an error from recv.
ThePosey
Ohh!! My bad... I did not really pay a lot of attention to the else condition there...
Legend
Yeah C socket code is pretty much all loops forever until something bad happens or someone closes the connection.
ThePosey
+1  A: 

Your only calling close when recv() has returned a negative value which means that recv had some sort of error. Notice that the block where you do the close has a comment stating // got error or connection closed by client).

When you actually get some data (the else branch starting with // we got some data from a client), the connection is not being closed.

You are right that you can't assume the data arrives all at one time. Your mistake is in following how the code is working.

R Samuel Klatchko
Yeap just realized I asked the question with a correct piece of code :)
Legend