In the man page for the system call write(2) -

ssize_t write(int fd, const void *buf, size_t count);

it says the following:

Return Value

On success, the number of bytes written are returned (zero indicates nothing was written). On error, -1 is returned, and errno is set appropriately. If count is zero and the file descriptor refers to a regular file, 0 may be returned, or an error could be detected. For a special file, the results are not portable.

I would interpret this to mean that returning 0 simply means that nothing was written, for whatever arbitrary reason.

However, Stevens in UNP treats a return value of 0 as a fatal error when dealing with a file descriptor that is a TCP socket (this is wrapped by another function, which calls exit(1) on a short count):

ssize_t /* Write "n" bytes to a descriptor. */
writen(int fd, const void *vptr, size_t n)
{
    size_t      nleft;
    ssize_t     nwritten;
    const char  *ptr;

    ptr = vptr;
    nleft = n;
    while (nleft > 0) {
        if ( (nwritten = write(fd, ptr, nleft)) <= 0) {
            if (nwritten < 0 && errno == EINTR)
                nwritten = 0;       /* and call write() again */
            else
                return(-1);         /* error */
        }

        nleft -= nwritten;
        ptr   += nwritten;
    }
    return(n);
}

He only retries when write returns -1 and errno indicates that the call was interrupted by the process receiving a signal. A return value of 0 is always treated as an error.

Why?

A: 

As your man page says, the return value of 0 is "not portable" for special files. Sockets are special files, so the result could mean something different for them.

Usually for sockets, a value of 0 bytes from read() or write() is an indication that the socket has closed, and after receiving 0, subsequent calls will return -1 with an error code.

SoapBox
Writing to a closed socket usually causes a `SIGPIPE`, IIRC. You're interpreting the man page to say that for special files you effectively don't know what a return value of 0 means, so always treat it as an error? To me, "special file" seems to refer only to the case where you pass a count of zero, although I could be mistaken. Looking at the man page for `send` also seems to indicate that 0 is a legit return value on a socket.
Robert S. Barnes
I think the man page says that passing a count of 0 to an fd belonging to a special file is non-portable, not that the return value 0 is somehow non-portable.
Per Ekman
+1  A: 

Also, and just to be somewhat pedantic here, if you are not writing to a socket, I would check to make sure that the buffer length ("count" in the first example) is actually being calculated correctly. In the Stevens example, you wouldn't even execute the write() call if the buffer length was 0.

bobp
As I mention in the question, I'm specifically talking about an fd which refers to a TCP socket.
Robert S. Barnes
+3  A: 

Stevens probably does this to catch old implementations of write() that behaved differently. For instance, the Single Unix Spec says (http://www.opengroup.org/onlinepubs/000095399/functions/write.html)

Where this volume of IEEE Std 1003.1-2001 requires -1 to be returned and errno set to [EAGAIN], most historical implementations return zero

Per Ekman
So that would seem to indicate a valid state which requires a retry similar to getting an `errno` of `EINTR`. So why would he treat it as an unrecoverable error?
Robert S. Barnes
Good question. In my copy of UNP (2nd edition) writen() checks for EINTR even if nwritten is 0 (page 78). Looks like a bug in edition 1?
Per Ekman
@Per Ekman: The code in my question is actually from the 3rd Edition... mark4o's answer sounds interesting...
Robert S. Barnes
@Per Ekman: I looked up the paragraph with that sentence and I think you're correct. I think the reason is that this particular function is meant to only be used with blocking sockets, which should never return a 0 which indicates the call would block. Also, writing 0 bytes to the file descriptor would return 0 but this is also undesirable with a socket since this would not cause an EOF to be written to the descriptor, as only a call to `shutdown` or `close` can do that on a socket. So in this context 0 always indicates unexpected / illegal behavior.
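The point about EOF can be checked with a small sketch. It uses an AF_UNIX `socketpair` as a stand-in for a TCP connection and the `MSG_DONTWAIT` flag to avoid blocking; both are assumptions of this example, not something from UNP, and the function name `eof_needs_shutdown` is made up for illustration. The return value of the zero-length write is deliberately ignored, since the man page calls it non-portable.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the peer sees EOF only after
 * shutdown(), 0 if it behaves otherwise, -1 if setup fails. */
int eof_needs_shutdown(void)
{
    int sv[2];
    char c;
    ssize_t zero_write, before, after;
    int before_errno;

    /* Assumption: a local stream socket pair stands in for TCP. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Zero-length write: its result is not portable, so it is not checked. */
    zero_write = write(sv[0], "", 0);
    (void)zero_write;

    /* The peer has no data and no EOF yet, so a non-blocking
     * receive should fail with EAGAIN/EWOULDBLOCK. */
    before = recv(sv[1], &c, 1, MSG_DONTWAIT);
    before_errno = errno;

    /* Closing the write direction is what delivers EOF to the peer. */
    shutdown(sv[0], SHUT_WR);
    after = recv(sv[1], &c, 1, MSG_DONTWAIT);

    close(sv[0]);
    close(sv[1]);

    return before == -1
        && (before_errno == EAGAIN || before_errno == EWOULDBLOCK)
        && after == 0;
}
```

If the function returns 1, the zero-length write did not signal end-of-file to the peer and only `shutdown` did, which matches the reasoning above.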
Robert S. Barnes
+1  A: 

This will ensure that the code does not spin indefinitely, even if the file descriptor is not a TCP socket or unexpected non-blocking flags are in effect. On some systems, certain legacy non-blocking modes (e.g. O_NDELAY) cause write() to return 0 (without setting errno) if no data can be written without blocking, at least for certain types of file descriptors. (The POSIX standard O_NONBLOCK uses an error return for this case.) And some of the non-blocking modes on some systems apply to the underlying object (e.g. socket, fifo) rather than the file descriptor, and so could even have been enabled by another process having an open file descriptor for the same object. The code protects itself from spinning in such a situation by simply treating it as an error, since it is not intended for use with non-blocking modes.
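As a rough illustration of the non-blocking case, the sketch below fills a pipe whose write end has `O_NONBLOCK` set and reports what `write()` does once no more data fits. A pipe is used instead of a socket only to keep the example short, and the function name `full_pipe_write_result` is made up for this example. On a system following POSIX `O_NONBLOCK` semantics the result should be the error `EAGAIN`; a return of 0 here would correspond to the legacy behavior described above.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if write() returned 0 on a full
 * non-blocking pipe (legacy style), the errno value if it returned -1
 * (POSIX style, normally EAGAIN), or -1 if setup failed. */
int full_pipe_write_result(void)
{
    int p[2];
    char buf[512];
    int flags, result;

    if (pipe(p) < 0)
        return -1;

    flags = fcntl(p[1], F_GETFL);
    if (flags < 0 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    memset(buf, 'x', sizeof buf);

    /* Nobody reads from p[0], so the pipe eventually fills up. */
    for (;;) {
        ssize_t n = write(p[1], buf, sizeof buf);
        if (n > 0)
            continue;           /* still room, keep filling */
        if (n == 0) {
            result = 0;         /* "would block" reported as 0 */
            break;
        }
        if (errno == EINTR)
            continue;           /* interrupted, try again */
        result = errno;         /* "would block" reported as an error */
        break;
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

A loop like writen() that simply retried on 0 would never terminate in the legacy case, because each retry would return 0 again until some reader drained the pipe.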

mark4o