I'm seeing weird behaviour in a multithreaded server written in C under GNU/Linux. While it is sending data, it is eventually interrupted by SIGPIPE. I worked around that by suppressing the signal in send() (MSG_NOSIGNAL) and checking errno after each call instead.

So, it has two separate sending methods: one that sends a large amount of data at once (or at least tries to), and another that sends a similar amount sliced into small chunks. Finally, I tried the following to keep it sending data.

do
{
    total_bytes_sent += send(client_sd, output_buf + total_bytes_sent,
                             output_buf_len - total_bytes_sent, MSG_NOSIGNAL);
}
while ((total_bytes_sent < output_buf_len) && (errno != EPIPE));

This ugly piece of code does its work in certain situations, but not always.

I'm pretty sure it's not a hardware or ISP problem, as this server is running on six European machines, four in Germany and two in France.

Any ideas?

Thanks in advance.

EDIT 1: Yep, I noticed that this piece of code is crappy (thanks, Jay). What I meant initially is that this code gives me an EPIPE whether or not the client cuts off communication.

EDIT 2: I tried a single send() and it randomly gives me the same error. It's weird, because I can't send a large chunk of data. I tried enlarging the send buffer, but that didn't work.

EDIT 3: As requested, this is a larger code piece.

data_buf_len = cur_stream->iframe_offset[cur_stream->iframe_num - 1] - first_offset;
data_buf = cur_stream->data;
output_buf = compose_reply(send_params, data_buf, data_buf_len, &output_buf_len);

/* Obviously, time measuring is *highly* inaccurate, only for
 * design consistency purposes (it should return something).
 * */
clock_gettime(CLOCK_REALTIME, &start_time);
total_bytes_sent = send(client_sd, output_buf, output_buf_len, MSG_NOSIGNAL);
clock_gettime(CLOCK_REALTIME, &stop_time);
spent_time = (((int64_t)stop_time.tv_sec * NANOSEC_IN_SEC) +
    (int64_t)stop_time.tv_nsec) - (((int64_t)start_time.tv_sec * NANOSEC_IN_SEC) +
    (int64_t)start_time.tv_nsec);

free(output_buf);
unload_video(cur_video);

if (total_bytes_sent < 0)
{
    log_message(MESSAGE, __func__, IMSG_VIDEOSTOP, cur_video->path);
    log_message(MESSAGE, __func__, IMSG_VIDEOSTOP, NULL);   
}

/* Hope it will not serve >2147483647 seconds (~68 years) of video... */
return ((int)spent_time);

Only a single send() call with a large buffer. There's another example, too large to put here, that divides each buffer into smaller chunks and calls send() for each one.

+1  A: 

That means you are writing to a socket or pipe which the other end has already closed. It's an application protocol error.
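
To see this in isolation, here is a minimal sketch that reproduces the error on Linux. It assumes a local AF_UNIX socket pair standing in for the real client connection, and the helper name errno_after_peer_close is made up for illustration. One end is closed, then the other end sends with MSG_NOSIGNAL, so the failure is reported through errno instead of a SIGPIPE.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends one byte to a peer that has already closed
 * its end. Returns the errno set by send(), 0 if send() unexpectedly
 * succeeded, or -1 if the socket pair could not be created. */
int errno_after_peer_close(void)
{
    int sv[2];
    char byte = 'x';
    int saved = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    close(sv[1]);                       /* the "client" goes away */

    /* MSG_NOSIGNAL suppresses SIGPIPE; the error shows up in errno. */
    if (send(sv[0], &byte, 1, MSG_NOSIGNAL) < 0)
        saved = errno;

    close(sv[0]);
    return saved;
}
```

On a TCP connection the first send() after the peer closes may still succeed, and the error may only show up on a later call, so the timing can look random from the server's side.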

EJP
+4  A: 

As EJP already suggested, EPIPE occurs when the other side has closed the socket. Also, I don't think your logic of adding whatever send() returns to "total_bytes_sent" is correct, because send() can return -1 in cases where you can still continue (for example, on a non-blocking socket you might get errno EAGAIN, which means you need to try again). Adding -1 to the counter also moves it backwards.

Also, if send() returns 0 and errno is not EPIPE, then I guess you will loop forever.

EDIT: You can also check whether shutdown() is being called on the socket. That can cause this behaviour too.
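
A minimal sketch of a loop that handles these cases, assuming a blocking socket. The name send_all and its out-parameter are my own, not something from your code. It only adds to the counter when send() succeeds, retries on EINTR, and stops on a 0 return instead of spinning.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: tries to send all len bytes of buf on sd.
 * Returns 0 if everything was sent, -1 otherwise (errno is left as set
 * by send()). *sent_out always holds the number of bytes accepted. */
int send_all(int sd, const char *buf, size_t len, size_t *sent_out)
{
    size_t sent = 0;
    int rc = 0;

    while (sent < len) {
        ssize_t n = send(sd, buf + sent, len - sent, MSG_NOSIGNAL);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            rc = -1;                /* EPIPE, ECONNRESET, EAGAIN, ... */
            break;
        }
        if (n == 0) {               /* no progress: do not loop forever */
            rc = -1;
            break;
        }
        sent += (size_t)n;          /* only count successful sends */
    }

    *sent_out = sent;
    return rc;
}
```

The caller can then check errno once after a -1 return and treat EPIPE or ECONNRESET as "client went away" rather than testing errno in the loop condition.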

Jay
I see. Yes, this is not a good idea, I guess... But can EPIPE be raised because of my server implementation, or is it the client's responsibility only?
Manuel Abeledo
AFAIK, it will only be because the client closed the socket. But, again, you should understand that the mistake in your logic is that it doesn't handle the peer closing the connection, and that is something the server has to handle correctly.
Jay
A: 

If you are using a stream-oriented socket, e.g. one created with SOCK_STREAM, you should not have to send your data in chunks.

If you have all data available in output_buf, you should only need to write once on a blocking socket.

send(client_sd, output_buf, output_buf_len, MSG_NOSIGNAL);

If you have created your socket in non-blocking mode, then you must use select, and your loop above is wrong, apart from the fact that it does not handle the return value -1, as pointed out by Jay.

Regarding nos's comment:

From the POSIX standard:

send - send a message on a socket

ssize_t send(int socket, const void *buffer, size_t length, int flags);

...

If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does not have O_NONBLOCK set, send() shall block until space is available. If space is not available at the sending socket to hold the message to be transmitted, and the socket file descriptor does have O_NONBLOCK set, send() shall fail. The select() and poll() functions can be used to determine when it is possible to send more data.

...

So, according to the standard, it is only when an error occurs that send() would not accept the entire message on a blocking socket.

Unfortunately, it seems that some operating systems do return fewer than length bytes from send() even when no error occurs. That is the reason W. Richard Stevens's libunp uses its own writen function.

Ernelli
Btw, is it possible to know how much free space a sending buffer has? Is there a kernel API or something?
Manuel Abeledo
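
One possible approach on Linux, as a sketch only: SO_SNDBUF reports the size of the send buffer, and the Linux-specific SIOCOUTQ ioctl reports how much data is still sitting in the send queue. The helper name send_buffer_state is made up for illustration. The difference between the two numbers is just a rough estimate of free space, since the kernel's accounting includes bookkeeping overhead.

```c
#include <linux/sockios.h>   /* SIOCOUTQ (Linux-specific) */
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Hypothetical helper: reads the send buffer size (SO_SNDBUF) and the
 * amount of data still queued for sending (SIOCOUTQ) for socket sd.
 * Returns 0 on success, -1 on failure with errno set. */
int send_buffer_state(int sd, int *capacity, int *queued)
{
    socklen_t optlen = sizeof *capacity;

    if (getsockopt(sd, SOL_SOCKET, SO_SNDBUF, capacity, &optlen) < 0)
        return -1;
    if (ioctl(sd, SIOCOUTQ, queued) < 0)
        return -1;
    return 0;
}
```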
I am actually about to start a separate SO thread regarding the behaviour of send on blocking sockets, since I find it quite strange that it may return without sending all data.
Ernelli