Hi, I have a function just like this:

static int
rcv_kern(int sock, void *buf, int len, struct sockaddr *addr,
     socklen_t *addrlen)
{
    struct timeval timeout = {1, 0};
    fd_set set;
    int status;

    FD_SET(sock, &set);
    if ((status = select(sock + 1, &set, NULL, NULL, &timeout)) == 0) {
     FD_ZERO(&set);
     fprintf(stderr, 
      "timeout while receiving answer from kernel\n");
     exit(1);
    } else if (status == -1) {
     FD_ZERO(&set);
     perror("recvfrom failed");
     exit(1);
    }
    FD_ZERO(&set);
    return recvfrom(sock, buf, len, 0, addr, addrlen);
}

It is used to receive messages from kernel space over netlink. But when I run it, it always prints "timeout while receiving answer from kernel". Judging from the source code, this happens because select() always returns 0. I don't know why. Can anyone give me some suggestions? Thanks.

A: 

Not related to the timeout, but you need to FD_ZERO(&set) before FD_SET(sock, &set), otherwise the fd_set will be uninitialized and likely contain many set bits. Also, FD_ZERO() before exiting is fairly pointless.

Lance Richardson
What you describe is not the cause. I changed my code according to your instructions, but it still does not work. Thank you all the same.
Charlie Epps
A: 

I have investigated my code in kernel space, and I found that the kernel can't receive the message from the client using skb_dequeue(&sk->sk_receive_queue). I don't know how this happens.

Charlie Epps
A: 

For starters, you can find out what the actual error was by printing out strerror(errno) (printing errno is also wise) when the timeout occurs.
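As a rough sketch of that kind of diagnostic (the helper name describe_select is made up for illustration and is not part of the original code), the return value of select(2) can be turned into a message. Note that only a return of -1 sets errno; a return of 0 is a plain timeout, so strerror(errno) is only meaningful in the failure branch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original post): describe the outcome
 * of a select() call. `status` is the value select() returned and `err`
 * is the errno value saved immediately after the call.
 */
static void
describe_select(int status, int err, char *msg, size_t msglen)
{
    assert(msg != NULL && msglen > 0);

    if (status < 0)
        /* Only this branch has a meaningful errno. */
        snprintf(msg, msglen, "select failed: %s (errno %d)",
            strerror(err), err);
    else if (status == 0)
        /* Timeout: nothing became readable, errno is not set. */
        snprintf(msg, msglen, "select timed out (errno is not set)");
    else
        snprintf(msg, msglen, "%d descriptor(s) ready", status);
}
```

Calling it as describe_select(status, errno, msg, sizeof msg) right after select() and printing msg to stderr shows which of the three cases occurred.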

As for guessing what the problem might be in the absence of errno, note that there's no guarantee that there's anything to read; even if you got the socket through accept(2), it might just be a connection that was set up, but that the client didn't get around to writing to. Typically you don't do just one select(2); you want to have a single main loop that keeps calling select(2) until the program wants to quit, as timeouts may happen for just about any reason at just about any time.

Other possible problems:

  • The client can't connect.
  • You're failing to bind the socket properly.
  • You're forgetting to call listen(2) on the server's socket after calling bind(2).

If you're using IP sockets, you can look at your network traffic using Wireshark to see if the client is doing what you expect.

C Pirate
It is not an error, it is a timeout; I don't think errno will be set.
shodanex
+2  A: 

Charlie,
A couple of things:

1) You should probably loop around your select() call and ONLY call recvfrom if FD_ISSET() returns true on your file descriptor.
2) Make sure your actual driver or kernel code that is sending on the netlink socket is actually writing/sending data to it. If not, then your function will time out if it doesn't receive data in 1 second. (that's what you set timeout to).

A couple of general comments: on Linux, select() modifies the timeout structure to reflect the time that was not slept, so if you change your code to loop around select(), which you probably should, you'll have to reset the timeout value on every iteration of the loop.

Also, if select() times out, that doesn't necessarily mean it's an error. Remember, select() only waits on the socket for up to the given timeout period and then returns. If you want to read from the file descriptor no matter what, meaning you want your rcv_kern() function to block until there is data to return, then don't bother using select(). Just call recvfrom() directly on the file descriptor. This way your rcv_kern() function will block and only return after reading data that the kernel sent.


It's kind of hard to give more specific help here without knowing more about the context in which this code is being used. I'm assuming this is a custom kernel module you've written that is sending data up to userspace, correct?
Try changing your rcv_kern() function to block (take the select() code out and just call recvfrom()). This way you should be able to tell whether your kernel driver is actually sending data up to userspace properly. If you're blocking on recvfrom() and nothing ever comes back, then you may also have a problem in your kernel driver.

Hope that helps.

Steve Lazaridis
Your answer helped me with my socket problem, unrelated to the question above.
Alan H
A: 

You should rewrite the function like this:

static int
rcv_kern(int sock, void *buf, int len, struct sockaddr *addr,
     socklen_t *addrlen)
{
    struct timeval timeout = {1, 0};
    fd_set set;
    int status;

    FD_ZERO(&set);
    FD_SET(sock, &set);
    if ((status = select(sock + 1, &set, NULL, NULL, &timeout)) == 0) {
        fprintf(stderr, 
                "timeout while receiving answer from kernel\n");
        exit(1);
    } else if (status < 0) {
        perror("select failed");
        exit(1);
    }
    if ((status = recvfrom(sock, buf, len, 0, addr, addrlen)) < 0) {
        perror("recvfrom error");
        exit(1);
    }
    if (status == 0) {
        fprintf(stderr, "kernel closed socket\n");
        exit(1);
    }
    return status;
}

Like someone else said, you need to call FD_ZERO before FD_SET, and therefore before select(). The other calls to FD_ZERO are superfluous. Also, you need to do full error checking, including on the return value of recvfrom().

Robert S. Barnes