Hi,

I am getting the error "Too many open files" after the call to socket() in the server code below. This code is called repeatedly, and the error occurs only just after server_SD reaches the value 1022, so I assume I am hitting the limit of 1024 prescribed by "ulimit -n". What I don't understand is that I am closing the socket, which should make the fd reusable, but this does not seem to be happening.

Notes: I am using Linux, and yes, the client is closed too. I am not a root user, so raising the limits is not an option. I should have a maximum of 20 (or so) sockets open at any one time. Over the lifetime of my program I expect to open and close close to 1,000,000 sockets, hence the strong need for reuse.

  server_SD = socket (AF_INET, SOCK_STREAM, 0);
  bind (server_SD, (struct sockaddr *) &server_address, server_len);
  listen (server_SD, 1);
  client_SD = accept (server_SD, (struct sockaddr *) &client_address, &client_len);
  // read, write etc...
  shutdown (server_SD, 2);
  close (server_SD);

Does anyone know how to guarantee closure and reusability?

Thanks.

A: 

Perhaps your problem is that you're not specifying the SO_REUSEADDR flag?

From the socket manpage:

SO_REUSEADDR Indicates that the rules used in validating addresses supplied in a bind(2) call should allow reuse of local addresses. For PF_INET sockets this means that a socket may bind, except when there is an active listening socket bound to the address. When the listening socket is bound to INADDR_ANY with a specific port then it is not possible to bind to this port for any local address.
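
For reference, here is a minimal sketch of setting the flag, not the asker's actual code. The helper name open_reusable_listener and the choice to bind to the loopback address are assumptions made for illustration. The important detail is that setsockopt() is called before bind().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the question): create a TCP listener on
 * the given port with SO_REUSEADDR set. Returns the descriptor, or -1. */
int open_reusable_listener(int port)
{
    assert(port >= 0 && port <= 65535);

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd == -1)
        return -1;

    int yes = 1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK); /* assumption: loopback only */
    addr.sin_port = htons((in_port_t)port);

    /* The option has to be set before bind() to affect the bind. */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes) == -1 ||
        bind(sd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(sd, 1) == -1) {
        close(sd); /* do not leak the descriptor on the error path */
        return -1;
    }
    return sd;
}
```

Note that this option only affects whether the address can be bound again. It does not release file descriptors by itself.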

ire_and_curses
Yes, I am using "SO_REUSEADDR": int yes = 1; setsockopt(server_SD,SOL_SOCKET,SO_REUSEADDR,
A: 

Are you using fork()? If so, your children may be inheriting the open file descriptors. If this is the case, you should have each child close any fds that don't belong to it.

Hasturkun
No, I am not using fork. My client processes are spawned using a system call and do not inherit anything from the server/parent.
A: 

This looks like it might be a TIME_WAIT problem. IIRC, TIME_WAIT is one of the states a TCP socket can be in. It is entered, on the side that closed first, once both sides have closed the connection: the system keeps the socket around for a while so that delayed packets are not accepted as valid payload for subsequent connections.

You should maybe have a look at this (bottom of page 99 and top of page 100), and maybe at that other question.

Florian
I have used: setsockopt(server_SD, SOL_SOCKET, SO_LINGER, with l set to 0, so no, the socket should not linger. Also, a look at netstat tells me that the connections are going away quickly, but 'ls /proc/<pid>/fd' tells me that the file descriptors are not being released.
TIME_WAIT sockets should not consume file descriptors if they have been closed by the process(es) which own them.
MarkR
Use SO_REUSEADDR instead of SO_LINGER unless you understand _exactly_ what both options do.
Kristof Provost
I believe I do. Are you saying they should not be used together?
+1  A: 

From your description, it looks like you are opening a server socket for each accept(2). That is not necessary. Create the server socket once, bind(2) it, listen(2), then call accept(2) on it in a loop (or better yet, give it to poll(2)).
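
A rough sketch of that structure, under stated assumptions: the names open_listener and serve_connections are made up for illustration, the listener binds to loopback, and the "ok" reply is only a placeholder for the real read/write exchange. The point is that one listening descriptor lives for the whole run, while each accepted descriptor is closed as soon as its connection is done.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: one listening socket on 127.0.0.1:port, created once. */
int open_listener(int port)
{
    struct sockaddr_in addr;
    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons((in_port_t)port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(sd, 8) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Accept `count` connections on the same listener.
 * Returns the number of connections actually served. */
int serve_connections(int listen_sd, int count)
{
    assert(listen_sd >= 0);

    int served = 0;
    while (served < count) {
        int client_sd = accept(listen_sd, NULL, NULL);
        if (client_sd == -1)
            break;

        /* Placeholder for the real exchange with the client. */
        send(client_sd, "ok\n", 3, MSG_NOSIGNAL);

        close(client_sd); /* every accepted descriptor gets its own close() */
        served++;
    }
    return served;
}
```

The caller would create the listener once, call serve_connections() as often as needed, and close the listener only at shutdown.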

Nikolai N Fetissov
Correct, I am opening the server socket for each accept. However, it is not possible to reorganize my program this way; I need to maintain a strict 1:1 coupling between ports and servers. Besides, this doesn't explain why the fd is failing to be freed.
Are you binding that server socket to a different port every time?
Nikolai N Fetissov
A: 

One needs to close the client socket before closing the server socket (the reverse of the order in my code above!)
Thanks to all who offered suggestions!
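
A minimal sketch of that teardown order, assuming client_sd is the descriptor returned by accept() and server_sd is the listening descriptor returned by socket(). The helper name close_connection is hypothetical. The descriptor returned by accept() is separate from the listening one, so each needs its own close().

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: release both descriptors used for one connection,
 * client first, then server. Returns 0 only if both close() calls succeed. */
int close_connection(int client_sd, int server_sd)
{
    assert(client_sd != server_sd);

    int rc = 0;

    /* Best effort: shutdown() may fail if the peer is already gone. */
    shutdown(client_sd, SHUT_RDWR);

    if (close(client_sd) == -1) /* frees the fd returned by accept() */
        rc = -1;
    if (close(server_sd) == -1) /* frees the fd returned by socket() */
        rc = -1;
    return rc;
}
```

In the code from the question, this would replace the final shutdown()/close() pair, called as close_connection(client_SD, server_SD).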

A: 

Run your program under valgrind with the --track-fds=yes option:

valgrind --track-fds=yes myserver

You may also need --trace-children=yes if your program uses a wrapper or it puts itself in the background.

If it doesn't exit on its own, interrupt it or kill the process with "kill pid" (not -9) after it accumulates some leaked file descriptors. On exit, valgrind will show the file descriptors that are still open and the stack trace corresponding to where they were created.

Running your program under strace to log all system calls may also be helpful. Another helpful command is /usr/sbin/lsof -p pid to display all currently used file descriptors and what they are being used for.
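
If you would rather check from inside the program, something like the following sketch may help. It is Linux-specific (it assumes /proc is mounted), and the function name count_open_fds is made up for illustration. Logging the result once per connection should show whether the count keeps climbing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>

/* Hypothetical helper: count this process's open descriptors by listing
 * /proc/self/fd. Returns -1 if the directory cannot be opened. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int own_fd = dirfd(dir); /* descriptor used by the listing itself */
    assert(own_fd >= 0);

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue; /* skip "." and ".." */
        if (atoi(entry->d_name) == own_fd)
            continue; /* do not count our own directory handle */
        count++;
    }
    closedir(dir);
    return count;
}
```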

mark4o