views:

1682

answers:

7

For work, I need to write a TCP daemon to respond to our client software, and I was wondering if anyone had any tips on the best way to go about this.

Should I fork for every new connection? Normally I would use threads.

+5  A: 

It depends on your application. Threads and forking can both be perfectly valid approaches, as well as the third option of a single-threaded event-driven model. If you can explain a bit more about exactly what you're writing, it would help when giving advice.

For what it's worth, here are a few general guidelines:

  • If you have no shared state, use forking.
  • If you have shared state, use threads or an event-driven system.
  • If you need high performance under very large numbers of connections, avoid forking as it has higher overhead (particularly memory use). Instead, use threads, an event loop, or several event loop threads (typically one per CPU).

Generally, forking is the easiest to implement, as you can essentially ignore all other connections once you fork. Threads are the next hardest, due to the additional synchronization requirements. An event loop is harder still, because you need to turn your processing into a state machine. Multiple threads running event loops is the most difficult of them all, since it combines the difficulties of both.
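
To make the fork-per-connection option concrete, here is a minimal sketch in Python (the thread does not name a language, so that choice is an assumption). The echo behaviour in `handle_client` and the `max_connections` parameter are hypothetical, added only so the example is small and can terminate; a real daemon would run its own protocol and loop forever.

```python
import os
import signal
import socket
from typing import Optional


def handle_client(conn: socket.socket) -> None:
    """Echo bytes back until the client closes the connection (placeholder protocol)."""
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def serve_forking(listener: socket.socket, max_connections: Optional[int] = None) -> None:
    """Accept connections on `listener` and fork one child process per client."""
    # Ignoring SIGCHLD asks the kernel to reap exited children,
    # so they do not pile up as zombies.
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: owns exactly one connection and can ignore all the others.
            listener.close()
            try:
                handle_client(conn)
            finally:
                os._exit(0)
        # Parent: the child has its own copy of the socket, so drop ours.
        conn.close()
        served += 1
```

Note that the child never touches any state belonging to other connections, which is why this model needs no locking.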

bdonlan
The memory overhead of forking isn't too bad, due to the way that the process memory is duplicated copy-on-write. The main downside is the time taken to fork(), but if you're not creating and destroying connections with high frequency that shouldn't be a showstopper.
caf
True, but it can be a lot more than a small struct with a file descriptor and input buffer :)
bdonlan
+2  A: 

I'd suggest forking for connections over threads any day. The problem with threads is the shared memory space, and how easy it is to manipulate the memory of another thread. With forked processes, any communication between the processes has to be intentionally done by you.

I just searched and found this SO question: "What is the purpose of fork?" You obviously know the answer to that, but the #1 answer in that thread makes good points about the advantages of fork().

hobodave
An additional benefit is that an aborting thread can get stranded or crash your entire application. As suggested in another answer, you can set up a pool of worker processes. Consider having a simple delegator process to queue up work that the worker processes cooperatively de-queue. This setup gives you better control of system resources.
Bill Hoag
+1  A: 

Apart from @hobodave's good answer, another benefit of "forking per connection" is that you can implement your server very simply by running it under inetd, tcpserver, or the like. You can then use standard input and standard output to communicate with the socket, and you don't have to do any listening-socket management (binding, listening, accepting connections, and so on).
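
As a rough illustration of how small such a server can be, here is a sketch of a handler that only reads from an input stream and writes to an output stream. The line-based `ECHO`/`QUIT` protocol and the name `serve_stdio` are made up for the example; they are not part of inetd or tcpserver.

```python
import sys
from typing import BinaryIO


def serve_stdio(inp: BinaryIO, out: BinaryIO) -> None:
    """Answer each request line read from `inp` with a reply written to `out`."""
    for line in inp:
        request = line.strip()
        if request == b"QUIT":
            out.write(b"BYE\r\n")
            out.flush()
            break
        # Hypothetical protocol: echo the request back with a prefix.
        out.write(b"ECHO " + request + b"\r\n")
        # Flush after every reply so the client is not left waiting on a buffer.
        out.flush()
```

When the superserver has connected the socket to the process's standard streams, the entry point would simply call `serve_stdio(sys.stdin.buffer, sys.stdout.buffer)`. The same function can be exercised without any network at all by passing in-memory streams.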

Chris Jester-Young
+1  A: 

Another option, of course, is pre-forking several copies of the daemon and having each one stay alive and continue to answer requests. It all depends on your application, expected load, and performance requirements, among other things.

The easiest and simplest way is to write an inetd-based daemon; your software can ignore the fact that it is running over a TCP connection and simply handle input/output via stdin/stdout. That works well in the vast majority of cases.
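
For comparison, the pre-forking option mentioned at the start of this answer might look roughly like the sketch below. All worker processes inherit the same listening socket and each calls `accept()` on it. The names `prefork` and `worker_loop`, the reply format, and the `requests_per_worker` limit are assumptions for illustration; the limit is there so the sketch terminates, whereas a real daemon would keep its workers alive and respawn any that exit.

```python
import os
import socket
from typing import List


def worker_loop(listener: socket.socket, requests_per_worker: int) -> None:
    """Accept and answer connections on the shared listening socket."""
    for _ in range(requests_per_worker):
        conn, _addr = listener.accept()
        with conn:
            data = conn.recv(4096)
            # Tag the reply with the worker's pid to show who answered.
            conn.sendall(b"worker %d: " % os.getpid() + data)


def prefork(listener: socket.socket, num_workers: int,
            requests_per_worker: int) -> List[int]:
    """Fork `num_workers` children up front and return their pids."""
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: serve requests, then exit without returning to the caller.
            try:
                worker_loop(listener, requests_per_worker)
            finally:
                os._exit(0)
        pids.append(pid)
    return pids
```

The parent is expected to `os.waitpid()` on the returned pids (or replace workers as they exit). The cost of `fork()` is paid once at startup rather than on every connection.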

Wilson
+1  A: 

If you're not planning to be hammered with many new connections per second, consider running from inetd. Otherwise...

Download the OpenSSH source. They've put a lot of work into getting the privilege separation just right, it's portable, and it's been scrutinized for security more than just about anything else.

Adapt it for your needs; you can probably throw out most of it. Comply with the license agreement, of course. Track future patches with a good source code control system.

Don't worry about the performance of forking processes vs threads until you have good evidence it's a real issue. Apache went for years and years running the busiest sites with just the simple process-per-client model.

If you're really ambitious, you could use some kind of a non-blocking asynchronous IO model. I like Boost.Asio, but I'm heavy into C++.

Make sure your code handles signals correctly: HUP to reload configuration, TERM to shut down gracefully.
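
One common way to structure this is to have the handlers do nothing but set flags, and let the main loop act on them at a safe point. The sketch below assumes that pattern; `DaemonState`, `load_config`, and `do_work` are hypothetical names standing in for whatever your daemon actually does.

```python
import signal
from typing import Any, Callable


class DaemonState:
    """Flags set by signal handlers and polled by the main loop."""

    def __init__(self) -> None:
        self.reload_requested = False
        self.shutdown_requested = False


def install_signal_handlers(state: DaemonState) -> None:
    """Route SIGHUP to a reload request and SIGTERM to a shutdown request."""

    def on_hup(signum, frame):
        state.reload_requested = True

    def on_term(signum, frame):
        state.shutdown_requested = True

    signal.signal(signal.SIGHUP, on_hup)
    signal.signal(signal.SIGTERM, on_term)


def run(state: DaemonState,
        load_config: Callable[[], Any],
        do_work: Callable[[Any], None]) -> None:
    """Main loop: reload configuration on HUP, leave the loop cleanly on TERM."""
    config = load_config()
    while not state.shutdown_requested:
        if state.reload_requested:
            state.reload_requested = False
            config = load_config()
        # One bounded unit of work, so the flags are checked regularly.
        do_work(config)
```

Keeping the handlers this small avoids doing real work (file I/O, logging, closing sockets) at an arbitrary point in the program.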

Don't try to write your own log file. Use syslog only, or just write to stderr, which can be redirected to syslog. It's a real pain trying to set up logrotate on home-rolled servers that all log slightly differently.
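
A small sketch of that advice using Python's standard `logging` module: the daemon logs either to the local syslog socket or to stderr, and never opens a log file of its own. The function name `make_logger` and the message format are illustrative, and `/dev/log` is the usual syslog socket path on Linux, which is an assumption about the host system.

```python
import logging
import logging.handlers
import sys


def make_logger(name: str, use_syslog: bool) -> logging.Logger:
    """Return a logger that writes to syslog or to stderr, never to its own file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if use_syslog:
        # Assumes a local syslog daemon listening on /dev/log (typical on Linux).
        handler: logging.Handler = logging.handlers.SysLogHandler(address="/dev/log")
    else:
        # stderr can be redirected to syslog by whatever supervises the daemon.
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(name)s[%(process)d]: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

For example, `make_logger("mydaemon", use_syslog=False).info("listening on port %d", 4000)` writes a line of the form `mydaemon[<pid>]: INFO listening on port 4000` to stderr.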

Marsh Ray
A: 

If you want to avoid threading and forking altogether, I would recommend using non-blocking I/O throughout, along with libevent. Libevent is fairly well known as a high-performance solution for event-driven programming.
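
To show the shape of the event-driven model, here is a single-threaded echo loop. It uses Python's standard `selectors` module rather than libevent, so it illustrates the same idea and not libevent's API. The `stop` event and the 0.1 second poll timeout are assumptions added so the loop can be shut down cleanly.

```python
import selectors
import socket
import threading


def run_event_loop(listener: socket.socket, stop: threading.Event) -> None:
    """Serve every client from a single thread until `stop` is set."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    outgoing = {}  # per-connection state: bytes still waiting to be sent

    def close(conn: socket.socket) -> None:
        sel.unregister(conn)
        outgoing.pop(conn, None)
        conn.close()

    while not stop.is_set():
        for key, events in sel.select(timeout=0.1):
            sock = key.fileobj
            if sock is listener:
                try:
                    conn, _addr = listener.accept()
                except BlockingIOError:
                    continue  # the pending connection went away
                conn.setblocking(False)
                outgoing[conn] = bytearray()
                sel.register(conn, selectors.EVENT_READ)
                continue
            try:
                if events & selectors.EVENT_READ:
                    data = sock.recv(4096)
                    if not data:
                        close(sock)  # client closed its end
                        continue
                    outgoing[sock] += data
                if outgoing[sock]:
                    sent = sock.send(outgoing[sock])
                    del outgoing[sock][:sent]
            except BlockingIOError:
                pass  # not ready after all; wait for the next event
            except ConnectionError:
                close(sock)
                continue
            # Only ask for write events while there is unsent data.
            wanted = selectors.EVENT_READ
            if outgoing[sock]:
                wanted |= selectors.EVENT_WRITE
            sel.modify(sock, wanted)

    for conn in list(outgoing):
        close(conn)
    sel.unregister(listener)
    sel.close()
```

The `outgoing` dictionary is the "state machine" part: because no call may block, anything a connection has not finished doing has to be remembered explicitly between events.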

James
A: 

Look into ACE (C++/Java). It has a number of threaded, event-driven, and forking TCP reactors that can address your communications requirements. You can also look into Boost.Asio, which does something similar.

Chris Kaminski