views: 207
answers: 3

Hello,

I'm currently designing a multi-client / server application. I'm using plain good old sockets because WCF or similar technology is not what I need. Let me explain: it isn't the classical case of a client simply calling a service; all clients can 'interact' with each other by sending a packet to the server, which will then do some action and possibly re-dispatch an answer message to one or more clients. Although doable with WCF, the application would get pretty complex with hundreds of different messages.

For each connected client, I'm of course using asynchronous methods to send and receive bytes. I've got the messages fully working; everything's fine. Except that for each line of code I write, my head just burns because of multithreading issues. Since there could be around 200 clients connected at the same time, I chose to go the fully multithreaded way: each message received on a socket is immediately processed on the thread pool thread it was received on, not on a single consumer thread.

Since each client can interact with other clients, and indirectly with shared objects on the server, I must protect almost every object that is mutable. I first went with a ReaderWriterLockSlim for each resource that must be protected, but quickly noticed that there are more writes overall than reads in the server application, and switched to the well-known Monitor to simplify the code.

So far, so good. Each resource is protected, and I have helper classes that I must use to get a lock and its protected resource, so I can't use an object without getting a lock. Moreover, each client has its own lock that is entered as soon as a packet is received from its socket. This is done to prevent other clients from making changes to this client's state while it has messages being processed, which is something that will happen frequently.
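
To give an idea of what these helpers look like, here is a simplified sketch (the names and details are invented for illustration; it only shows the shape of the thing). The resource can only be reached through an acquired handle, so it's impossible to touch the object without holding its lock:

using System;
using System.Threading;

public sealed class Locked<T>
{
    private readonly object _gate = new object();
    private readonly T _resource;

    public Locked(T resource) { _resource = resource; }

    // Enter the monitor and hand back a disposable handle exposing the resource.
    public Handle Acquire()
    {
        Monitor.Enter(_gate);
        return new Handle(this);
    }

    public sealed class Handle : IDisposable
    {
        private readonly Locked<T> _owner;
        internal Handle(Locked<T> owner) { _owner = owner; }

        public T Resource { get { return _owner._resource; } }

        public void Dispose() { Monitor.Exit(_owner._gate); }
    }
}

// Usage:
// using (var handle = lockedClients.Acquire())
//     handle.Resource.Add(newClient);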

Now, I don't just need to protect resources from concurrent accesses. I must keep every client in sync with the server for some collections I have. One tricky part that I'm currently struggling with is the following:

  • I have a collection of clients. Each client has its own unique ID.
  • When a client connects, it must receive the IDs of every connected client, and each one of them must be notified of the newcomer's ID.
  • When a client disconnects, every other client must know it so that its ID is no longer valid for them.
  • Every client must always have, at any given time, the same clients collection as the server, so that I can assume that everybody knows everybody. This way, if I'm sending a message to client #1 saying "Client #2 has done something", I know it will always be correctly interpreted: Client 1 will never wonder "but who is Client 2 anyway?".

My first attempt at handling the connection of a new client (let's call it X) was this pseudo-code (remember that newClient is already locked here):

lock (clients) {                        // the collection lock is taken first here...
  foreach (var client in clients) {
    lock (client) {                     // ...then each individual client's lock in turn
      client.Send("newClient with id X has connected");
    }
  }
  clients.Add(newClient);
  newClient.Send("the list of other clients");
}

Now imagine that at the same time, another client has sent a packet that translates into a message that must be broadcast to every connected client. The pseudo-code looks something like this (remember that the current client - let's call it Y - is already locked here):

// Y's own lock is already held when we get here
lock (clients) {                        // ...so this waits if another thread already holds the collection lock
  foreach (var client in clients) {
    lock (client) {
      client.Send("something");
    }
  }
}

An obvious deadlock occurs here: on the first thread, X's lock is held, the clients lock has been entered and the loop over the clients has started; at some point it must acquire Y's lock... which is already held by the second thread, itself waiting for the clients collection lock to be released!

This is not the only case like this in the server application. There are other collections that must be kept in sync with the clients, some properties on a client can be changed by another client, etc. I tried other types of locks, lock-free mechanisms and a bunch of other things. Either there were obvious deadlocks when I used too many locks for safety, or obvious race conditions otherwise. When I finally find a good middle ground between the two, it usually comes with very subtle race conditions, deadlocks and other multithreading issues... my head hurts very quickly, since for every single line of code I write I have to review almost the whole application to ensure everything will behave correctly with any number of threads.

So here's my final question: how would you resolve this specific case, and the general case, and more importantly, am I going the wrong way here? I have few problems with the .NET framework, C#, simple concurrency or algorithms in general. Still, I'm lost here. I know I could use only one thread to process the incoming requests and everything would be fine. However, that won't scale well at all with more clients... But I'm thinking more and more about going this simple way. What do you think?

Thanks in advance to you, StackOverflow people who have taken the time to read this huge question. I really had to explain the whole context if I wanted to get some help.

+4  A: 

If you're having problems with locking, race conditions, etc. due to the multi-threaded nature of your app, it would be hard for anyone to give an instant solution. These kinds of problems can be very intermittent and cannot always be easily reproduced, which makes them hard to diagnose even for someone sitting right in front of all of the code. But I will offer an alternative: consider using some kind of message queue as your publish-subscribe backbone. Such an architecture can help simplify a lot of your boilerplate code. As I said, this might or might not solve your problem instantly, but it hopefully shares a different approach with you.
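
To make the idea concrete, here is a rough, untested sketch of what that backbone could look like in C#, assuming .NET 4's BlockingCollection is available (all names here are invented for illustration). Every receive callback only enqueues work; one dedicated thread drains the queue, so all message handlers run one at a time, in order, and the shared state needs no further locking:

using System;
using System.Collections.Concurrent;
using System.Threading;

public class MessageBackbone
{
    // Thread-safe queue of pending work items posted by the socket callbacks.
    private readonly BlockingCollection<Action> _inbox = new BlockingCollection<Action>();

    public MessageBackbone()
    {
        // A single dedicated consumer thread processes messages in order.
        var worker = new Thread(() =>
        {
            foreach (var handler in _inbox.GetConsumingEnumerable())
                handler();
        });
        worker.IsBackground = true;
        worker.Start();
    }

    // Called from any receive callback / thread pool thread.
    public void Post(Action handler)
    {
        _inbox.Add(handler);
    }
}

// Usage from a receive callback, e.g.:
// backbone.Post(() => BroadcastToAll("newClient with id X has connected"));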

Khnle
+1 I was going to suggest the same approach :)
Cory Grimster
+1 Me too! But just to add a little more detail, the basic mechanism is that you add incoming messages to a thread-safe queue, then you have a dedicated thread that processes the messages in the queue. This way you get single-threaded in-order message processing, which simplifies things. Obviously you can get pretty fancy with this, but getting it working correctly to begin with will stop your head burning. :)
chibacity
Agree - this seems like a classic pub-sub architecture. Reimplementing it all yourself is a challenging task. Fun, but challenging. Have you considered using the System.Messaging namespace for this, or some other pub/sub solution like AMQP?
Rob Goodwin
pub/sub seems to be way more elaborate than what I anticipated; currently googling more about the subject.
Julien Lebosquain
+1  A: 

I really don't know anything about .NET, but I can share my few experiences with asynchronous programming in the C and Linux world.

First of all, take this with gallons and gallons of salt, but: using threads (rather than processes) is often a bad idea. Processes only share the information you want to share (through message passing), while threads share everything. Because it isn't safe for every thread to touch every object, you end up having to explicitly guard whatever is shared with locks and whatnot. Working with processes is often easier because you only have to specify what you do share. I can't remember where I read this, but someone compared multithreaded programming to the style of programming you'd have to follow on a system without memory management (e.g. DOS) or in an operating system kernel. That type of programming is often unnecessary in userspace because the OS and MMU (memory management unit) take care of the isolation for you.

One example of a large, asynchronous program that doesn't use threads is PostgreSQL. In fact, threading is listed under "Features We Do Not Want" on its Todo list (see here). Granted, there may be cases now or in the future where threads could speed up tasks (because they're cheaper to instantiate than processes), but they aren't (and won't be any time soon) the main vehicle of asynchronous programming in PostgreSQL.

An alternative to threads and processes is to simply use one thread and one process, but have an event loop and quick handlers. However, drawbacks to this approach include:

  • Your code has to be chopped up into pieces that don't sleep. Instead of calling a function that simply downloads a URL and returns a result, you have to supply a callback for when the result is ready and also have your main loop respond to events related to downloading a URL (e.g. a single packet arrived).
  • You might not be able to avoid sleeping, or it may be unduly difficult.
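
As a rough illustration of that chopping-up, here is a small C# example (take it with the same salt as the rest; WebClient is used purely as an illustrative stand-in for "download a URL"):

using System;
using System.Net;

class CallbackStyleExample
{
    static void Main()
    {
        var client = new WebClient();

        // Blocking style: this call would sleep right here until the whole
        // page had arrived, stalling everything else on this thread.
        // string page = client.DownloadString("http://example.com/");

        // Chopped-up style: register a callback for "the result is ready"
        // and return immediately; the handler runs later, when the event fires.
        client.DownloadStringCompleted += (sender, e) =>
            Console.WriteLine("Got {0} characters", e.Result.Length);
        client.DownloadStringAsync(new Uri("http://example.com/"));

        Console.WriteLine("Download started; free to do other work meanwhile...");
        Console.ReadLine(); // keep the process alive until the callback has run
    }
}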

I would recommend the single-process, single-thread approach for a relatively simple daemon. However, if the role of that daemon starts getting large and the code gets complicated, it may be time to split it up into separate processes.

Joey Adams
Erlang comes to mind. Scalable lightweight processes that communicate using message passing semantics.
chibacity
While your answer is 100% correct and I upvoted it, in my case almost every object is shared because of the way clients can modify almost everything. The few resources that don't need sharing are either immutable or accessed by a single thread.
Julien Lebosquain
@Julien Lebosquain Consider a database server and its clients. The clients can arguably modify "just about anything" on the database. However, the client and server are well-separated and talk to each other through a protocol. Of course, it would be a lot of needless work to implement a SQL interface between your clients and server. You could use a simple encoding indicating the intention of the client to the server. Perhaps .NET even lets you pass functions bound to parameters (closures) to/from client and server.
Joey Adams
You're right, but the big difference here is that a database server doesn't push information back to the other connected clients when one of them has updated something. It just waits for another request from the clients, which is not applicable here since every client has to be kept up-to-date.
Julien Lebosquain
@Julien Lebosquain Au contraire. PostgreSQL, for one, has `NOTIFY` and `LISTEN` which can be used to wake up clients.
Joey Adams
+2  A: 

I mentioned Erlang in a previous comment and also queued message processing in another. Erlang is designed from the ground up to support highly concurrent, shared-nothing, message passing style systems.

http://en.wikipedia.org/wiki/Erlang_(programming_language)

Although I have never used it in anger, I've read the book (Programming Erlang) and really like the simple beauty of the concurrent message-passing approach it embodies. After doing a fair amount of complex multi-threaded development, I can appreciate the challenges that Erlang seeks to solve, i.e. the complexities of shared resources and synchronization.

There is a C# project that seeks to embody the concepts of Erlang - Retlang:

http://code.google.com/p/retlang/wiki/GettingStarted

I've never used it, but the message-passing approach is definitely a good one and could be a nice fit for what you are trying to achieve.
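
To give a flavour of the shared-nothing mailbox idea in plain C# (a crude approximation only; Retlang's fibers and channels do this properly, and everything below is invented for illustration): each client is modelled as a "process" that owns its own state and its own mailbox, and the only way to interact with it is to send it a message, so no locks are involved:

using System;
using System.Collections.Concurrent;
using System.Threading;

// A crude Erlang-style "process": private state plus a mailbox,
// both owned by a single thread. Other code can only post messages.
public class ClientProcess
{
    private readonly BlockingCollection<string> _mailbox = new BlockingCollection<string>();
    private readonly Thread _thread;

    public ClientProcess(string id)
    {
        _thread = new Thread(() =>
        {
            foreach (var message in _mailbox.GetConsumingEnumerable())
            {
                // React to the message against this process's private state.
                Console.WriteLine("{0} received: {1}", id, message);
            }
        }) { IsBackground = true };
        _thread.Start();
    }

    // The only way other threads interact with this process.
    public void Send(string message)
    {
        _mailbox.Add(message);
    }
}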

chibacity
Thanks, investigating Erlang and Retlang right now.
Julien Lebosquain
Erlang is awesome. I'm currently writing some POCs and it seems to perfectly suit my needs. Many thanks.
Julien Lebosquain