I'm working on an application that is divided into a thin client and a server, communicating over TCP. We frequently let the server make asynchronous calls (notifications) to the client to report state changes. This prevents the server from wasting time waiting for an acknowledgement from the client. More importantly, it avoids deadlocks.

Such a deadlock can happen as follows. Suppose the server sent the state-changed notification synchronously (note that this is a somewhat contrived example). When the client handles the notification, it needs to synchronously ask the server for information. However, the server cannot respond, because it is waiting for an answer to its own question.

Now, this deadlock is avoided by sending the notification asynchronously, but that introduces another problem. When asynchronous calls are made faster than they can be processed, the call queue keeps growing. If this situation lasts long enough, the call queue fills up completely (is flooded with messages). My question is: what can be done when that happens?
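
To make the flooding concrete, here is a minimal sketch (Python; the queue size, timings and message format are invented for illustration): a producer posts notifications without blocking, a slower consumer drains them, and once the bounded queue is full the sender has to decide what to do.

    import queue
    import threading
    import time

    notifications = queue.Queue(maxsize=100)        # bounded call queue

    def client_worker():
        # Slow consumer: handles one notification every 10 ms.
        while True:
            notifications.get()
            time.sleep(0.01)                         # simulate processing cost
            notifications.task_done()

    threading.Thread(target=client_worker, daemon=True).start()

    dropped = 0
    for i in range(10_000):                          # fast producer, no pause between sends
        try:
            notifications.put_nowait(("state-changed", i))   # asynchronous, non-blocking send
        except queue.Full:
            dropped += 1                             # queue flooded: block, drop, or push back?

    print("dropped", dropped, "of 10000 notifications")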

My problem can be summarized as follows. Do I really have to choose between sending notifications without blocking, at the risk of flooding the message queue, and blocking when sending notifications, at the risk of introducing a deadlock? Is there some trick to avoid flooding the message queue?

Note: To repeat, the server does not stall when sending notifications. They are sent asynchronously.

Note: In my example I used two communicating processes, but the same problem exists with two communicating threads.

+3  A: 

If the server is sending informational messages to the client, which you yourself say are asynchronous, it should not have to wait for a reply from the client. If they are not informational, in other words if they require an answer, I would say a server should never send such messages to a client; their presence indicates a poor design.

anon
You are right, but I am looking for a way to avoid flooding the message queue.
Dimitri C.
Which message queue exactly - the thread message queue implemented by Windows?
sharptooth
@Dimitri If you remove the server messages that wait, there should be no flood.
anon
@sharptooth No, we have implemented our own message queue in our IPC library. However, I was talking about message queues in general, no matter how they are implemented.
Dimitri C.
@Neil Butterworth: if the notifications are sent faster than they can be processed, flooding will occur eventually.
Dimitri C.
Well, that is a different question. If your server is slower than your clients, there is no solution apart from discarding messages.
anon
You are probably right. Regrettably, I fear this message filtering is a costly operation.
Dimitri C.
+2  A: 

Depending on how important these messages are you might want to look into Message Expiration, or perhaps a Message Filter, though it sounds like your architecture may be incorrect.
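
One hedged reading of Message Expiration and Message Filter in this setting: give each queued notification a time-to-live and drop it at dequeue time if it has expired, and let a newer state change for the same key supersede an older queued one. A rough Python sketch (the class and field names are made up):

    import time
    from collections import OrderedDict

    class ExpiringQueue:
        """Keeps at most one pending notification per key and drops expired ones."""

        def __init__(self, ttl_seconds=1.0):
            self.ttl = ttl_seconds
            self.items = OrderedDict()       # key -> (deadline, payload)

        def put(self, key, payload):
            # A newer state change for the same key supersedes the queued one.
            self.items.pop(key, None)
            self.items[key] = (time.monotonic() + self.ttl, payload)

        def get(self):
            # Skip anything that expired while sitting in the queue.
            while self.items:
                key, (deadline, payload) = self.items.popitem(last=False)
                if time.monotonic() <= deadline:
                    return key, payload
            return None

    q = ExpiringQueue(ttl_seconds=0.5)
    q.put("progress", 10)
    q.put("progress", 20)      # supersedes the earlier progress notification
    print(q.get())             # ('progress', 20)

Coalescing by key bounds the queue by the number of distinct state items rather than by the notification rate, at the cost of the extra bookkeeping mentioned in the comments below.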

RichardOD
Our architecture largely works as follows: the server asynchronously sends state changes, which the thin client uses to keep the user interface up-to-date. However, for some notifications, the client needs to get additional info from the server. Do you think this is a wrong way of doing things?
Dimitri C.
You are right, part of my problem can be solved by filtering the messages on the queue. However, this filtering degrades performance. Also, it doesn't provide a solution for all cases. It feels like you still have to hope the queue doesn't flood.
Dimitri C.
+1  A: 

I would rather fix the logic on the server side. The message queue should not stall waiting for the answer. Instead, have a state machine that can also receive those info queries while it is waiting for the answer from the client.

Of course you can still flood your message queue, but with TCP you can handle it pretty easily.
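
A possible shape for such a state machine on the server side, sketched in Python (the message format, correlation ids and callback style are assumptions, not your IPC library's API): instead of blocking on the client's answer, the server parks the outstanding question under an id and keeps servicing whatever arrives next.

    import itertools

    class ServerStateMachine:
        """Never blocks: outstanding questions are parked until their answer arrives."""

        def __init__(self, send):
            self.send = send                    # callback that writes a message to the socket
            self.pending = {}                   # correlation id -> continuation
            self.ids = itertools.count(1)

        def ask_client(self, question, on_answer):
            corr_id = next(self.ids)
            self.pending[corr_id] = on_answer   # park the continuation instead of waiting
            self.send({"type": "question", "id": corr_id, "body": question})

        def on_message(self, msg):
            if msg["type"] == "answer":
                self.pending.pop(msg["id"])(msg["body"])   # resume the parked work
            elif msg["type"] == "info-query":
                # Info queries are answered immediately, even while questions are pending.
                self.send({"type": "info", "id": msg["id"], "body": "current state"})

    outbox = []
    sm = ServerStateMachine(outbox.append)
    sm.ask_client("ready?", lambda ans: outbox.append({"type": "log", "body": ans}))
    sm.on_message({"type": "info-query", "id": 7})          # handled despite the pending question
    sm.on_message({"type": "answer", "id": 1, "body": "yes"})
    print(outbox)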

Makis
No, in my main example, the server does not stall when sending notifications (I know, maybe I have made my question too confusing).
Dimitri C.
The flooding problem exists at the TCP level as well: if you add packets to the TCP queue too fast, it will flood; and what do you do then?
Dimitri C.
As I said in my second paragraph, yes, it's still possible. But the other end of the TCP link should have a timeout for the packets; if it doesn't receive a reply, it should resend. Of course you can always flood a machine; just pump enough requests at it (use a hundred clients if required).
Makis
As for not stalling, what do you mean by "However, the server cannot respond, because it is waiting for an answer to its own question"? To me this sounds like stalling. If it is not, it should be able to respond to the info query, no?
Makis
(About the stalling issue) Yes, you are right: in my deadlock example the server *is* stalling, because it sent the notification synchronously. That problem is solved by sending it asynchronously (i.e. non-blocking), but that introduces the problem of queue flooding.
Dimitri C.
And queue flooding is bound to happen if you can send more messages to the server than it has time to handle. As I said, that's what TCP is for. The TCP stack detects that it didn't receive the ACK from the server and resends the request after an interval. This slows messaging down but keeps it running, and no packets are lost. The server side will simply discard any messages that do not fit into the queue and will not acknowledge them. This is at the core of the TCP protocol; it is what it was designed for.
Makis
Again, you are right about TCP; it only moves the problem to whoever tries to send something over TCP. What do you do if the outgoing TCP queue is full?
Dimitri C.
Then you need to wait until it clears. The only way I can think of for the outgoing queue to fill up is if the other end never replies and you don't have a maximum number of retransmissions (after which the attempt is abandoned). Check out http://www.uic.rsu.ru/doc/inet/tcp_stevens/tcp_time.htm, especially chapter 23.
Makis
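
For the "outgoing TCP queue is full" case discussed above, "wait until it clears" can be made explicit with a non-blocking socket: a full send buffer surfaces as a would-block error, and select() reports when the socket is writable again. A rough sketch (Python; assumes sock is an already-connected TCP socket):

    import select

    def send_all(sock, data, timeout=5.0):
        # Send on a non-blocking socket, waiting when the OS send buffer is full.
        sock.setblocking(False)
        view = memoryview(data)
        while view:
            try:
                sent = sock.send(view)
                view = view[sent:]
            except BlockingIOError:
                # Kernel send buffer is full: TCP flow control has kicked in.
                # Wait (bounded) until the peer drains data and the socket is writable again.
                ready = select.select([], [sock], [], timeout)[1]
                if not ready:
                    raise TimeoutError("peer is not draining the connection")
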
+3  A: 

If you have a constant congestion problem, there is little you can do other than gracefully fail and notify the client that no new messages can be posted; then it is up to the client to maintain a backlog of messages to be posted.

Introducing a priority queue and using message expiration/filtering could allow you to free up space in the queue, but that really just postpones the problem. If possible, you could also aggregate messages or ignore duplicate messages, but again the problem does not seem to be the queue itself. (Not to mention that the more complex queue logic could eat up valuable resources that would be better used actually processing messages.)

Depending on what the server side does, you could introduce result hashing for long computations, offload some types of messages to a dedicated device, check whether the server waits unreasonably long on I/O operations, and apply a myriad of other techniques. Profile if possible; at least try to find out which message(s) cause the congestion.
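
"Result hashing" could be as simple as caching the results of expensive, repeatable queries so that a burst of identical requests costs only one computation. A hypothetical sketch (the request format and compute callback are invented):

    import hashlib
    import json

    _results = {}                            # request hash -> previously computed result

    def handle_query(request, compute):
        # Serve repeated identical queries from a cache instead of recomputing them.
        key = hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()
        if key not in _results:
            _results[key] = compute(request)     # only the first identical query pays the cost
        return _results[key]

    # A burst of identical info queries triggers a single computation.
    print(handle_query({"op": "sum", "args": [1, 2, 3]}, lambda r: sum(r["args"])))
    print(handle_query({"op": "sum", "args": [1, 2, 3]}, lambda r: sum(r["args"])))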

Oh, and the business solution: Compare cost of estimated development time to the cost of better hardware and conclude that you should just buy a more powerful server (or an additional one).

Christoffer
Great answer! Thanks a lot!
Dimitri C.
+1  A: 

The best way, I believe, would be to add another state to your client. This I borrowed from the SMPP protocol specs.

Add a congestion state to the client: it checks its queue length (assuming this is possible), and once a certain threshold is reached, say 1000 unprocessed messages, it sends the server a message indicating that it is congested. The server is then required to cease all messaging until it receives a notification that the client is no longer congested.

Alternatively, on the server side, if a certain number of replies are pending, the server could simply stop sending messages until the client has replied to a certain number of them.

These thresholds can be calculated dynamically or fixed, depending on your needs.
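
A sketch of the first variant (Python; the thresholds, message shapes and class names are all illustrative): the client watches its own queue depth and tells the server when to pause and when to resume, and the server checks a flag before each notification.

    import queue

    HIGH_WATER, LOW_WATER = 1000, 200        # illustrative thresholds

    class Client:
        def __init__(self, send_to_server):
            self.inbox = queue.Queue()
            self.send = send_to_server
            self.congested = False

        def on_notification(self, msg):
            self.inbox.put(msg)
            if not self.congested and self.inbox.qsize() >= HIGH_WATER:
                self.congested = True
                self.send({"type": "congested"})       # ask the server to stop sending

        def process_one(self):
            self.inbox.get()
            # ... apply the state change to the user interface ...
            if self.congested and self.inbox.qsize() <= LOW_WATER:
                self.congested = False
                self.send({"type": "resumed"})         # safe to send again

    class Server:
        def __init__(self):
            self.paused = False

        def on_control(self, msg):
            self.paused = (msg["type"] == "congested")

        def notify(self, send_to_client, msg):
            if not self.paused:
                send_to_client(msg)
            # else: buffer locally, coalesce, or drop, according to policy

Two watermarks (pause at 1000, resume at 200 here) avoid flip-flopping around a single threshold; the numbers are placeholders.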

partoa
Great answer! Throttling the sender seems like the best option. However, I think this will make the application even more complex :-(
Dimitri C.
All you need is a little control engineering, which may be complex but is worthwhile in the long run. Depending on your current implementation it may even be simple. For example, if one server thread/process sends messages and another accepts replies, and you have a database of all sent messages with their respective keys, all you would have to do is add a field marking replied messages and make the sender process sleep whenever the number of pending replies is greater than x, while the receiver process updates this field on receipt of every message.
partoa