I started using ZeroMQ this week. With the Request-Reply pattern, I am not sure how a worker can safely "hang up" and close its socket without possibly dropping a message, which would leave the customer who sent that message without a response. Imagine a worker written in Python that looks something like this:

import zmq

c = zmq.Context()
s = c.socket(zmq.REP)
s.connect('tcp://127.0.0.1:9999')
for i in range(8):       # answer eight requests, then hang up
    s.recv()
    s.send(b'reply')     # bytes, so this also runs on Python 3
s.close()

I have been experimenting and have found that a customer at 127.0.0.1:9999 with a zmq.REQ socket can be unlucky: the fair-queuing algorithm may choose the above worker right after the worker has done its last send() but before it runs the following close(). In that case, it seems that the request is received and buffered by the ØMQ stack in the worker process, and that the request is then lost when close() throws out everything associated with the socket.

How can a worker detach "safely" — is there any way to signal "I don't want messages anymore", then (a) check whether one last message arrived during transmission of the signal, (b) handle that message if one came in, and then (c) execute close() with the guarantee that no messages are being thrown away?

Edit: I suppose the raw state that I would want to enter is a "half-closed" state, where no further requests could be received — and the sender would know that — but where the return path is still open so that I can check my incoming buffer for one last arrived message and respond to it if there is one sitting in the buffer.

A: 

Try sleeping before the call to close. This is fixed in 2.1 but not in 2.0 yet.

Trey Stout
@Trey Stout: Synchronisation-by-sleeping scares me. Do you know how this was fixed in 2.1? Did they add an option for half-closed sockets, or does closing tell the sender of unprocessed messages that they need to retransmit elsewhere?
Jack Kelly
Trey, I see that in 2.1 `close()` will not destroy queued outgoing messages. But I see nothing about incoming messages, at least not on the little summary page I am looking at. Could you point us to the changelog or docs at the right place?
Brandon Craig Rhodes
Sorry guys, I have the same info you do. I'm not a contributor to 0mq. As far as I know, 2.1 just flushes on close instead of closing immediately.
Trey Stout
A: 

I've been thinking about this as well. You may want to implement a CLOSE message which notifies the customer that the worker is going away. You could then have the worker drain for a period of time before shutting down. Not ideal, of course, but might be workable.
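
A minimal sketch of the "drain for a period" step, under some assumptions: the worker has already told its customers, by whatever means, that it is going away, and `sock` is a pyzmq REP socket (anything with `poll`, `recv`, `send` and `close` methods will do). The names `drain_then_close`, `grace_ms` and `make_reply` are made up for illustration and are not part of ZeroMQ. This only narrows the window: a request that arrives after the final poll can still be lost.

```python
import time


def drain_then_close(sock, grace_ms, make_reply):
    """Answer late requests for up to grace_ms milliseconds, then close.

    Hypothetical helper: `sock` is assumed to behave like a pyzmq REP
    socket. Returns the number of late requests that were answered.
    """
    deadline = time.monotonic() + grace_ms / 1000.0
    answered = 0
    while True:
        remaining_ms = int((deadline - time.monotonic()) * 1000)
        if remaining_ms <= 0:
            break  # grace period used up
        # poll() returns a non-zero event mask when a request is waiting,
        # and 0 if the timeout passes with nothing to read.
        if not sock.poll(remaining_ms):
            break
        request = sock.recv()
        sock.send(make_reply(request))
        answered += 1
    sock.close()
    return answered
```

With the worker from the question, the final `s.close()` could be replaced by something like `drain_then_close(s, 500, lambda request: b'reply')`, where the 500 ms grace period is an arbitrary choice.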

bneal
Yes, that would be possible. But my hope was to use a REQ customer that did not even know how many servers I had set up to load-balance his requests. Having to move to XREQ and implement my own subscription and de-subscription actions is something I hope to avoid!
Brandon Craig Rhodes