The spec says "Acknowledging a consumed message automatically acknowledges the receipt of all messages that have been delivered by its session" - but what I need to know is what it means by 'delivered'.

For example, if I call consumer.receive() 6 times, and then call .acknowledge on the 3rd message - is it (a) just the first 3 messages that are ack'd, or (b) all 6?

I'm really hoping it's option (a), i.e. messages after the one you called acknowledge on WILL be redelivered; otherwise it's hard to see how you could prevent message loss if my receiver process crashes before I've had a chance to persist and acknowledge the messages. But the spec is worded such that it's not clear. I get the impression the authors of the JMS spec considered broker failure, but didn't spend too long thinking about how to protect against client failure :o(

Anyway, I've been able to test with SonicMQ and found that it implements (a), i.e. messages 'received' later than the message you call .ack on DO get redelivered in the event of a crash, but I'd love to know how other people read the standard, and if anyone knows how any other providers have implemented CLIENT_ACKNOWLEDGE? (i.e. what the 'de facto' standard is)

Thanks Ben

A: 

According to the spec, option (b) is the correct behavior. If you are getting option (a) and rely on it, the application will not be portable.

In the explanation below, I'm referring specifically to the JMS spec Version 1.1 April 12, 2002.

There is some confusion caused by the fact that the acknowledge method is called from the message object when in fact it operates at the session level. Because it is a message method, intuitively it seems correct that one could pick a point in the message stream at which to generate an acknowledgment.

What is really happening though is that message acknowledgment is driving commit calls at the session level. Since a session can have only one transaction active at a time, each acknowledgment delimits not a point in the message stream but rather a point in time. The ack commits the existing unit of work and starts the next. Messages delivered prior to the ack must be included in the unit of work that was committed by the ack.
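To make the point-in-time behavior concrete, here is a minimal sketch in plain Java. It is a toy model under stated assumptions, not the javax.jms API: ToySession, ToyMessage, commitDelivered and crash are hypothetical names invented for illustration, and the model simply assumes that anything unacknowledged at crash time goes back on the queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: these classes are hypothetical and are NOT the javax.jms API.
class ToySession {
    private final Deque<String> queue = new ArrayDeque<>();
    // Messages handed to the client since the last acknowledgment (the open unit of work).
    private final List<String> delivered = new ArrayList<>();

    ToySession(List<String> bodies) {
        queue.addAll(bodies);
    }

    // Remove the next message from the queue and add it to the open unit of work.
    ToyMessage receive() {
        String body = queue.poll();
        if (body == null) {
            return null;
        }
        delivered.add(body);
        return new ToyMessage(body, this);
    }

    // Session-level commit: everything delivered so far is acknowledged.
    void commitDelivered() {
        delivered.clear();
    }

    // Simulated client crash: unacknowledged messages return to the queue in order.
    void crash() {
        for (int i = delivered.size() - 1; i >= 0; i--) {
            queue.addFirst(delivered.get(i));
        }
        delivered.clear();
    }

    int unacknowledgedCount() { return delivered.size(); }
    int queueDepth() { return queue.size(); }
}

class ToyMessage {
    final String body;
    private final ToySession session;

    ToyMessage(String body, ToySession session) {
        this.body = body;
        this.session = session;
    }

    // The method sits on the message, but it acts on the whole session.
    void acknowledge() { session.commitDelivered(); }
}

class SessionAckDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession(Arrays.asList("m1", "m2", "m3", "m4", "m5", "m6"));
        List<ToyMessage> received = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            received.add(session.receive());
        }
        received.get(2).acknowledge();                      // ack "the 3rd message"
        System.out.println(session.unacknowledgedCount());  // prints 0: all six were committed
        session.crash();
        System.out.println(session.queueDepth());           // prints 0: nothing left to redeliver
    }
}
```

In this model, which message you call acknowledge() on makes no difference; only the moment at which you call it matters.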

The term "delivered" is generally thought of as completion of the API call that removes the message from the queue and results in a populated object in the program's memory. In reality, the message is considered delivered when it is removed from the queue, regardless of whether it makes it to the program. For example, consider the following sequence of events:

  1. The app requests a message.
  2. The request is passed over a TCP socket to a process on the server which acts as a proxy for the application.
  3. The proxy issues the GET against the queue.
  4. The message is locked in a unit of work and passed to the proxy process.
  5. The proxy process attempts to deliver the message over a TCP socket to the calling application. If the connection is severed at this point, the application will not have ever seen the message but the JMS provider thinks it has been delivered. When the broken connection is detected, the unit of work is canceled, the message is rolled back onto the queue and the redelivery count is incremented.
  6. The application receives the message.
  7. The application acknowledges the message.
  8. The proxy process receives the commit call and executes it.

There is a window between Steps 6 and 8 during which the TCP socket can be disconnected. The JMS provider can't really distinguish between this and a failure at Step 5. Either way, the message is rolled back and redelivered later. However, in this case the application will see the message twice. The spec anticipates this situation in 4.4.13 where it states:

If a failure occurs between the time a client commits its work on a Session and the commit method returns, the client cannot determine if the transaction was committed or rolled back. The same ambiguity exists when a failure occurs between the non-transactional send of a PERSISTENT message and the return from the sending method.

It is up to a JMS application to deal with this ambiguity. In some cases, this may cause a client to produce functionally duplicate messages.

A message that is redelivered due to session recovery is not considered a duplicate message.
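The rollback-and-redeliver window described in the steps above can be sketched as a toy model in plain Java. Everything here is a hypothetical illustration, not a real provider or the javax.jms API: ToyQueue, QueuedMessage and the redeliveryCount field are invented names, and the in-memory set of processed IDs is just one assumed way an application might notice that it has seen a message before.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Toy model only: hypothetical classes, not a real provider or the javax.jms API.
class QueuedMessage {
    final String id;
    int redeliveryCount = 0; // incremented each time the message is rolled back

    QueuedMessage(String id) { this.id = id; }
}

class ToyQueue {
    private final Deque<QueuedMessage> queue = new ArrayDeque<>();
    private QueuedMessage locked; // message held in the open unit of work

    void put(String id) { queue.addLast(new QueuedMessage(id)); }

    // Steps 3-4: remove the message from the queue and lock it in a unit of work.
    QueuedMessage get() {
        locked = queue.pollFirst();
        return locked;
    }

    // Step 8: the commit makes the removal permanent.
    void commit() { locked = null; }

    // Broken connection: cancel the unit of work and put the message back.
    void rollback() {
        if (locked != null) {
            locked.redeliveryCount++;
            queue.addFirst(locked);
        }
        locked = null;
    }
}

class RedeliveryDemo {
    public static void main(String[] args) {
        ToyQueue q = new ToyQueue();
        q.put("order-1");
        Set<String> processed = new HashSet<>(); // IDs the application has already handled

        QueuedMessage first = q.get();   // step 6: the application receives the message
        processed.add(first.id);         // the application does its work
        q.rollback();                    // the socket drops before step 8

        QueuedMessage second = q.get();  // the same message comes back
        boolean alreadySeen = !processed.add(second.id);
        System.out.println(second.redeliveryCount + " " + alreadySeen); // prints "1 true"
        q.commit();
    }
}
```

A real application would need the processed-ID record to survive a crash of its own, which an in-memory set does not.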

In your example where you "call consumer.receive() 6 times, and then call .acknowledge on the 3rd message" and are observing that messages 4 through 6 are redelivered, the possible explanations are that a) the six messages are not all from the same session, or b) the behavior is not compliant with the spec.

T.Rob
Thanks, that's very helpful. Not the answer I was hoping for (!) but I had a feeling that might be the case. So was it just an accident that the acknowledge method got put on the Message object and not on Session? As you say, it does give a rather misleading impression.
Ben Spiller
I'd still be interested to know whether many other providers (in addition to SonicMQ) go beyond/against the standard by acknowledging only up to the message you call ack() on... Otherwise it seems that the limitations of CLIENT_ACKNOWLEDGE make it pretty useless for truly reliable messaging. In practice, does everyone end up either using heavyweight XA transactions or giving up on exactly-once semantics and implementing reliability at the application layer instead?
Ben Spiller
I suspect that the acknowledge method is on the message because of the models in which the session is container-managed and the message is the only object available to the application. I can't answer regarding other transport providers, as my only knowledge of a specific provider is WebSphere MQ. Perhaps someone else will respond. Don't forget to accept/vote answers if you'd like to attract more comments.
T.Rob
XA scopes the transaction boundaries in time rather than at a point in the message stream, same as CLIENT_ACK. Any messages delivered before an XA commit are in the same unit of work when the COMMIT is called. I wouldn't say CLIENT_ACKNOWLEDGE is useless for reliable messaging since it implements the same model as XA, but I would be curious what requirement leads you to want to read 6 messages and ack only the first 3. I think the preferred method would be to use selectors or correlation ID to select just the ones of interest.
T.Rob
"XA scopes the transaction boundaries in time rather than at a point in the message stream" - that's very useful info, although once again my plans are thwarted! :o( All I want to do is reliably receive messages without loss if my client crashes (i.e. I need to either persist or fully process messages myself before I ack them to JMS and lose the chance of redelivery). It's going to be a big performance hit if I can't receive and process messages n+1, n+2, etc. while messages 1..n are being ack'd, but if everything is based on time rather than message stream position, I guess there's just no way to do that.
Ben Spiller
My only suggestion is to maintain multiple sessions. I realize this is a bit heavy-handed but since sessions are scoped to a thread, multiple session instances allow for concurrent units of work to be held open.
T.Rob
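
A rough sketch of the multiple-session suggestion, again as a toy model in plain Java and not the javax.jms API. SharedQueue and WorkSession are hypothetical names, and the sketch runs on a single thread for simplicity, whereas in practice each session would be driven by its own thread. It only aims to show that each session holds its own unit of work, so acknowledging in one leaves the other's messages eligible for redelivery.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: hypothetical classes, not the javax.jms API.
class SharedQueue {
    final Deque<String> messages = new ArrayDeque<>();
}

class WorkSession {
    private final SharedQueue source;
    // Messages delivered by this session since its last acknowledgment.
    private final List<String> unacknowledged = new ArrayList<>();

    WorkSession(SharedQueue source) { this.source = source; }

    String receive() {
        String m = source.messages.poll();
        if (m != null) {
            unacknowledged.add(m);
        }
        return m;
    }

    // Commits only this session's unit of work.
    void acknowledge() { unacknowledged.clear(); }

    List<String> unacknowledged() { return new ArrayList<>(unacknowledged); }
}

class MultiSessionDemo {
    public static void main(String[] args) {
        SharedQueue queue = new SharedQueue();
        queue.messages.addAll(Arrays.asList("m1", "m2", "m3", "m4", "m5", "m6"));

        WorkSession a = new WorkSession(queue);
        WorkSession b = new WorkSession(queue);

        for (int i = 0; i < 3; i++) a.receive(); // batch 1..n on session a
        for (int i = 0; i < 3; i++) b.receive(); // batch n+1.. on session b

        a.acknowledge();                          // persist and ack the first batch
        System.out.println(a.unacknowledged());   // prints []
        System.out.println(b.unacknowledged());   // prints [m4, m5, m6]
    }
}
```

The model does not address message ordering across the two sessions, which an application splitting one queue this way would have to consider.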