Hi,

I have a very simple scenario involving a database and JMS in an application server (GlassFish):

1. An EJB inserts a row in the database and sends a message.
2. When the message is delivered to an MDB, the row is read and updated.

The problem is that sometimes the message is delivered before the insert has been committed in the database. This is understandable if we consider the two-phase commit protocol:

1. prepare JMS
2. prepare database
3. commit JMS
4. (tiny gap in which the message can be delivered before the insert has been committed)
5. commit database
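
To make the ordering issue concrete, here is a minimal sketch that simulates only the commit phase. It does not use the JTA API; the participant names `"jms"` and `"db"` and the helper `hasDeliveryGap` are hypothetical and exist purely for illustration. It assumes the coordinator commits participants one after another in a fixed order.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CommitOrderDemo {
    /**
     * Simulates phase 2 of a two-phase commit over hypothetical participants
     * named "jms" and "db", committed one after another in the given order.
     * Returns true if there is an instant where the JMS commit is done
     * (message deliverable) while the database commit is still pending.
     */
    static boolean hasDeliveryGap(List<String> commitOrder) {
        Set<String> committed = new HashSet<>();
        for (String participant : commitOrder) {
            committed.add(participant);
            // State a consumer could observe right after this commit.
            if (committed.contains("jms") && !committed.contains("db")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasDeliveryGap(List.of("jms", "db"))); // true
        System.out.println(hasDeliveryGap(List.of("db", "jms"))); // false
    }
}
```

In this toy model the gap only appears when JMS is committed first, which matches the sequence listed above.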

I've discussed this problem with others, but the answer was always: "Strange, it should work out of the box".

My questions are then:

  • How could it work out-of-the box?
  • My scenario sounds fairly simple, so why aren't there more people with similar troubles?
  • Am I doing something wrong? Is there a way to solve this issue correctly?

Here are a bit more details about my understanding of the problem:

This timing issue exists only if the participants are committed in this order. If the 2PC handles the participants in the reverse order (database first, then message broker), it should be fine. The problem occurred randomly but was completely reproducible.

I found no way to control the order of the participants in a distributed transaction, neither in the JTA, JCA and JPA specifications nor in the GlassFish documentation. We could assume they are enlisted in the distributed transaction in the order in which they are first used, but with an ORM such as JPA it is difficult to know when the data is flushed and when the database connection is really used. Any ideas?

+3  A: 

You are experiencing the classic XA 2-PC race condition. It does happen in production environments.

Three things come to mind:

  1. Last agent optimization, where JDBC is the non-XA resource. (You lose recovery semantics.)
  2. Use JMS time-to-deliver. (You deliberately lose real-time delivery.)
  3. Build retries into the JDBC code. (Least effect on functionality.)
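
As an illustration of the third option, here is a minimal retry sketch. It is not tied to JDBC or JMS; `RetryingLookup`, `retry`, and the `lookup` supplier are hypothetical names, and the supplier stands in for whatever query the MDB would run to find the row. The attempt count and delay are arbitrary assumptions.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetryingLookup {
    /**
     * Calls the lookup up to maxAttempts times, sleeping delayMillis between
     * attempts, and returns the first non-empty result. Returns empty if the
     * row never shows up or the thread is interrupted while waiting.
     */
    static <T> Optional<T> retry(Supplier<Optional<T>> lookup, int maxAttempts, long delayMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> row = lookup.get();
            if (row.isPresent()) {
                return row;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Fake lookup: the "row" becomes visible on the third attempt.
        AtomicInteger calls = new AtomicInteger();
        Supplier<Optional<String>> lookup =
            () -> calls.incrementAndGet() < 3 ? Optional.<String>empty() : Optional.of("row-42");
        Optional<String> row = retry(lookup, 5, 10);
        System.out.println(row.orElse("not found") + " after " + calls.get() + " attempts");
        // prints "row-42 after 3 attempts"
    }
}
```

If the result is still empty after the last attempt, one possible choice is to let the MDB fail so the broker redelivers the message; whether that fits depends on the redelivery settings in use.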

WebLogic has the LLR (Logging Last Resource) optimization, which avoids this problem and still gives you all XA guarantees.

+1 thanks for the answer. So there is no way to implement this simply without relying on advanced app server optimizations? (The spec doesn't require the app server to support either time-to-deliver or last agent optimization.)
ewernli
Btw, I was indeed told about last agent optimization on the GlassFish forum. I didn't know about the advanced LLR variant, though. Regarding retry logic, we've actually circumvented the problem with `select * for update`, which is much simpler. I'm happy to hear it's a "classic" issue. Still, what I don't get is why the specs themselves don't address this issue, e.g. by letting us specify a preferred order for the participants.
ewernli