We have an application that processes JMS messages using a message-driven bean. This application is deployed on an OC4J application server (10.1.3).

We are planning to deploy this application on multiple OC4J application servers that will be configured to run in a cluster.

The problem is with JMS message processing in this cluster. We must ensure that only a single message is being processed in the entire OC4J cluster at any one time. This is required because the messages have to be processed in chronological order.

Do you know of a configuration parameter that would control message processing across an OC4J cluster?

Or do you think we have to implement our own synchronisation code that will synchronise the message driven beans across the cluster?

A: 

First point: this is a pretty crappy design and you'll seriously limit performance only being able to process a single message at a time. I assume you are clustering only for fault tolerance, because you won't get performance improvements?

Are you using the default JMS implementation with OC4J or another one?

I've used IBM's MQ in the past and that had a feature that a queue could be marked as exclusive, which meant only one client could connect to it. This would appear to offer what you want.

An alternative would be to introduce a sequence ID (as simple as an incrementing counter). The client processing the message would check that the sequence ID is the next expected value; if not, the message is put back. This approach requires the different clients to persist the last valid sequence ID they've seen in some centrally shared data store, such as a database.
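
As a rough sketch of that sequence-ID check: the class and method names here are made up for illustration, and an in-memory counter stands in for the shared database row, so treat it as the shape of the logic rather than something to drop into a cluster as-is.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the shared store that holds the last processed sequence ID. */
final class SequenceGate {
    private final AtomicLong lastProcessed;

    SequenceGate(long lastProcessed) {
        this.lastProcessed = new AtomicLong(lastProcessed);
    }

    /**
     * Accepts the message only if its sequence ID is exactly the next expected value.
     * On success the stored counter advances; on failure the caller should put the
     * message back on the queue.
     */
    boolean tryAccept(long sequenceId) {
        return lastProcessed.compareAndSet(sequenceId - 1, sequenceId);
    }

    long lastProcessed() {
        return lastProcessed.get();
    }
}
```

In a real deployment the compare-and-set would presumably become a conditional update against the shared database, so that two servers can never both accept the same sequence ID.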

hbunny
We are using an Oracle Advanced Queue as the backend.
Ladislav Petrus
A: 

I agree with stevendick: maybe you're off track with the design. Regarding sequence IDs or similar approaches, I suggest you get some insight into messaging architectures from Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions (by Gregor Hohpe and Bobby Woolf). It's a great book with plenty of useful patterns... I'm sure the forces and the problem you are facing are well described there.

JuanZe
+3  A: 

I've done sequential processing of messages in a cluster on a pretty large scale - 1.5 million+ messages/day, using a combination of the Competing Consumers pattern and a Lease pattern.

Here's the kicker, though - your requirement that you can only process one transaction at a time is going to keep you from achieving your goals. We had the same basic requirement - messages had to be processed in order. At least, we thought we did. Then we had an epiphany - as we gave the problem more thought, we realized that we didn't require total ordering. We actually required ordering only within each account. Therefore, we could distribute the load across the servers in a cluster by assigning ranges of accounts to different servers in the cluster. Then, each server was responsible for processing the messages for its accounts in order.

Here's the second clever part - we used a Lease pattern to dynamically assign account ranges to various servers in the cluster. If one server in the cluster went down, another would grab the lease and take over the first server's responsibility.
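
To make the range idea concrete, here's a minimal sketch of looking up which server owns an account. The names (AccountRouter, assign, ownerOf) and the use of numeric account IDs are assumptions for the example, not what we actually ran in production.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical lookup from account number to the server that owns its range. */
final class AccountRouter {
    // Key: first account number of a range. Value: server that owns the range.
    // A range runs from its key up to (but not including) the next key.
    private final TreeMap<Long, String> rangeStarts = new TreeMap<>();

    void assign(long firstAccount, String server) {
        rangeStarts.put(firstAccount, server);
    }

    /** Returns the owning server, or null if the account is below every range start. */
    String ownerOf(long account) {
        Map.Entry<Long, String> entry = rangeStarts.floorEntry(account);
        return entry == null ? null : entry.getValue();
    }
}
```

Because every message for a given account lands on the same server, that server can keep per-account order while other servers work on other ranges in parallel.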

This worked for us, and the process lived in production for about 4 years before being replaced due to a company merger.

Edit:

I explain this solution in more detail here: http://coders-log.blogspot.com/2008/12/favorite-projects-series-installment-2.html

Edit:

Okay, gotcha. You're already doing the processing at the level you need, but since you're being deployed to a cluster, you need to make sure that only one instance of your MDB is actively pulling messages from the queue. Plus, you need the simplest workable solution.

You don't need to abandon your MDB mechanism that you have now, I don't think. Essentially what we're talking about here is a requirement for a distributed lock mechanism, not to put too fancy a phrase to it.

So, let me suggest this. At the point where your MDB registers to receive messages from the queue, it should check the distributed lock, and see if it can grab it. The first MDB to grab the lock wins, and only it will register to receive messages. So, now you have your serialization. What form should this lock take? There are many possibilities. Well, how about this. If you have access to a database, its transactional locking already provides some of what you need. Create a table with a single row. In the row is the identifier of the server that currently holds the lock, and an expiration time. This is the server's lease. Each server needs to have a way to generate its unique identifier, perhaps the server name plus a thread ID, for example.

If a server can get update access to the row, and the lease is expired, it should grab it. Otherwise, it gives up. If it grabs the lease, it needs to update the row with a time in the near future, like five minutes or so, and commit the update. The active server should update the lease before it expires. I recommend updating it when there's half the time remaining, so, every 2-1/2 minutes if the lease expires in five. With this, you now have failover. If the active MDB dies, another MDB (and only one) will take over.
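
Here's roughly what the lease rules look like in code. This is only a sketch: LeaseRow is a made-up name, and an in-memory object with a synchronized method stands in for the single database row and its row lock. The current time is passed in so the logic is easy to exercise.

```java
/** In-memory stand-in for the single-row lease table (hypothetical names throughout). */
final class LeaseRow {
    static final long LEASE_MILLIS = 5 * 60 * 1000L; // five-minute lease

    private String holder;        // identifier of the server holding the lease, or null
    private long expiresAtMillis; // time at which the current lease expires

    /**
     * Grabs the lease if it is free or expired, or renews it if the caller already
     * holds it. Returns false if another server holds an unexpired lease.
     */
    synchronized boolean tryAcquireOrRenew(String serverId, long nowMillis) {
        boolean freeOrExpired = holder == null || nowMillis >= expiresAtMillis;
        if (freeOrExpired || serverId.equals(holder)) {
            holder = serverId;
            expiresAtMillis = nowMillis + LEASE_MILLIS;
            return true;
        }
        return false;
    }

    /** True when the caller holds the lease and half or less of the lease time remains. */
    synchronized boolean shouldRenew(String serverId, long nowMillis) {
        return serverId.equals(holder) && expiresAtMillis - nowMillis <= LEASE_MILLIS / 2;
    }
}
```

With a real database, the synchronized block would be replaced by whatever gives you exclusive update access to the row inside a transaction, followed by a commit.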

That should be pretty straightforward, I think. Now, you want to have the dormant MDBs check the lock occasionally to see if it's freed up.

So, the active MDB and the dormant MDBs all have to do something periodically. You might have them spawn a separate thread to do this. Many application engine vendors won't be happy if you do this, but adding one thread is no big deal, especially since it spends most of its time sleeping. Another option would be to tie into the timer mechanism that many engines provide, and have it wake up your MDB periodically to check the lease.
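
If you go the separate-thread route, a sketch along these lines would do. It uses the standard java.util.concurrent scheduler; LeaseChecker and its methods are names I've invented here, and inside a container you'd likely prefer the server's own timer facility instead.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that runs a lease check on one background daemon thread. */
final class LeaseChecker {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "lease-checker");
                thread.setDaemon(true); // don't keep the JVM alive just for this thread
                return thread;
            });

    /**
     * Runs checkLease immediately, then again periodMillis after each run finishes.
     * Note: if checkLease throws, the scheduler stops calling it, so the task
     * should catch and log its own exceptions.
     */
    void start(Runnable checkLease, long periodMillis) {
        scheduler.scheduleWithFixedDelay(checkLease, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

The active MDB would pass a task that renews the lease, and the dormant ones a task that tries to grab it; the thread spends nearly all of its time sleeping.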

Oh, and by the way - make sure the server admins employ NTP to keep the clocks reasonably synced.

Don Branson
Thank you for the hint! While I was reading your response I remembered that we had completely forgotten about the MDB lifecycle methods. I think we will be able to come up with a simple mechanism that depends only on these methods to ensure that precisely one MDB is processing messages across the cluster. We will use the locking mechanisms provided by the Oracle database (dbms_lock). When we have a working solution I will post details.
Ladislav Petrus
Cool! Glad I could help.
Don Branson