I have an MDB (message-driven bean) that receives messages, each containing a String that represents a word. I also have a table in the database. The MDB should store the words in the table, along with the number of times each word was received (a counter).

The problem is that, for better performance, the MDB runs as many instances, and when different instances receive the same new word at the same time, each of them creates a row for it with a count of 1.

To solve this I could make the word field unique, so that the second instance fails on commit and the message is redelivered. That would work, but may be problematic. Is it good practice?
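The unique-field idea can be sketched in memory. This is only an illustration under stated assumptions: `WordTable` and `WordHandler` are hypothetical names, a `ConcurrentHashMap` stands in for a table with a unique constraint on the word, and the retry happens in a loop, whereas in the MDB the retry would be the message redelivery after the failed commit.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for a table with a UNIQUE constraint on the word column.
class WordTable {
    private final ConcurrentHashMap<String, AtomicInteger> rows = new ConcurrentHashMap<>();

    // Plays the role of "UPDATE ... SET counter = counter + 1 WHERE word = ?".
    // Returns the number of rows affected (0 or 1).
    int incrementIfPresent(String word) {
        AtomicInteger counter = rows.get(word);
        if (counter == null) {
            return 0;
        }
        counter.incrementAndGet();
        return 1;
    }

    // Plays the role of "INSERT ... VALUES (?, 1)"; throws if the word already exists,
    // as a unique constraint violation would.
    void insert(String word) {
        if (rows.putIfAbsent(word, new AtomicInteger(1)) != null) {
            throw new IllegalStateException("duplicate key: " + word);
        }
    }

    int count(String word) {
        AtomicInteger counter = rows.get(word);
        return counter == null ? 0 : counter.get();
    }
}

class WordHandler {
    private final WordTable table;

    WordHandler(WordTable table) {
        this.table = table;
    }

    void handleWord(String word) {
        while (true) {
            if (table.incrementIfPresent(word) == 1) {
                return; // common case: the row already exists
            }
            try {
                table.insert(word); // first time this word is seen
                return;
            } catch (IllegalStateException duplicate) {
                // Another instance inserted the word first: retry, which now increments.
            }
        }
    }
}
```

The duplicate can only happen once per word, so the retry path should be rare as long as new words are much less frequent than repeated ones.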

Another solution is to merge these rows afterwards, summing the counters. But what if another instance increases the counter in the middle of the update?

What if two instances try to increase the counter at the same time? Would @Version be enough?
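For the counter update, the idea behind @Version is a compare-and-retry: an update succeeds only if the version is still the one that was read. A minimal in-memory sketch of that idea, assuming a hypothetical `VersionedCounter` class (it is not JPA code, and a real provider would raise an optimistic lock exception instead of returning false):

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory model of one row with a counter and a version column.
class VersionedCounter {
    // Immutable snapshot of the row as one reader saw it.
    static final class Row {
        final int counter;
        final long version;

        Row(int counter, long version) {
            this.counter = counter;
            this.version = version;
        }
    }

    private final AtomicReference<Row> row = new AtomicReference<>(new Row(0, 0L));

    Row read() {
        return row.get();
    }

    // Plays the role of
    // "UPDATE ... SET counter = ?, version = version + 1 WHERE id = ? AND version = ?".
    // Returns false if someone else changed the row since 'seen' was read.
    boolean tryUpdate(Row seen, int newCounter) {
        return row.compareAndSet(seen, new Row(newCounter, seen.version + 1));
    }

    // Re-reads and retries until the update is applied on a fresh snapshot.
    void increment() {
        while (true) {
            Row seen = read();
            if (tryUpdate(seen, seen.counter + 1)) {
                return;
            }
        }
    }
}
```

In this model no increment is lost, because a stale update is rejected and redone. It does not help with the duplicate insert of a new word, since there is no row to version yet.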

I'm not sure what the proper solution is here. How would you handle such cases?

Also, can you suggest some books about concurrency practices (not about the use of synchronized, as I need to support J2EE and may run a cluster of application servers)?


Update: After reading more about EJB and JPA, I suppose I want something like a locking entity. For example, I can create a new table with only id and key columns, and data like this:

ID | KEY
1  | WORDS_CREATE_LOCK

So that when I need to handle a new word I will do something like this (not exact code, not sure it will even compile):

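// MAIN FUNCTION
public void handleWord(String wordStr) {
  Word w = getWord(wordStr);

  if (w == null)
    w = getNewOrSynchronizedWord(wordStr);

  // EntityManager.lock() needs a LockModeType, and there is no em.unlock():
  // the lock ends when the transaction commits or rolls back
  em.lock(w, LockModeType.WRITE);
  w.setCounter(w.getCounter() + 1);
}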

// Returns Word instance or null if not found
private Word getWord(String wordStr) {
  Word w = null;

  Query query = em.createQuery("select w from words as w where w.string = :wordStr order by w.id asc");
  query.setParameter("wordStr", wordStr);
  List<Word> words = query.getResultList();

  // List has size(), not getSize()
  if (words.size() > 0)
    w = words.get(0);

  return w;
}

// Handles locking to prevent duplicate word creation
private Word getNewOrSynchronizedWord(String wordStr) {
  // find() takes the entity class first, then the primary key
  Locks l = em.find(Locks.class, WORDS_CREATE_LOCK_ID);
  em.lock(l, LockModeType.WRITE);

  Word w = getWord(wordStr);

  if (w == null) {
    w = new Word(wordStr);
    em.persist(w);
  }

  // no em.unlock() in JPA: the lock ends with the transaction
  return w;
}

So the question is: will it work that way? And can I do the same without maintaining a DB table of locking rows? Maybe with some J2EE container locking mechanism?

If it helps I'm using JBoss 4.2.


I have a new idea for this. I can create two MDBs:

1st MDB, with many instances allowed, will handle all the messages and, if the word is not found, will send the word to the second MDB.

2nd MDB, with only one instance allowed, will handle the messages serially and will be the only one allowed to create new words.

The best part: no locking of an entire table/method/process, only row locking on counter update.

How good is that?
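The two-MDB idea can be sketched with plain threads. This is an in-memory analogy under stated assumptions, not MDB code: `TwoStageCounter` is a hypothetical name, a map stands in for the table, and a single-threaded executor stands in for the second MDB with one instance. Note that the second stage has to check again whether the word exists, because several first-stage instances may forward the same new word before it is created.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TwoStageCounter {
    // Stand-in for the words table.
    private final Map<String, AtomicInteger> table = new ConcurrentHashMap<>();
    // Stand-in for the second MDB limited to one instance: tasks run one at a time.
    private final ExecutorService creator = Executors.newSingleThreadExecutor();

    // First stage: may be called from many threads at once.
    void handleWord(String word) {
        AtomicInteger counter = table.get(word);
        if (counter != null) {
            counter.incrementAndGet();                      // existing word: update only
        } else {
            creator.execute(() -> createOrIncrement(word)); // unknown word: forward it
        }
    }

    // Second stage: runs only on the single creator thread, so creation is serialized.
    private void createOrIncrement(String word) {
        AtomicInteger counter = table.get(word);
        if (counter == null) {
            table.put(word, new AtomicInteger(1));
        } else {
            counter.incrementAndGet(); // the same word was forwarded more than once
        }
    }

    // Stops accepting forwarded words and waits for the queued ones to be processed.
    boolean drain() {
        creator.shutdown();
        try {
            return creator.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int count(String word) {
        AtomicInteger counter = table.get(word);
        return counter == null ? 0 : counter.get();
    }
}
```

In a cluster, the single-instance guarantee would have to hold across all nodes, which this single-JVM sketch does not address.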

Thanks.

+1  A: 

This sounds like it needs to be solved within the database by choosing the correct transaction isolation level - repeatable read should be sufficient.

What you need is a book about databases, focusing on transactions.

Michael Borgwardt
The REPEATABLE READ isolation level is not the solution, because when both instances run select queries for the same non-existent word, they both get zero rows back and nothing is locked. Afterwards each instance will insert the same word. The SERIALIZABLE isolation level is better, as it would lock the non-existent rows for the new word (not sure), but it has a big impact on performance and may have problems with Oracle DB.
Vitaly Polonetsky
+1  A: 

Do you mean that multiple instances are processing the same message, or that the same word is used in different messages? If it is the same message, then you should be using a queue instead of a topic. This, of course, does not solve the issue of the same word in multiple messages. For that case, you can follow the advice of @Michael Borgwardt and @Vitaly Polonetsky.

Another option, outside of the database, would be to have different MDB instances handle words starting with a set of letters. This could be easily accomplished with selectors. Then there would only be a single MDB handling any particular word, but processing is still split amongst multiple instances to increase performance. I am not claiming this is a better alternative, but just a different one that supports pretty simple queue based processing.
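A rough sketch of how the split by letter could be computed. JMS selectors can only test message headers and properties, not the body, so the sender would have to put the partition into a property, for example with `Message.setIntProperty`. The class name, the property name `wordPartition`, and the modulo scheme are assumptions made for illustration.

```java
import java.util.Locale;

class WordPartitioner {
    // Hypothetical message property that the producer would set on every message.
    static final String PROPERTY = "wordPartition";

    // Maps a word to a partition in [0, partitions) based on its first letter,
    // so every message carrying the same word goes to the same partition.
    static int partitionOf(String word, int partitions) {
        if (word.isEmpty()) {
            return 0;
        }
        char first = word.toLowerCase(Locale.ROOT).charAt(0);
        return Math.floorMod((int) first, partitions);
    }

    // Selector expression for the MDB deployment that owns the given partition.
    static String selectorFor(int partition) {
        return PROPERTY + " = " + partition;
    }
}
```

Each deployment would then be configured with one of these selector strings. Splitting by first letter is uneven for natural-language words, so some instances may receive much more traffic than others.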

Robin
I meant the same word in different messages, and it's a queue MDB. The "another option" is great, but is there a J2EE way of doing selectors on an MDB? I know I can use a selector MDB that forwards to the more specific MDBs. But again, I must be sure that each specific MDB has only one instance (how? does it work in a cluster?).
Vitaly Polonetsky
@Vitaly Polonetsky - The selector is a configuration aspect of the MDB's deployment, as is the number of instances (concurrent sessions or threads) it will allow. You can deploy the same MDB multiple times (in different EARs), each with a different selector and configured to allow only one session.
Robin
@Robin thanks, but can I have one instance per selector, with many selectors on the same MDB? Or must I have as many classes as selectors?
Vitaly Polonetsky
@Vitaly Polonetsky - Not sure what you mean. You can only have one selector per deployed MDB, since it is a deployment descriptor configuration item, but you can deploy the same MDB multiple times, each with its own selector and JNDI name. This eliminates the need for coding multiple MDBs (since they all do the exact same thing), but allows you to have each handle different messages.
Robin
+1  A: 

If you are looking for performance, no locking, etc., I would suggest having another table: (word, timestamp). Your MDBs will just insert the word and the timestamp. Another process will count the rows and update the table with the totals.
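A minimal in-memory sketch of this insert-only approach, assuming hypothetical names (`WordLog`, `record`, `aggregate`): a concurrent queue stands in for the (word, timestamp) table, and a single aggregation step folds the pending entries into the totals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;

class WordLog {
    // One row of the (word, timestamp) table.
    static final class Entry {
        final String word;
        final long timestamp;

        Entry(String word, long timestamp) {
            this.word = word;
            this.timestamp = timestamp;
        }
    }

    // Stand-in for the insert-only table written by the MDB instances.
    private final ConcurrentLinkedQueue<Entry> pending = new ConcurrentLinkedQueue<>();
    // Stand-in for the totals table, touched only by the aggregation step.
    private final Map<String, Long> totals = new HashMap<>();

    // Called by any number of MDB instances: a plain insert, no read and no update.
    void record(String word, long timestamp) {
        pending.add(new Entry(word, timestamp));
    }

    // Called by the separate counting process: removes the pending entries and adds
    // them to the totals. Returns how many entries were processed.
    synchronized int aggregate() {
        int processed = 0;
        Entry entry;
        while ((entry = pending.poll()) != null) {
            totals.merge(entry.word, 1L, Long::sum);
            processed++;
        }
        return processed;
    }

    synchronized long total(String word) {
        return totals.getOrDefault(word, 0L);
    }
}
```

The trade-off is that the totals lag behind the received messages until the next aggregation run. With a real table, the timestamp column could be used to pick the rows up to a cutoff for each run.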

GClaramunt