Hi

I have some fairly simple requirements, but I'm not sure how to implement them:

  • I have multiple concurrent threads running the same query
  • The query supplies a 'string' value. If it exists in the table, the query should return the id of the matching row; if not, the query should insert the 'string' value and return the last inserted id
  • The 'string' column is (and must be) a text column (it's bigger than varchar 255), so I cannot set it as unique; uniqueness must be enforced through the access mechanism
  • The query needs to be in stored procedure format (which doesn't support table locks in MySQL)

How can I guarantee that 'string' is unique? How can I prevent other threads writing to the table after another thread has read it and found no matching 'string' item?

Thanks for any advice.

A: 

You have to synchronize the read/write operations, either in the threads or on the shared resource, so that no thread is allowed to read or write while another one is reading or writing.

As for the string you can do a "greedy" query like:

select distinct string from table where ...

Run this query the first time a thread executes a select, cache the result in a HashMap or a similar Map, and update the map each time a thread adds rows to the table. Before querying, check whether the HashMap exists: if it does, look your string up in it; if not, run the query to build it.
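
A minimal sketch of the synchronized-cache idea, written in Python rather than Java and assuming everything runs inside a single process. The class name `StringIdCache` and its internal counter are hypothetical; the counter only stands in for the table's auto-increment id, and a real version would run the INSERT at the marked line.

```python
import threading


class StringIdCache:
    """In-process map from string to id, guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = {}      # string -> id
        self._next_id = 1   # hypothetical stand-in for the auto-increment counter

    def get_or_create(self, value):
        # Only one thread at a time may run the check-then-insert step.
        with self._lock:
            if value not in self._ids:
                # A real version would INSERT into the table here.
                self._ids[value] = self._next_id
                self._next_id += 1
            return self._ids[value]
```

The lock only protects threads that share this one object. Separate processes or servers writing to the same table would not see it, and the whole set of strings has to fit in memory.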

Lex
I tried exactly this initially, but the table is massive; I just cannot cache the whole thing.
MalcomTucker
+1  A: 

If you're sure that you can't put a DB constraint on the text column itself, then add a UNIQUE index on another field where you store a good crypto hash of the full string. I'm guessing MD5 or SHA1 should be enough. Several source code management systems (like Git, Mercurial, Monotone, and others) rely on the extremely low probability of a hash collision.
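
A minimal sketch of the hash-column idea, using Python's built-in `sqlite3` as a runnable stand-in for MySQL. The table name `strings`, the column names `value` and `value_hash`, and both function names are assumptions made for illustration. In MySQL the same shape would presumably use `SHA1()` inside the stored procedure together with `INSERT ... ON DUPLICATE KEY UPDATE id = LAST_INSERT_ID(id)`, so that `LAST_INSERT_ID()` returns the id in both the insert and the already-exists case.

```python
import hashlib
import sqlite3


def open_db(path=":memory:"):
    # Hypothetical schema: the UNIQUE constraint sits on the fixed-length
    # hash column, not on the long text column.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS strings ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " value TEXT NOT NULL,"
        " value_hash TEXT NOT NULL UNIQUE)"
    )
    return conn


def get_or_create_id(conn, value):
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()
    with conn:  # commits on success, rolls back on error
        # The unique index rejects a duplicate hash, so the insert is
        # silently skipped when the string is already stored.
        conn.execute(
            "INSERT OR IGNORE INTO strings (value, value_hash) VALUES (?, ?)",
            (value, digest),
        )
        row = conn.execute(
            "SELECT id FROM strings WHERE value_hash = ?", (digest,)
        ).fetchone()
    return row[0]
```

Because the database enforces uniqueness, two threads that race on the same new string cannot both insert it; the loser's insert is skipped and it reads back the winner's id.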

Javier
A: 

You can do a select for that text; if nothing is found, insert it, otherwise update. Wrap it all in a single transaction.
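
A minimal sketch of select-then-insert inside one transaction, again using Python's `sqlite3` as a stand-in for MySQL. The table `strings` and the function names are hypothetical. One caveat this sketch assumes: a transaction on its own may still let two threads both find nothing and both insert, so the transaction has to take a lock before it reads. Here that is SQLite's `BEGIN IMMEDIATE`; in MySQL/InnoDB a locking read such as `SELECT ... FOR UPDATE` would be the likely counterpart.

```python
import sqlite3


def open_tx_db(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode so the
    # transaction can be started and ended explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS strings ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " value TEXT NOT NULL)"
    )
    return conn


def get_or_create_id_tx(conn, value):
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id FROM strings WHERE value = ?", (value,)
        ).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO strings (value) VALUES (?)", (value,)
            )
            found_id = cur.lastrowid
        else:
            found_id = row[0]
        conn.execute("COMMIT")
        return found_id
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Holding the lock across both the read and the write serializes all callers, which is simple but limits throughput when many threads insert at once.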

codymanix