tags:

views: 41

answers: 3

What do you do when you need to maintain a table with unique values but can't use a UNIQUE constraint?

For example, I use MySQL and want to map my URLs to IDs. So I create a table:

CREATE TABLE url (id INTEGER PRIMARY KEY AUTO_INCREMENT, url VARCHAR(2048));

The problem is that MySQL doesn't allow a unique index on a field bigger than 1000 bytes. In general, how do you do an atomic insert-only-if-not-exists in SQL?

+1  A: 

You could use a NOT EXISTS condition:

insert  into YourTable
        (url)
select  'blah blah blah'
from    dual
where   not exists
        (
        select  *
        from    YourTable
        where   url = 'blah blah blah'
        )
Andomar
Are you sure this is atomic, i.e. if the table is large and two such queries execute at once, is it certain that two rows won't get inserted? (This is a genuine question; I'm not saying your code is wrong, I just don't know the answer...)
Adrian Smith
@Adrian Smith: Good point, and you're probably right. This would require the SERIALIZABLE isolation level (a range lock) to be reliable, and that's not what MySQL runs with by default.
Andomar
+1  A: 

In my opinion the best way to handle it is to write a trigger. The trigger checks each new value against the existing rows and raises an error if a duplicate is found. I don't think a URL will often go beyond 1000 characters, but if it does in your case, a trigger can handle the uniqueness check.
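A minimal sketch of such a trigger, assuming the url table from the question and MySQL 5.5 or later for SIGNAL; the trigger name is illustrative:

DELIMITER //
CREATE TRIGGER url_check_unique
BEFORE INSERT ON url
FOR EACH ROW
BEGIN
    -- Reject the insert if the same url is already present
    IF EXISTS (SELECT 1 FROM url WHERE url = NEW.url) THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'Duplicate url';
    END IF;
END//
DELIMITER ;

Note that, like the NOT EXISTS approach above, this check-then-insert is not guaranteed to be atomic across concurrent transactions.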

Ranhiru Cooray
+5  A: 

You could create an extra field holding a hash of the URL, e.g. MD5, and make that hash field unique. The same URL always produces the same hash, so duplicates are rejected, and with almost 100% certainty you can insert any new URL that isn't already there.
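A minimal sketch of that schema, assuming the url table from the question; the url_md5 column and key names are illustrative:

CREATE TABLE url (
    id       INTEGER PRIMARY KEY AUTO_INCREMENT,
    url      VARCHAR(2048) NOT NULL,
    url_md5  CHAR(32) NOT NULL,      -- hex MD5 of url, well under the 1000-byte index limit
    UNIQUE KEY uq_url_md5 (url_md5)
);

-- Inserting the same URL twice violates the unique key on the hash:
INSERT INTO url (url, url_md5)
VALUES ('http://example.com/page', MD5('http://example.com/page'));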

It is tempting to take a table lock instead; however, LOCK TABLES implicitly commits the transaction you are working on: http://www.databasesandlife.com/mysql-lock-tables-does-an-implicit-commit/

You could create a single-row InnoDB table, e.g. named mutex, insert a row into it, and do a SELECT ... FOR UPDATE on that row to take a lock which is compatible with transactions. It's nasty, but that's the way I do table locks in MySQL in my applications :(
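A minimal sketch of that pattern, assuming the url table from the question; the mutex table name and the example URL are illustrative:

CREATE TABLE mutex (
    id INTEGER PRIMARY KEY
) ENGINE = InnoDB;

INSERT INTO mutex (id) VALUES (1);

-- Around each check-and-insert in the application:
START TRANSACTION;
SELECT id FROM mutex WHERE id = 1 FOR UPDATE;  -- blocks other sessions doing the same
INSERT INTO url (url)
SELECT 'http://example.com/page'
FROM   dual
WHERE  NOT EXISTS (SELECT * FROM url WHERE url = 'http://example.com/page');
COMMIT;  -- releases the row lock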

Adrian Smith
+1 for the hash idea. Even if you need to enforce utter uniqueness somehow, using a hash here will help narrow down the rows to a usable handful for a longer, slower string comparison.
Matt Gibson
The hash will definitely enforce uniqueness (the same URL always produces the same hash, so a duplicate would generate a unique constraint violation on the hash column), but two different URLs could fail to coexist if they happen to hash to the same value. With a full-length hash like MD5 that's really, really unlikely; after all, `git` uses hashes to identify all commits, and if hash collisions were likely that wouldn't work.
Adrian Smith