I'm using SQL Server 2008.

I have an NVARCHAR(MAX) column in a table, and I want to make sure its values are unique. The table has 600,000 records and grows by 50,000 records every day.

Currently, before adding an item, I check whether it already exists in the table and insert it only if it does not.

IF NOT EXISTS (SELECT * FROM Softs WHERE Title = 'example example example.')
BEGIN
    INSERT INTO Softs (....)
    VALUES (...)
END

I don't have an index on the Title column.

Recently, I started getting timeouts when inserting items into the table.

What would be the correct way to maintain uniqueness?

If it would really help, I can change the NVARCHAR(MAX) to NVARCHAR(450).

+6  A: 

It's madness not to have an index.

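An index would help, but an index key can be at most 900 bytes, so an NVARCHAR(MAX) column cannot be an index key. NVARCHAR(450), at 2 bytes per character, fits that limit exactly.

However, it's likely you already have duplicates, because a second EXISTS check can run after the first EXISTS check but before the first INSERT.

Creating a unique index will tell you (it fails if duplicates exist) and will protect against this from then on.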
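A minimal sketch of that sequence, assuming the column is shrunk to NVARCHAR(450) as the question offers. The index name `UX_Softs_Title` is made up for illustration, and the script assumes any over-long or duplicate rows found by the two checks are cleaned up before the later statements run.

```sql
-- 1. Rows that would not fit in NVARCHAR(450) (450 characters * 2 bytes = 900 bytes).
SELECT COUNT(*) AS TooLong
FROM Softs
WHERE DATALENGTH(Title) > 900;

-- 2. Titles that already occur more than once.
SELECT Title, COUNT(*) AS Copies
FROM Softs
GROUP BY Title
HAVING COUNT(*) > 1;
GO

-- 3. Once both checks come back clean, shrink the column.
--    Add NULL or NOT NULL here to match the existing column definition.
ALTER TABLE Softs ALTER COLUMN Title NVARCHAR(450);
GO

-- 4. Fails with a duplicate-key error if step 2 missed anything.
CREATE UNIQUE INDEX UX_Softs_Title ON Softs (Title);
GO
```

The same index also lets the existence check seek instead of scanning 600,000 rows, which is the likely cause of the timeouts.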

However, you can get errors under heavy load.

My favoured approach for high insert volume with few duplicates is the JFDI pattern: just attempt the INSERT and swallow the duplicate-key error. It is highly concurrent.

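BEGIN TRY
    INSERT etc
END TRY
BEGIN CATCH
    -- 2601 = duplicate key in a unique index, 2627 = unique/primary key constraint violation
    IF ERROR_NUMBER() NOT IN (2601, 2627)
        RAISERROR etc
END CATCH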
gbn
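For illustration, the pattern above might be filled in as follows. This is only a sketch: it assumes a unique index on Title already exists, that Title is the only column that has to be supplied, and the variable names are invented. SQL Server 2008 has no THROW, so the unexpected error is re-raised with RAISERROR.

```sql
DECLARE @Title NVARCHAR(450) = N'example example example.';

BEGIN TRY
    -- Just attempt the insert; the unique index rejects duplicates.
    INSERT INTO Softs (Title) VALUES (@Title);
END TRY
BEGIN CATCH
    -- Ignore duplicate-key errors (2601 for a unique index,
    -- 2627 for a unique or primary key constraint); re-raise anything else.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
    BEGIN
        DECLARE @Msg   NVARCHAR(4000) = ERROR_MESSAGE(),
                @Sev   INT            = ERROR_SEVERITY(),
                @State INT            = ERROR_STATE();
        -- Pass the message as an argument so any % in it is not read as a format code.
        RAISERROR(N'%s', @Sev, @State, @Msg);
    END
END CATCH;
```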
Thanks. What kind of errors will I get? You wrote: "However, you can get errors under heavy load."
sharru
For the same reason you may already have duplicates; the only difference is that the unique index now raises an error instead of letting the duplicate in.
gbn
Do you mean that, after creating the index, inserting a duplicate item will raise an error because it is a duplicate? Or do you mean I'll get other errors?
sharru
An error on duplicates (as you would expect).
gbn
Cool, so I can do the above and put the index on the checksum column.
sharru
@sharru - `checksum` can give collisions. This is less probable with `hashbytes` according to [this article](http://www.mssqltips.com/tip.asp?tip=1868)
Martin Smith
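A rough sketch of the `hashbytes` variant, for the case where Title stays NVARCHAR(MAX). The names `TitleHash` and `UX_Softs_TitleHash` are hypothetical. It assumes every title is at most 8000 bytes (4000 characters), which is the input limit of HASHBYTES on SQL Server 2008; longer values would make the statement fail.

```sql
-- Persisted computed column holding a 20-byte SHA-1 hash of the title.
ALTER TABLE Softs
    ADD TitleHash AS CAST(HASHBYTES('SHA1', Title) AS BINARY(20)) PERSISTED;
GO

-- Uniqueness is enforced on the hash, which fits easily within the 900-byte key limit.
CREATE UNIQUE INDEX UX_Softs_TitleHash ON Softs (TitleHash);
GO
```

Note that the hash compares raw bytes, so titles differing only in case count as distinct here even if the column's collation would treat them as equal in a `Title = ...` comparison.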