views:

112

answers:

3

Howdy Folks,

I have a SQL Server 2008 database with a table for Tags. A tag is just an ID and a name. The definition of the Tag table looks like:

CREATE TABLE [dbo].[Tag](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [Name] [varchar](255) NOT NULL,
 CONSTRAINT [PK_Tag] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, 
       STATISTICS_NORECOMPUTE = OFF, 
       IGNORE_DUP_KEY = OFF, 
       ALLOW_ROW_LOCKS = ON, 
       ALLOW_PAGE_LOCKS = ON)
)

Name also has a unique index. Further, I have several processes adding data to this table at a pretty rapid rate. These processes use a stored proc that looks like:

ALTER PROC [dbo].[lg_Tag_Insert]
    @Name varchar(255)
AS


DECLARE  @ID int
SET @ID = (select ID from Tag where Name=@Name )

if @ID is null
    begin
            INSERT Tag(Name)
            VALUES (@Name)

            RETURN SCOPE_IDENTITY()

    end
else
    begin
        return @ID
    end

My issue is that, novice at concurrent database design as I am, there seems to be a race condition that occasionally causes an error saying I'm trying to insert duplicate keys (Name) into the DB. The error is:

Cannot insert duplicate key row in object 'dbo.Tag' with unique index 'IX_Tag_Name'.

This makes sense; I'm just not sure how to fix it. If it were application code, I would know how to lock the right areas. SQL Server is quite a different beast.

First question: what is the proper way to code this "check, then insert" pattern? It seems I need an exclusive lock on the row during the check, rather than a shared lock, but it's not clear to me what the best way to do that is. Any pointer in the right direction will be greatly appreciated. Thanks in advance.

A: 

The proper code would be:

  • A stored proc, running preferably under a serializable transaction
  • First do a select against the tags table to retrieve the ID
  • If it is null, insert.

The transaction isolation level will make sure the transactions are serialized.
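The steps above can be sketched in T-SQL like this. This is a sketch against the OP's schema, not tested code; instead of setting the isolation level for the whole session, it uses the UPDLOCK and HOLDLOCK table hints, which take and hold the same key-range lock on the checked range that serializable isolation would, so no other transaction can insert the same Name between the check and the insert:

```sql
ALTER PROC [dbo].[lg_Tag_Insert]
    @Name varchar(255)
AS
DECLARE @ID int

BEGIN TRAN
    -- UPDLOCK + HOLDLOCK: take an update lock on the matching key range
    -- and hold it until the transaction ends, closing the gap between
    -- the existence check and the insert.
    SELECT @ID = ID
    FROM Tag WITH (UPDLOCK, HOLDLOCK)
    WHERE Name = @Name

    IF @ID IS NULL
    BEGIN
        INSERT Tag(Name) VALUES (@Name)
        SET @ID = SCOPE_IDENTITY()
    END
COMMIT TRAN

RETURN @ID
```

Keeping the transaction as short as this matters: the range lock blocks other writers of the same Name until the commit.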

Cache tags client side so you don't insert when the client already knows they are there. The performance impact will be minimal.

It looks like you do that, so the only problem may be your transaction isolation level.

What you could do is:

  • Use a separate connection for inserting tags.
  • When you get a duplicate-key error, ignore it, query for the ID, and use that ID. Since you are on a separate connection, the failed insert does not matter.
TomTom
+1  A: 

I prefer output parameters (so I coded it that way), but this should perform fastest, with the fewest hits on the table:

ALTER PROC [dbo].[lg_Tag_Insert]
    @Name varchar(255)
   ,@ID   int  OUTPUT
AS
BEGIN TRY
    SET @ID=NULL
    -- Optimistically insert; the unique index on Name rejects duplicates
    INSERT Tag (Name) VALUES (@Name)
    SET @ID=SCOPE_IDENTITY()
END TRY
BEGIN CATCH
    -- Insert failed (duplicate key), so the row must exist: look it up
    SELECT @ID=ID from Tag where Name=@Name
END CATCH

IF @ID IS NULL
BEGIN
    RETURN 1  -- unexpected failure
END

RETURN 0  -- success; @ID holds the tag's ID

GO
KM
A: 

I have found the best results in tables with heavy inserts by setting the constraint to "Ignore Duplicates" and letting the dupes fall to the floor while capturing the new inserts. For your sproc, you could eliminate the test completely and return SCOPE_IDENTITY() or null after the insert.
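Assuming the unique index is named IX_Tag_Name (as the error message in the question suggests), the "Ignore Duplicates" setting described above corresponds to rebuilding the index with IGNORE_DUP_KEY, which turns a duplicate insert into a warning that silently discards the row instead of raising an error:

```sql
-- Rebuild the existing unique index in place with IGNORE_DUP_KEY;
-- index name and columns assumed from the question's error message.
CREATE UNIQUE NONCLUSTERED INDEX IX_Tag_Name
    ON dbo.Tag (Name)
    WITH (IGNORE_DUP_KEY = ON, DROP_EXISTING = ON);
```

Note that when a duplicate insert is discarded this way, the proc still has no ID for the pre-existing row, so you would likely still need a follow-up SELECT by Name to fetch the existing tag's ID.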

Steve Ledridge