Is there a way to insert a record into a table only if the table does not already contain that record?

Is there a query that will do this, or will I need a stored procedure?

+5  A: 
IF NOT EXISTS 
    (SELECT {Columns} 
    FROM {Table} 
    WHERE {Column1 = SomeValue AND Column2 = SomeOtherValue AND ...}) 
INSERT INTO {Table} {Values}
Matthew Jones
-1, as this is **the** sure-fire way to get duplicate-key errors under high load. The SELECT and the INSERT run at different times, and nothing prevents two concurrent threads from attempting to insert the same value. MERGE, as posted by Martin, is a proper solution.
Remus Rusanu
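For readers who want to keep the `IF NOT EXISTS` shape, one commonly used way to close the gap described in the comment above is to run the check and the insert in a single transaction, with `UPDLOCK` and `HOLDLOCK` hints on the existence check. A minimal sketch, assuming a hypothetical table `dbo.Demo` with a primary key on `id` (the table and variable names are illustrative and do not come from the thread):

```sql
-- Hypothetical table and values, for illustration only.
CREATE TABLE dbo.Demo (
    id int NOT NULL PRIMARY KEY,
    C  varchar(200) NOT NULL
);

DECLARE @id int = 3;
DECLARE @C varchar(200) = 'C';

BEGIN TRANSACTION;

-- UPDLOCK + HOLDLOCK keep the checked key range locked until the
-- transaction ends, so a second session running the same check waits
-- here instead of also concluding that the row is missing.
IF NOT EXISTS (SELECT 1 FROM dbo.Demo WITH (UPDLOCK, HOLDLOCK) WHERE id = @id)
BEGIN
    INSERT INTO dbo.Demo (id, C) VALUES (@id, @C);
END

COMMIT TRANSACTION;
```

The trade-off is extra blocking between sessions that target the same key. A primary key or unique constraint on the column is still worth keeping as the final safeguard.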
+7  A: 

You don't say which version of SQL Server you are using. On SQL Server 2008 or later you can use MERGE.

NB: MERGE is usually used for an upsert, which is what I originally thought the question was asking for. It is also valid with only a WHEN NOT MATCHED clause and no WHEN MATCHED clause, so it works for this case too. Example usage:

CREATE TABLE #A(
    [id] [int] NOT NULL PRIMARY KEY CLUSTERED,
    [C] [varchar](200) NOT NULL);

MERGE #A AS target
USING (SELECT 3, 'C') AS source (id, C)
ON (target.id = source.id)
/* Uncomment for upsert semantics:
WHEN MATCHED THEN
    UPDATE SET C = source.C */
WHEN NOT MATCHED THEN
    INSERT (id, C)
    VALUES (source.id, source.C);

In terms of execution cost, MERGE and Matthew's approach look roughly equal when an insert has to be done...

Link to plan images for first run

but on the second run, when there is no insert to be done, Matthew's answer looks cheaper. I'm not sure whether there is a way to improve this.

Link to plan images for second run

Test Script

select * 
into #testtable
from master.dbo.spt_values

CREATE UNIQUE CLUSTERED INDEX [ix] ON #testtable([type] ASC,[number] ASC,[name] ASC)


declare @name nvarchar(35)= 'zzz'
declare @number int = 50
declare @type nchar(3) = 'A'
declare @low int
declare @high int
declare @status int = 0;



MERGE #testtable AS target
USING (SELECT @name, @number, @type, @low, @high, @status) AS source (name, number, [type], low, high, [status])
ON (target.[type] = source.[type] AND target.[number] = source.[number] and target.[name] = source.[name] )
WHEN NOT MATCHED THEN    
INSERT (name, number, [type], low, high, [status])
VALUES (source.name, source.number, source.[type], source.low, source.high, source.[status]);

set @name = 'yyy'

IF NOT EXISTS
    (SELECT *
    FROM #testtable
    WHERE [type] = @type AND [number] = @number AND name = @name)
BEGIN
    INSERT INTO #testtable
        (name, number, [type], low, high, [status])
    VALUES (@name, @number, @type, @low, @high, @status);
END
Martin Smith
I'm actually not sure which version I'm currently using. How would the merge work?
Mega Matt
@Mega - I've updated my answer with an example of how `MERGE` could be used for this case. I'll check the execution stats to see if there is any difference between the two approaches.
Martin Smith
The problem is correctness, not performance. `IF NOT EXISTS (SELECT ...) INSERT` will cause duplicate-key errors under load, guaranteed.
Remus Rusanu
@Remus - Good point. Thanks, I hadn't considered that aspect of it at all.
Martin Smith
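On the correctness point raised in these comments: it is widely reported that a plain `MERGE` can still hit duplicate-key errors under concurrency, and the usual recommendation is to add a `HOLDLOCK` hint on the target. A sketch under that assumption, reusing the `#A` table from the answer above (the hint and the existence guard are additions, not part of the original answer; a temp table is private to one session, so this only illustrates the syntax):

```sql
-- Create the example table only if this session does not have it yet.
IF OBJECT_ID('tempdb..#A') IS NULL
    CREATE TABLE #A (
        [id] int NOT NULL PRIMARY KEY CLUSTERED,
        [C]  varchar(200) NOT NULL
    );

-- Same statement as in the answer, with HOLDLOCK added on the target so the
-- "not matched" check and the insert are protected by a range lock.
MERGE #A WITH (HOLDLOCK) AS target
USING (SELECT 3, 'C') AS source (id, C)
ON (target.id = source.id)
WHEN NOT MATCHED THEN
    INSERT (id, C)
    VALUES (source.id, source.C);
```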
+1  A: 

In short, you need a source that is guaranteed to return exactly one row:

Insert dbo.Table (Col1, Col2, Col3, ....)
Select 'Value1', 'Value2', 'Value3', ....
From Information_Schema.Tables
Where Table_Schema = 'dbo'
    And Table_Name = 'Table'
    And Not Exists  (
                    Select 1
                    From dbo.Table
                    Where Col1 = 'Foo'
                        And Col2 = 'Bar'
                        And ....
                    )

I've seen this variation in the wild as well:

Insert Table (Col1, Col2, Col3, ....)
Select 'Value1', 'Value2', 'Value3', ....
From    (
        Select 1 As Num
        ) As Z
Where Not Exists    (
                    Select 1
                    From Table
                    Where Col1 = 'Foo'
                        And Col2 = 'Bar'
                        And ....
                    )
Thomas
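In SQL Server, a `SELECT` with a `WHERE` clause does not need a `FROM` clause, so the one-row helper source used in the answer above can usually be dropped. A minimal sketch, assuming a hypothetical table `dbo.DemoRows` (the table and column names are illustrative only):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.DemoRows (
    Col1 varchar(50) NOT NULL,
    Col2 varchar(50) NOT NULL
);

-- A SELECT without FROM produces one candidate row; the WHERE clause
-- filters it out when a matching row already exists.
INSERT INTO dbo.DemoRows (Col1, Col2)
SELECT 'Foo', 'Bar'
WHERE NOT EXISTS (
    SELECT 1
    FROM dbo.DemoRows
    WHERE Col1 = 'Foo'
      AND Col2 = 'Bar'
);
```

Like the other check-then-insert forms in this thread, this does not by itself rule out duplicates when two sessions run it at the same time.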
A: 

I have to vote for adding a unique CONSTRAINT. It's the simplest and most robust answer. Looking at how complicated the other answers are, I'd say they're much harder to get right (and keep right).

The downsides: [1] it's not obvious from reading the code that uniqueness is enforced in the DB; [2] the client code has to know to catch an exception. In other words, the guy coming after you might wonder, "How did this ever work?"

That aside: I used to worry that throwing and catching the exception was a performance hit, but I did some testing (on SQL Server 2005) and it wasn't significant.

egrunin
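The constraint approach in the answer above could look roughly like this when the exception is handled in T-SQL instead of in client code. This is a sketch only: the table `dbo.DemoItems` and the constraint name are hypothetical, and the syntax is kept to what SQL Server 2005 supports.

```sql
-- Hypothetical table; the unique constraint is what actually prevents duplicates.
CREATE TABLE dbo.DemoItems (
    id int NOT NULL,
    C  varchar(200) NOT NULL,
    CONSTRAINT UQ_DemoItems_id UNIQUE (id)
);

BEGIN TRY
    INSERT INTO dbo.DemoItems (id, C) VALUES (3, 'C');
END TRY
BEGIN CATCH
    -- 2627 = violation of a primary key or unique constraint,
    -- 2601 = duplicate key in a unique index.
    -- Those two are swallowed here; anything else is re-raised.
    IF ERROR_NUMBER() NOT IN (2627, 2601)
    BEGIN
        DECLARE @msg nvarchar(2048);
        SET @msg = ERROR_MESSAGE();
        RAISERROR('%s', 16, 1, @msg);
    END
END CATCH;
```

Running the `INSERT` block a second time leaves the table unchanged and raises no error, because the duplicate-key failure is caught and ignored.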