tags:

views:

3313

answers:

7

Given a table such as:

CREATE TABLE dbo.MyTestData (testdata varchar(50) NOT NULL) 

ALTER TABLE dbo.MyTestData WITH NOCHECK ADD CONSTRAINT [PK_MyTestData] PRIMARY KEY  CLUSTERED (testdata)

And given that we want a unique list of 'testdata' when we are done gathering items to be added from a list of external data with known duplicates... When performing an insert stored procedure should the procedure be written to test for existence or should it just allow for error? What's the most common practice? I've always performed the test for existence but was debating this last night...

CREATE PROCEDURE dbo.dmsInsertTestData @ptestdata VarChar(50)
AS
  SET NOCOUNT ON

  IF NOT EXISTS(SELECT testdata FROM dbo.MyTestData WHERE testdata=@ptestdata)
  BEGIN
    INSERT INTO dbo.MyTestData (testdata ) VALUES (@ptestdata)
  END

RETURN 0

or just capture/ignore PK violation errors when executing this one?

CREATE PROCEDURE dbo.dmsInsertTestData @ptestdata VarChar(50)
AS
  SET NOCOUNT ON
  INSERT INTO dbo.MyTestData (testdata ) VALUES (@ptestdata)
RETURN 0
A: 

I think most programmers would suggest avoiding the exception. I'm not sure from a performance standpoint in T-SQL, but in .NET for instance, i believe a thrown exception is more costly than an additional if/else statement.

alex
A: 

My concern with the first example you gave is that it isn't returning an error to the user. It can be fixed to do so, but I wouldn't use it unless it was returning an error.

If your concern between the two possibilties is performance on large tables, I suggest you test both of them and see if one is significantly faster than the other. If the if select is particularly complicated and the insert will need to happen by far the majority of the time, it is possible that simply letting it fail would be faster most of the time. If onthe other hand, the possibility of a bad input is high and the if is relatively uncomplicated as shown here, then the other process might be a better one. But only real testing on your real data structure and data and with your real queries can tell you which is the better one for performance as it may be differnt in deiffernt situaltions.

HLGEM
A: 

I believe it depends on the nature of the stored procedure. Basically, you should handle errors if you have something to do with them (or to encapsulate them for the clients of the procedure) and leave them propagating if you don't have anything to do with them and can't make it friendlier for other layers of the application.

If the stored procedure is designed to insert raw data, I think it should leave the application to handle the possible errors. If the stored procedure is designed as a layer of abstraction (and does a specific task as opposed to running a specific statement) and can either handle the error and do something with it or can report it in a graceful manner (for example, well defined error codes) to the application, it should do so. Otherwise, it should be up to the application to make sure it's not inserting duplicate data, not the database (the database has already enforced this with primary keys).

Mehrdad Afshari
A: 

In order to be user-friendly, it is often a good practice to perform the SELECT, and if the record already exists, offer the user the opportunity to view and/or edit it.

For example, if a user is adding a new Customer record, they might want to review the information that has already been displayed for that Customer. They might have additional information to add to the record, such as a phone number.

In this kind of scenario, refusing to add the record is less helpful than offering the ability to view the existing duplicate.

DOK
+2  A: 

Your check for errors (i.e. "IF NOT EXISTS ...") may or may not work, because there's a potential race condition (if another transaction inserts the record after your IF NOT EXISTS statement but before your INSERT statement).

Therefore, whether or not you check before, you ought to code your INSERT statement as if it might fail.

Whether you want to check as well (not, instead) is up to you and up to your UI.

ChrisW
If the database vendor implements Serializable Isolation properly, then setting the Isolatoon level to serializable prevents the race from happening (at the cost of concurrency of course)
Charles Bretana
A: 

I always do it in one statement:

INSERT INTO dbo.MyTestData (testdata ) VALUES (@ptestdata)
WHERE NOT EXISTS(SELECT 1 FROM dbo.MyTestData WHERE testdata=@ptestdata)
RedFilter
A: 

Ignore the error. It's much quicker.

Andrew