views:

275

answers:

5

My issue is that I want to run a query to check if the data I am using will be a duplicate of something already in the database, for example I have a Product Code and want to stop the same item being listed multiple times, and this Product Code is the Primary Key.
Can I avoid doing this:

SELECT ProductCode FROM tblProducts WHERE ProductCode = '12345'

And then checking for any returned records, or is there a more elegant solution, I don't mind using this method but it seems clunky to be. This is just some extra validation to my code as the insertion of a product should only happen once - but I want error checking on this as duplicates have been entered due to refresh or back button use (web-based system).

Wonder if this is the P = NP type problem or am I overthinking the problem?
My Database is MS SQL Server 2000 if this helps.

A: 

I guess it depends on how often this might happen.

If it happens extremely rarely, I would just trap the possibility of a violation and if one occurs, inform the user accordingly.

If it could occur fairly frequently, I would probably first check to see whether my new item is OK, and only if so actually to the insert. On the other hand, if that appears to happen fairly frequently, then there must be something wrong with your app logic, I'd say - you should really have a sure-fire way to create and assign primary keys that are unique (and thus never cause insert conflicts), e.g. by using a INT IDENTITY column or some other means.

Marc

marc_s
This problem is relatively rare and occurs in only a few situations. I may adjust the program logic but this occurs across multiple pages for an normal application the existing code would work fine - webbased seems to introduce problems like back button and refresh which a normal application would never face - and better ways of dealing with exceptions too!
RoguePlanetoid
+2  A: 

Simply allow SQL to test first before inserting

IF NOT EXISTS (SELECT * FROM tblProducts WHERE tblProducts = '12345')
    INSERT tblProducts (tblProducts, columnlist, ...)
    VALUES ('12345', valuelist, ...)

This is better wrapped in a stored procedure so it's self contained in the database (but everyone has their own view on this).

gbn
This would be my choice however the duplicate situation is an exception I want to handle, as the product codes are barcode numbers and their clash is detected earlier but this is an additional check I want to make.
RoguePlanetoid
+7  A: 

I would recommend just letting the SQL insertion fail and let SQL Server through a primary key violation error and handle it. SQL Server maintains an index and they have been doing it for years. I am pretty sure that their implementation is going to be many times better than most developers out there. If you are trying to program for that you will need to read in all the keys and keep them in an index somewhere. SQL Server is already doing that. So then you have double checking and double the amount of space needed.

The only time I would consider doing this if the connection to your database is very slow for some or other reason.

Otherwise let SQL Server do what it is good and you do the rest :)

uriDium
This method is recommended in "The Art of SQL". http://www.amazon.com/Art-SQL-Stephane-Faroult/dp/0596008945
jms
This seems the best option as the error occurs I just want it to stop there and this option allows this and an error message to be shown to indicate that a duplicate it being added.
RoguePlanetoid
This is an acceptable solution, except the poster asked about dealing with back buttons.
Nathan Feger
assume you mean 'let SQL Server _throw_ a primary key violation error'
nekomatic
SQL is going to do the lookup anyway, so you shouldn't double the effort by doing it yourself beforehand. Let the error occur.
johnnycrash
+1  A: 

You could just group by the column when you do the extract that way if it is in there duplicate you would only get the one row. If you are returning multiple columns just group by all the columns you are returning.

Avitus
A: 

Perhaps the problem is in the design of your web app? Are you making sure that data is mutated in your database only through http POST's. Then make sure that those page responses have an expire time of zero or now ( I cannot recall if it is date based or millisecond based). This way users who hit back will see the "Do you want to repost form data" message, or get a 'Page expired' link from the browser.

Next any page that accepts user input needs to be fault tolerant. It seems like your primary key is entered by your users. However, even if it is not it is reasonable to have a unique key that is entered, rather than computed data.

What is your generic strategy for handling that situation? (It is not p = np) This problem faces many many apps of the database genre. It seems other posters have offered some solutions to that problem.

Good luck

Nathan Feger