views:

665

answers:

3

If I have a list of items, say

apples
pairs
pomegranites

and I want to identify any that don't exist in the 'fruit' column in an SQL DB table.

  • Fast performance is the main concern.
  • Needs to be portable over different SQL implementations.
  • The input list could contain an arbitrary number of entries.

I can think of a few ways to do it, thought I'd throw it out there and see what you folks think.

A: 
if exists(select top 1 name from fruit where name in ('apples', 'pairs', 'pomegranates'))
  PRINT 'one exists'
scottm
That only says a fruit exists, not the list of fruits that DON'T exist.
Robert C. Barth
I don't see where he says he needs the rest of the fruits in the table, just whether or not one of this does exist.
scottm
Sorry, I need to identify the fruits that are in my list but don't exist in the table.
Brabster
+1  A: 

Make the search list into a string that looks like '|fruit1|fruit2|...fruitn|' and make your where clause:

where
  @FruitListString not like '%|' + fruit + '|%'

Or, parse the aforementioned string into a temp table or table variable and do where not in (select fruit from temptable). Depending on the number of items you're searching for and the number of items being searched, this method could be faster.

Robert C. Barth
+1 for the innovative Joel Spolsky solution! http://stackoverflow.com/questions/337704/parameterizing-a-sql-in-clause#337817
Bill Karwin
+6  A: 

Since the list of fruits you are selecting from can be arbitrarily long, I would suggest the following:

create table FruitList (FruitName char(30))
insert into FruitList values ('apples'), ('pears'), ('oranges')

select * from FruitList left outer join AllFruits on AllFruits.fruit = FruitList.FruitName
where AllFruits.fruit is null

A left outer join should be much faster than "not in" or other kinds of queries.

Kluge
+1 but FWIW you need each tuple parenthesized separately: VALUES ('apples'), ('pears'), ('oranges'), ...
Bill Karwin
Thanks @Bill. I don't use VALUES much, should have checked my reference first. Fixed.
Kluge
Probably the best option. if you don't want to create a table, you can create a table variable instead to do the same thing and it will be destroyed when your query ends.
scottm
Sneaky work around for inserting a bunch of test records when the values clause only lets you do one at a timeinsert mytable(field)select 'apples'union allselect 'Pears'union allselect 'oranges'
HLGEM