I have two lists of ids, List A and List B. Both lists are actually the results of SQL queries (QUERY A and QUERY B respectively).

I want to 'filter' List A by removing any ids that also appear in List B.

So for example if list A looks like this:

1, 2, 3, 4, 7

and List B looks like this:

2,7

then the 'filtered' List A should have ids 2 and 7 removed, and so should look like this:

1, 3, 4

I want to write an SQL query like this (pseudo code of course):

SELECT id FROM (QUERYA) AS temp_table WHERE id NOT IN (QUERYB)

+2  A: 

see Doing INTERSECT and MINUS in MySQL

ArsenMkrt
he is using MySQL, not MSSQL
dfa
I changed the reference :)
ArsenMkrt
+1  A: 

The query:

select id 
from ListA
where id not in (
    select id 
    from ListB)

will give you the desired result.
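
To check this against the example in the question, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for MySQL, and it treats ListA and ListB as plain one-column tables; in the real case each would be replaced by QUERY A and QUERY B as subqueries. The derived-table alias temp_table is taken from the question's pseudo code.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ListA (id INTEGER);
    CREATE TABLE ListB (id INTEGER);
    INSERT INTO ListA (id) VALUES (1), (2), (3), (4), (7);
    INSERT INTO ListB (id) VALUES (2), (7);
""")

# The inner SELECTs play the role of QUERY A and QUERY B from the question.
query = """
    SELECT id
    FROM (SELECT id FROM ListA) AS temp_table
    WHERE id NOT IN (SELECT id FROM ListB)
    ORDER BY id
"""
filtered_ids = [row[0] for row in conn.execute(query)]
print(filtered_ids)  # [1, 3, 4]
```

So the pseudo code from the question is already close to valid SQL once the two placeholders are replaced by real subqueries.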

Rashmi Pandit
+3  A: 

Using classic SQL:

select [distinct] number
from list_a
where number not in (
    select distinct number from list_b
);

I've put the first "distinct" in square brackets since I'm unsure as to whether you wanted duplicates removed (remove either the brackets or the entire word). The second "distinct" should be left in just in case your DBMS doesn't optimize IN clauses.

It may be faster (measure, don't guess) with a left join along the lines of:

select [distinct] list_a.number from list_a
left join list_b on list_a.number = list_b.number
where list_b.number is null;

Same deal with the "[distinct]".
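
A small sketch of what the optional "distinct" changes in the left join version. It assumes SQLite via Python's sqlite3 module rather than MySQL, and the sample rows (including the duplicated 1 and 7) are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in DBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE list_a (number INTEGER);
    CREATE TABLE list_b (number INTEGER);
    INSERT INTO list_a (number) VALUES (1), (1), (2), (3), (7);
    INSERT INTO list_b (number) VALUES (2), (7), (7);
""")

# Left join, then keep only list_a rows that found no partner in list_b.
anti_join = """
    SELECT {distinct} list_a.number
    FROM list_a
    LEFT JOIN list_b ON list_a.number = list_b.number
    WHERE list_b.number IS NULL
    ORDER BY list_a.number
"""

with_duplicates = [r[0] for r in conn.execute(anti_join.format(distinct=""))]
without_duplicates = [r[0] for r in conn.execute(anti_join.format(distinct="DISTINCT"))]

print(with_duplicates)     # [1, 1, 3]
print(without_duplicates)  # [1, 3]
```

Duplicates in list_b do not matter here, because any row that matches at all is dropped by the "is null" filter; only duplicates in list_a survive into the result.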

paxdiablo
I think in your second query you meant to use a left OUTER join
Dave Costa
Sorry, ignore my previous comment -- it's irrelevant.
Dave Costa
+1  A: 

I am not sure which way is best. In my experience, the performance can differ a lot depending on the situation and the size of the tables.

1.

select id 
from ListA
where id not in (
    select id 
    from ListB)

2.

select ListA.id 
from ListA
left join ListB on ListA.id=ListB.id
where ListB.id is null

3.

select id 
from ListA
where not exists (
    select * 
    from ListB where ListB.id=ListA.id)

2) should usually be the fastest, as it uses a left join rather than a sub-query.

Some people may suggest 3) rather than 1) because it uses "exists", which only checks whether a matching row is present and does not need to read data from the table.
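
Speed aside, the three variants are not completely interchangeable. The sketch below runs all three on the data from the question and then again after adding a NULL to ListB. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL; the helper run_all and the NULL row are made up for illustration. With standard SQL NULL semantics, "not in" returns no rows once the sub-query yields a NULL, while the left join and "not exists" versions are unaffected.

```python
import sqlite3

QUERIES = {
    "not_in": """
        SELECT id FROM ListA
        WHERE id NOT IN (SELECT id FROM ListB)
        ORDER BY id""",
    "left_join": """
        SELECT ListA.id FROM ListA
        LEFT JOIN ListB ON ListA.id = ListB.id
        WHERE ListB.id IS NULL
        ORDER BY ListA.id""",
    "not_exists": """
        SELECT id FROM ListA
        WHERE NOT EXISTS (SELECT * FROM ListB WHERE ListB.id = ListA.id)
        ORDER BY id""",
}


def run_all(conn):
    """Hypothetical helper: run each variant and collect its ids."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in QUERIES.items()}


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ListA (id INTEGER);
    CREATE TABLE ListB (id INTEGER);
    INSERT INTO ListA (id) VALUES (1), (2), (3), (4), (7);
    INSERT INTO ListB (id) VALUES (2), (7);
""")

before_null = run_all(conn)   # all three variants agree: [1, 3, 4]

conn.execute("INSERT INTO ListB (id) VALUES (NULL)")
after_null = run_all(conn)    # "not_in" now returns [], the others [1, 3, 4]

print(before_null)
print(after_null)
```

If the id column in ListB can never be NULL, all three give the same rows and the choice comes down to what the DBMS executes fastest on the real data.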

Dennis Cheung