I recently had to solve this problem, and I have needed this information many times in the past, so I thought I would post it. Given the two table definitions below, how would you write a query that finds all differences between the two tables?

Table definitions:

CREATE TABLE feed_tbl
(
code varchar(15),
name varchar(40),
status char(1),
-- UPDATE is a reserved word in most SQL dialects, so the column name is quoted
"update" char(1),
CONSTRAINT feed_tbl_PK PRIMARY KEY (code)
);

CREATE TABLE data_tbl
(
code varchar(15),
name varchar(40),
status char(1),
"update" char(1),
CONSTRAINT data_tbl_PK PRIMARY KEY (code)
);

Here is my solution: a view built from three queries combined with UNION. The diff_type column indicates how data_tbl needs to change to match feed_tbl: the row must be deleted from data_tbl (2), updated in data_tbl (1), or added to data_tbl (0).

CREATE VIEW delta_vw AS (
-- diff_type 0: rows in feed_tbl that are missing from data_tbl (to be added)
SELECT     feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl."update", 0 AS diff_type
FROM         feed_tbl LEFT OUTER JOIN
                      data_tbl ON feed_tbl.code = data_tbl.code
WHERE     (data_tbl.code IS NULL)

UNION

-- diff_type 1: rows in both tables whose non-key columns differ (to be updated)
SELECT     feed_tbl.code, feed_tbl.name, feed_tbl.status, feed_tbl."update", 1 AS diff_type
FROM         data_tbl INNER JOIN
                      feed_tbl ON data_tbl.code = feed_tbl.code
WHERE     (feed_tbl.name <> data_tbl.name) OR
(data_tbl.status <> feed_tbl.status) OR
(data_tbl."update" <> feed_tbl."update")

UNION

-- diff_type 2: rows in data_tbl that are missing from feed_tbl (to be deleted)
-- data_tbl must be the preserved side of the join, otherwise feed_tbl.code is never NULL
SELECT     data_tbl.code, data_tbl.name, data_tbl.status, data_tbl."update", 2 AS diff_type
FROM         data_tbl LEFT OUTER JOIN
                      feed_tbl ON data_tbl.code = feed_tbl.code
WHERE     (feed_tbl.code IS NULL)

)
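
As a rough sketch of how the diff_type values could be consumed, the statements below bring data_tbl in line with feed_tbl. This is an illustration under assumptions, not part of the original solution: it assumes delta_vw returns all three diff types as described, that the statements run inside one transaction, and that the reserved column name is written as "update" (SQL Server also accepts [update]).

```sql
-- diff_type = 2: rows that exist only in data_tbl are removed.
DELETE FROM data_tbl
WHERE code IN (SELECT code FROM delta_vw WHERE diff_type = 2);

-- diff_type = 1: rows present in both tables take the values from feed_tbl.
UPDATE data_tbl
SET name     = (SELECT f.name     FROM feed_tbl f WHERE f.code = data_tbl.code),
    status   = (SELECT f.status   FROM feed_tbl f WHERE f.code = data_tbl.code),
    "update" = (SELECT f."update" FROM feed_tbl f WHERE f.code = data_tbl.code)
WHERE code IN (SELECT code FROM delta_vw WHERE diff_type = 1);

-- diff_type = 0: rows that exist only in feed_tbl are added.
INSERT INTO data_tbl (code, name, status, "update")
SELECT code, name, status, "update"
FROM delta_vw
WHERE diff_type = 0;
```

The three statements touch disjoint sets of codes, so their order does not change the result.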
A: 
I would use a minor variation in the WHERE clause of the second query in the union:

    WHERE (ISNULL(feed_tbl.name, 'NONAME') <> ISNULL(data_tbl.name, 'NONAME'))
       OR (ISNULL(data_tbl.status, 'NOSTATUS') <> ISNULL(feed_tbl.status, 'NOSTATUS'))
       OR (ISNULL(data_tbl."update", '12/31/2039') <> ISNULL(feed_tbl."update", '12/31/2039'))

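This is needed because NULL does not equal NULL in SQL (SQL Server included): comparing anything with NULL yields UNKNOWN rather than true, so a plain <> never flags a row where one side is NULL. The sentinel values must be ones that can never occur as real data and that fit the column type. In SQL Server, ISNULL returns the type of its first argument, so a sentinel longer than the column (such as 'NOSTATUS' for a char(1) column) is truncated.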
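
To make the NULL behaviour concrete, here is a small sketch. The first statement shows that NULL = NULL is not true under standard (ANSI) NULL handling. The second is one possible sentinel-free way to test whether name differs; it swaps ISNULL for explicit IS NULL checks, and the aliases f and d are just illustrative.

```sql
-- A comparison with NULL yields UNKNOWN, and CASE treats UNKNOWN as "not true",
-- so this returns 'not equal or unknown' under standard NULL handling.
SELECT CASE WHEN NULL = NULL THEN 'equal' ELSE 'not equal or unknown' END AS null_check;

-- Rows whose name differs, treating two NULLs as equal and
-- NULL versus a real value as different. No sentinel value is needed.
SELECT f.code
FROM feed_tbl f
INNER JOIN data_tbl d ON d.code = f.code
WHERE f.name <> d.name
   OR (f.name IS NULL AND d.name IS NOT NULL)
   OR (f.name IS NOT NULL AND d.name IS NULL);
```

The same three-part condition can be repeated for status and the update column.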

wcm
A: 

You could also use a FULL OUTER JOIN with a CASE ... END expression for the diff_type column, along with the WHERE clause from this answer: http://stackoverflow.com/questions/30985/querying-2-tables-with-the-same-spec-for-the-differences#31043

That would probably achieve the same results, but in one query.
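
A minimal sketch of that idea follows. It is one possible reading of the suggestion, not a tested drop-in replacement for the view. It assumes a database that supports FULL OUTER JOIN, writes the reserved column name as "update", and uses explicit IS NULL checks instead of ISNULL sentinels so that no sentinel has to fit the char(1) columns.

```sql
SELECT
    COALESCE(f.code, d.code) AS code,
    -- Report the feed values when the row exists in feed_tbl, else the data values.
    CASE WHEN f.code IS NULL THEN d.name     ELSE f.name     END AS name,
    CASE WHEN f.code IS NULL THEN d.status   ELSE f.status   END AS status,
    CASE WHEN f.code IS NULL THEN d."update" ELSE f."update" END AS "update",
    CASE
        WHEN d.code IS NULL THEN 0   -- only in feed_tbl: add to data_tbl
        WHEN f.code IS NULL THEN 2   -- only in data_tbl: delete from data_tbl
        ELSE 1                       -- in both tables, but a column differs
    END AS diff_type
FROM feed_tbl f
FULL OUTER JOIN data_tbl d ON d.code = f.code
WHERE f.code IS NULL
   OR d.code IS NULL
   OR f.name <> d.name
   OR (f.name IS NULL AND d.name IS NOT NULL)
   OR (f.name IS NOT NULL AND d.name IS NULL)
   OR f.status <> d.status
   OR (f.status IS NULL AND d.status IS NOT NULL)
   OR (f.status IS NOT NULL AND d.status IS NULL)
   OR f."update" <> d."update"
   OR (f."update" IS NULL AND d."update" IS NOT NULL)
   OR (f."update" IS NOT NULL AND d."update" IS NULL);
```

Because code is the primary key in both tables, a NULL f.code or d.code can only come from the outer join, which is what makes the CASE on diff_type reliable.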

hova
+2  A: 

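UNION removes duplicate rows, so UNION the two tables together and then look for any code that appears more than once: identical rows collapse into one, so a code that still has more than one row must differ between the tables. Given "code" as a primary key, you can say: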

edit 0: modified to include differences in the PK field itself

edit 1: if you use this in real life, be sure to list the actual column names. Don't use dot-star, since the UNION operation requires both result sets to have exactly matching columns. This example would break if you added or removed a column from one of the tables.

select dt.*
from
  data_tbl dt
 ,( 
  select code
  from
    (        
    select * from feed_tbl
    union
    select * from data_tbl        
    ) combined  --most databases require an alias on a derived table
  group by code
  having count(*) > 1    
  ) diffs  --"diffs" will return all differences *except* those in the primary key itself 
where diffs.code = dt.code
union  --plus the ones that are only in feed, but not in data
select * from feed_tbl ft where not exists(select code from data_tbl dt where dt.code = ft.code)
union  --plus the ones that are only in data, but not in feed
select * from data_tbl dt where not exists(select code from feed_tbl ft where ft.code = dt.code)
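
Following the advice in edit 1, the same query with explicit column lists might look like the sketch below. The alias combined and the quoted "update" column name are assumptions added for portability; the logic is otherwise the same as above.

```sql
-- Codes whose rows differ between the two tables (reported with data_tbl values)
SELECT dt.code, dt.name, dt.status, dt."update"
FROM data_tbl dt
INNER JOIN (
    SELECT code
    FROM (
        SELECT code, name, status, "update" FROM feed_tbl
        UNION
        SELECT code, name, status, "update" FROM data_tbl
    ) combined
    GROUP BY code
    HAVING COUNT(*) > 1
) diffs ON diffs.code = dt.code
UNION
-- plus the rows that are only in feed_tbl
SELECT ft.code, ft.name, ft.status, ft."update"
FROM feed_tbl ft
WHERE NOT EXISTS (SELECT 1 FROM data_tbl dt WHERE dt.code = ft.code)
UNION
-- plus the rows that are only in data_tbl
SELECT dt.code, dt.name, dt.status, dt."update"
FROM data_tbl dt
WHERE NOT EXISTS (SELECT 1 FROM feed_tbl ft WHERE ft.code = dt.code);
```

UNION treats two NULLs as duplicates when it removes rows, so this approach compares nullable columns without any sentinel values.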
JosephStyons