When two sets are given

s1 = {a, b, c, d}, s2 = {b, c, d, a}

(i.e.)

TableA

Item
a
b
c
d

TableB

Item
b
c
d
a

How can I write a SQL query to display "Elements in tableA and tableB are equal"? [Without using an SP or UDF]

Output

Elements in TableA and TableB contains identical sets
+2  A: 

My monstrosity:

;with SetA as
(select 'a' c union
select 'b' union
select 'c') 
, SetB as 
(select 'b' c union
select 'c' union
select 'a' union 
select 'd'
) 
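select case (select count(*) from (
-- EXCEPT and UNION share one precedence level and are evaluated left to right,
-- so each EXCEPT is parenthesized to get a true symmetric difference
(select * from SetA except select * from SetB)
union 
(select * from SetB except select * from SetA)
)t)
when 0 then 'Equal' else 'NotEqual' end 'Equality'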
Denis Valeev
Subtree cost of 0.0178567 - a hair better than cmsjr's answer
OMG Ponies
My actual execution plan says the subtree cost is 0.01129. But then again, I just ran the query as a whole, without preparing those sets beforehand.
Denis Valeev
OMG Ponies
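A note on combining EXCEPT and UNION for a symmetric difference: in T-SQL the two operators have the same precedence and are evaluated left to right, so the grouping of the two EXCEPT branches matters. The sketch below is an illustration only, assuming Python's sqlite3 module as a stand-in for SQL Server (SQLite also groups compound SELECTs left to right, but it needs derived tables instead of bare parentheses). The tables SetA and SetB and the helper count_rows are made up for the demo; SetA deliberately holds one extra row.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server here. Both evaluate
# EXCEPT and UNION left to right when no grouping is given.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SetA (c TEXT);
CREATE TABLE SetB (c TEXT);
INSERT INTO SetA VALUES ('a'), ('b'), ('c'), ('d');  -- one extra row
INSERT INTO SetB VALUES ('a'), ('b'), ('c');
""")

def count_rows(sql):
    """Hypothetical helper: number of rows a compound SELECT returns."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

# Ungrouped: read as ((SetA EXCEPT SetB) UNION SetB) EXCEPT SetA,
# which reduces to SetB EXCEPT SetA and misses the extra row in SetA.
ungrouped = count_rows("""
    SELECT c FROM SetA EXCEPT SELECT c FROM SetB
    UNION
    SELECT c FROM SetB EXCEPT SELECT c FROM SetA
""")

# Grouped: each EXCEPT runs inside its own derived table first.
grouped = count_rows("""
    SELECT c FROM (SELECT c FROM SetA EXCEPT SELECT c FROM SetB)
    UNION
    SELECT c FROM (SELECT c FROM SetB EXCEPT SELECT c FROM SetA)
""")

print("ungrouped difference rows:", ungrouped)
print("grouped difference rows:", grouped)
```

With these rows the ungrouped form finds no differences even though the sets differ, while the grouped form reports the one row that exists only in SetA.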
+2  A: 

Could do it with EXCEPT and a case

select 
   case 
     when count (1)=0 
        then 'Elements in TableA and TableB contains identical sets' 
     else 'Nope' end from (
       select item from s1
      EXCEPT 
       select item from s2
) b
Nix
I like this solution. :)
Denis Valeev
s2 could have more items than s1, but this query would claim they are identical.
Peter
Some would argue the sets are still the same; you could then assert the counts if you wanted, or you could do row checksums if you are really bored.
Nix
+1: On SQL Server 2005, this query has a better subtree cost of 0.0000459 vs my version using INTERSECT (0.0000702). I imagine my version's higher cost is due to the COUNT comparisons.
OMG Ponies
This query is incorrect. If s2 has all the rows to match s1 but has additional nonmatching rows, this query will still return that they have identical sets.
Emtucifor
-1 -- I agree with @Peter and @Emtucifor that this gives false positives.
onedaywhen
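The false positive that Peter and Emtucifor describe is easy to reproduce. The sketch below is an illustration, assuming Python's sqlite3 module in place of SQL Server; the helper name differences and the extra row 'e' are invented for the demo. It runs the one-directional EXCEPT and then a variant that checks both directions.

```python
import sqlite3

# One-directional check, as in the answer above: rows in s1 missing from s2.
ONE_WAY = """
SELECT COUNT(*) FROM (SELECT item FROM s1 EXCEPT SELECT item FROM s2)
"""

# Both directions: rows only in s1 plus rows only in s2.
BOTH_WAYS = """
SELECT (SELECT COUNT(*) FROM (SELECT item FROM s1 EXCEPT SELECT item FROM s2))
     + (SELECT COUNT(*) FROM (SELECT item FROM s2 EXCEPT SELECT item FROM s1))
"""

def differences(s1_items, s2_items, sql):
    """Hypothetical helper: load both tables into SQLite, return the count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE s1 (item TEXT)")
    conn.execute("CREATE TABLE s2 (item TEXT)")
    conn.executemany("INSERT INTO s1 VALUES (?)", [(i,) for i in s1_items])
    conn.executemany("INSERT INTO s2 VALUES (?)", [(i,) for i in s2_items])
    count = conn.execute(sql).fetchone()[0]
    conn.close()
    return count

s1 = ["a", "b", "c", "d"]
s2_superset = ["b", "c", "d", "a", "e"]  # everything in s1 plus one extra

one_way_extra = differences(s1, s2_superset, ONE_WAY)
both_ways_extra = differences(s1, s2_superset, BOTH_WAYS)
both_ways_same = differences(s1, ["b", "c", "d", "a"], BOTH_WAYS)

print(one_way_extra, both_ways_extra, both_ways_same)
```

A count of zero from the one-way query only shows that s1 is a subset of s2; the two-way count is zero only when neither table has a row the other lacks.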
+5  A: 

Use:

SELECT CASE 
         WHEN   COUNT(*) = (SELECT COUNT(*) FROM a) 
            AND COUNT(*) = (SELECT COUNT(*) FROM b) THEN 'Elements in TableA and TableB contains identical sets'
         ELSE 'TableA and TableB do NOT contain identical sets'
       END
  FROM (SELECT a.col
          FROM a
        INTERSECT
        SELECT b.col
          FROM b) x 

Test with:

WITH a AS (
  SELECT 'a' AS col
  UNION ALL
  SELECT 'b'
  UNION ALL
  SELECT 'c'
  UNION ALL
  SELECT 'd'),
     b AS (
  SELECT 'b' AS col
  UNION ALL
  SELECT 'c'
  UNION ALL
  SELECT 'd'
  UNION ALL
  SELECT 'a')
SELECT CASE 
         WHEN   COUNT(*) = (SELECT COUNT(*) FROM a) 
            AND COUNT(*) = (SELECT COUNT(*) FROM b) THEN 'yes'
         ELSE 'no'
       END
  FROM (SELECT a.col
          FROM a
        INTERSECT
        SELECT b.col
          FROM b) x 
OMG Ponies
Very interesting info on the subtree costs.
Denis Valeev
That's 2 reads per table + 1 join. Does no one know how to use `FULL JOIN`?
Peter
@Peter: I ran your FULL JOIN option; there's an extremely wide performance margin between your FULL JOIN and using INTERSECT or EXCEPT (on SS2005 anyway). I imagine the slightly higher cost for INTERSECT (vs Nix's EXCEPT version) is due to the counts to ensure the proper message is displayed.
OMG Ponies
@OMG Ponies - so are you saying FULL JOIN is better or worse performance?
Emtucifor
@Emtucifor: The FULL JOIN is worse based on the subtree cost value - see respective comments to each answer for details.
OMG Ponies
+2  A: 

Watch out, I'm gonna use a Cross Join.

Declare @t1 table(val varchar(20))
Declare @t2 table(val varchar(20))


insert into @t1 values ('a')
insert into @t1 values ('b')
insert into @t1 values ('c')
insert into @t1 values ('d')


insert into @t2 values ('c')
insert into @t2 values ('d')
insert into @t2 values ('b')
insert into @t2 values ('a')

select 
    case when 
    count(1) = 
    (((Select count(1) from @t1) 
    + (Select count(1) from @t2)) / 2.0) 
    then 1 else 0 end as SetsMatch  from 
@t1 t1 cross join @t2 t2 
where t1.val = t2.val
cmsjr
OMG Ponies
looks like an inner join to me.
Peter
@cmsjr: [CROSS JOIN is supported on SQL Server 2000](http://msdn.microsoft.com/en-us/library/aa259187%28SQL.80%29.aspx), so I guess you mean about how [INTERSECT and EXCEPT](http://msdn.microsoft.com/en-us/library/ms188055.aspx) are 2005+ functionality? Upgrade =)
OMG Ponies
@Peter I have no idea what you're talking about.
cmsjr
@OMG, I clearly deleted that comment and I don't appreciate you replying to it. My new comment was going to be as follows.
cmsjr
OMG @OMG, where is your sense of fun?
cmsjr
@cmsjr: Sorry, I was being cheeky. I don't see an issue with using the cross join on SS2K, but the performance difference is immense when using 2005+ functionality.
OMG Ponies
I was being cheeky too, no offense taken (or I hope given) at any point. The solution is clearly not performant, it just seemed like a valid if inefficient opportunity to whip out the ol' cross join.
cmsjr
`a cross join b where a.val = b.val` is equivalent to `a inner join b on a.val = b.val`, and will be rewritten as an inner join by the optimizer. Check the execution plans; they will be identical.
Peter
That they are. Same execution plan if you drop the join altogether too. Once again, cross join antics are just a bust...
cmsjr
@cmsjr: Peter's answer is slightly faster, and 2000 compliant - best SS2K candidate so far.
OMG Ponies
Thanks for the heads up, have up-voted in his general direction (and yours)
cmsjr
That isn't actually a cross join as you know, and using the words CROSS JOIN when you don't intend one is unnecessary obfuscation.
Emtucifor
wow, uh thanks for reading all the comments before adding your own.
cmsjr
I always leave a comment when I downvote :)
Emtucifor
Well then how about this angle, since all the variants reduce to the same execution plan, and my actual intent is to compare all values to all other values, doesn't the cross join more explicitly express that intent better than the inner join or the absence of a join operator?
cmsjr
@cmsjr: +1 have some love, man. I like cross joins too.
Peter
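Peter's point that a CROSS JOIN filtered by an equality predicate returns the same rows as an INNER JOIN can be checked on the sample data. The sketch below is only an illustration, assuming Python's sqlite3 module instead of SQL Server, with ordinary tables t1 and t2 standing in for the table variables @t1 and @t2. It compares result rows and says nothing about SQL Server execution plans.

```python
import sqlite3

# Assumption: SQLite tables t1/t2 stand in for the @t1/@t2 table variables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (val TEXT);
CREATE TABLE t2 (val TEXT);
INSERT INTO t1 VALUES ('a'), ('b'), ('c'), ('d');
INSERT INTO t2 VALUES ('c'), ('d'), ('b'), ('a');
""")

# CROSS JOIN with the match condition pushed into WHERE.
cross_rows = sorted(conn.execute(
    "SELECT t1.val, t2.val FROM t1 CROSS JOIN t2 WHERE t1.val = t2.val"
).fetchall())

# INNER JOIN with the same condition in the ON clause.
inner_rows = sorted(conn.execute(
    "SELECT t1.val, t2.val FROM t1 INNER JOIN t2 ON t1.val = t2.val"
).fetchall())

print(cross_rows == inner_rows, len(cross_rows))
```

Both forms yield one matched pair per shared value, which is why the count of matched rows can be compared against the table sizes in the answer above.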
+4  A: 

Something like this, using FULL JOIN:

SELECT
  CASE 
    WHEN EXISTS (
      SELECT * FROM s1 FULL JOIN s2 ON s1.Item = s2.Item
      WHERE s1.Item IS NULL OR s2.Item IS NULL
      )
    THEN 'Elements in tableA and tableB are not equal'
    ELSE 'Elements in tableA and tableB are equal'
  END

This has the virtue of short-circuiting on the first non-match, unlike other solutions that require 2 full scans of each table (once for the COUNT(*), once for the JOIN/INTERSECT).

Estimated cost is significantly less than other solutions.

Peter
My initial approach. I guess it's not fun enough! :)
Denis Valeev
Actual Subtree cost (for me) of 0.0178565 - *barely* lower than Denis's answer, slightly better alternative than cmsjr's answer ... [Consideration for cmsjr's question about SS2000 alternatives](http://msdn.microsoft.com/en-us/library/aa259187%28SQL.80%29.aspx).
OMG Ponies
Go @Peter! +1 for you.
cmsjr
Bah, late to the party, FULL JOIN was going to be my answer, too.
Emtucifor
@Emtucifor: and what a party it was! Must be the I-need-a-distraction time of day.
Peter