ansaurus

Question

SQL - identifying exact matches across multiple records.

Answer 1

+1 A:

You need a join. This will vary by SQL dialect. Something like:

select child.column1, child.column2 from (
  select column2 as parentsColumn2Value from child where column1='S1'
) as parentsColumn2Table
left join child on parentsColumn2Table.column2=child.column2

larson4 2009-05-18 23:12:15

Answer 2

+1 A:

Solving this for the general case of finding all the matching parents is more fun :) See note at the bottom for how this can be used for the simpler case (one of the parent keys is fixed).

I'll give what I think is a "proper" solution, but I don't have a copy of Access (or SQL Server) to hand to see if it works in there. (Yes, I have tested this against a DB here though...)

SELECT p1.column1, p2.column1
FROM parent p1 JOIN parent p2 ON p1.column1 < p2.column1
WHERE NOT EXISTS (SELECT 1
       FROM (SELECT c1.column1, c1.column2 FROM child c1 WHERE c1.column1 = p1.column1) c1f
            FULL OUTER JOIN
            (SELECT c2.column1, c2.column2 FROM child c2 WHERE c2.column1 = p2.column1) c2f
            ON c1f.column2 = c2f.column2
       WHERE c1f.column1 IS NULL OR c2f.column1 IS NULL
      );

So hopefully you can see how what I said above is tied together in this :) I'll try to explain...

The "outer" (first) select generates combinations of column1 values (p1.column1 and p2.column1). For each of the combinations, we list the rows in "child" for those values (these are c1f and c2f: c1f means "child 1 filtered") and do a FULL OUTER JOIN. Which is a comparatively rare construct, in my experience. We want to match up all the entries in c1f and c2f (using their column1 values), and find any on either side that doesn't have a corresponding entry on the other side. If there are any such non-matchers then they will manifest as rows from the join with a null for their column1 value. So the parent query selects only combinations of column1 values where no such rows in the subquery exist, i.e. every child row for p1's column1 value has a corresponding child row for p2's column1 value and vice versa.

So for instance, for the iteration where p1.column1 is 'S1' and p2.column2 is 'S3', that subquery (without aggregation) would produce:

 c1f__column1 | c1f__column2 | c2f__column1 | c2f__column2
--------------+--------------+--------------+--------------
 S1           | P1           | S3           | P1
 S1           | P2           |              |

and it's those nulls in the second row that flag this combination as not matching. Some twisty thinking involved, it's tempting to get fixated on finding matching combinations, when finding non-matching ones is easier.

As a final bonus, when I created some test tables for this, I made (column1,column2) the primary key of child, which just so happened to be exactly what you need to drive the full outer join of the filtered tables efficiently. Win! (So do note I haven't tried to cope with duplicate combinations in child... but you could just slap "distinct" in the c1f and c2f derivations)

NB based on Matt's comment, if one of your parent values was known (i.e. you just wanted to list all the parent values with the same children as S1) then you can just slap "and p1.column1 = 'S1'" on the end of this. But replace "parent p1 JOIN parent p2 ON p1.column1 < p2.column1" with just "parent p1, parent p2" in that case... remember that otherwise the query as written will only output half of all the possible pairs...

araqnid 2009-05-19 00:27:55

Answer 3

A:

This is an Access, rather than an SQL solution in that it makes use of a User Defined Function (UDF).

SELECT p.Column1, 
       ConcatList("SELECT Column2 FROM c WHERE Column1='S1'","|") AS S1, 
       ConcatList("SELECT Column2 FROM c WHERE Column1='" & [p].[Column1] & "'","|") AS Child, 
       Format([S1]=[Child],"Yes/No") AS [Match]
FROM p;

The UDF

   Function ConcatList(strSQL As String, strDelim, ParamArray NameList() As Variant)
   ''Reference: Microsoft DAO x.x Object Library
   Dim db As Database
   Dim rs As DAO.Recordset
   Dim strList As String

   Set db = CurrentDb

   If strSQL <> "" Then
       Set rs = db.OpenRecordset(strSQL)

       Do While Not rs.EOF
           strList = strList & strDelim & rs.Fields(0)
           rs.MoveNext
       Loop

       strList = Mid(strList, Len(strDelim) + 1)
   Else

       strList = Join(NameList, strDelim)
   End If

   ConcatList = strList

   End Function

Unfortunately, I do not believe Jet supports Full Outer Join, so a solution using only SQL is likely to be a little tedious, and require more information, such as whether S1 has a fixed number of entries in column2.

Remou 2009-05-19 07:44:42

Access supports full outer join if you're using a back end that supports it. ;)

David-W-Fenton 2009-05-20 02:17:55

You can't get me on that, I said Jet this time! :D

Remou 2009-05-20 06:22:22

Answer 4

A:

ACE/Jet may not support FULL OUTER JOIN directly but the workaround is simple enough i.e. just UNION ALL the LEFT JOIN, INNER JOIN and RIGHT JOIN respectively of the tables.

onedaywhen 2009-05-20 08:41:06

Answer 5

A:

Incidentally, i was searching for the same solution and have figured it out myself

select * from TableParent where id in 
  (select temp_parent.id from
    (select TableParent.*, count(exact_match.match_count) as exact_count 
     from TableParent
     inner join 
       (select Column1, count(*) as match_count from TableChild
        group by Column1
        having match_count = 2
       ) as exact_match on exact_match.Column1 = TableParent.Column1
     inner join 
       TableChild on TableChild.event_id = exact_match.Column1 where TableChild.Column2 in (P1,P2)
   group by TableParent.Column1
   having exact_count = 2) as temp_parent)

penger 2009-08-12 05:26:41

ansaurus

tags:

views:

answers:

SQL - identifying exact matches across multiple records.

related questions