views:

1265

answers:

3

I have two tables in an MS SQL Server 2005 database, parent and child, where the parent may be related to many child records. [Child.parent_id] is related to [parent.id]. The child table also has column [foo] I need to bring back all records in the parent table where [child.foo] matches on each of one to many parameters. For example I would like all parent records that have a [child.foo] value of 'fizz' and a [child.foo] value of 'buzz.' I have tried the below query, but it is returning records that match on only one.

SELECT     Parent.ID
FROM         Parent 
INNER JOIN Child ON Parent.ID = Child.parent_id
WHERE     (Child.foo = 'fizz')
UNION ALL
SELECT     Parent_1.ID
FROM         Parent AS Parent_1 
INNER JOIN Child AS Child_1 ON Parent_1.ID = Child_1.parent_id
WHERE     (Child_1.foo = 'buzz')
+3  A: 

I believe you want something like this.

Select 
    Parent.Id
From Parent
    inner join Child on Parent.Id = child.parent_id
Where 
    Child.foo = 'fizz' or Child.foo = 'buzz'
Group By
    Parent.Id
Having
    count(distinct Child.foo) > 1

Here's the test script:

Create Table #parent ( id int )
Create Table #child ( parent_id int, foo varchar(32) )

insert into #parent (id) values (1)
insert into #parent (id) values (2)
insert into #parent (id) values (3)

insert into #child (parent_id, foo) values (1, 'buzz')
insert into #child (parent_id, foo) values (2, 'buzz')
insert into #child (parent_id, foo) values (3, 'buzz')
insert into #child (parent_id, foo) values (1, 'fizz')


Select 
    #parent.Id
From #parent
    inner join #child on #parent.id = #child.parent_id
Where 
    #child.foo = 'fizz' or #child.foo = 'buzz'
Group By
    #parent.Id
Having
    count(distinct #child.foo) > 1        

drop table #parent
drop table #child

Returns only Id 1.

Joel Potter
This query would fail if a given Parent had, for example, two (or more) children with a foo value of 'fizz'
mjv
@mjv: You are correct. Replace the `count(*)` with `count(distinct Child.foo)` to correct that.
Joel Potter
@Joel, right on. I edited my answer to point to your -now fixed- solution, as I find it more generic and more [automatically] optimizable. The query plan however on this type of query would greatly depend on specific SQL implementation and specific statistical profile of the data.
mjv
@mjv: I'm uncertain about the performance of grouping vs. joining. If you have benchmarks that would be great to see. You're probably right about the specific RDBMS being a factor though.
Joel Potter
A: 

This should get the results you want:

SELECT p.ID
FROM Parent p
WHERE EXISTS
(
    SELECT 1 FROM Child c WHERE c.parent_id = p.ID AND c.foo IN ('fizz','buzz')
)
Daniel Pratt
Although the question is ambigous, I think the OP would like the Parents which have both a 'fizz' `AND` a 'buzz' child. The query above will return parents that have either a 'fizz' `OR` a 'buzz' child.
mjv
Ah, you're probably right. As such, I'm up-voting your answer.
Daniel Pratt
+4  A: 

This will return all Parent records which have [at least] one child with a 'fizz' foo AND [at least] one child with 'buzz' foo. Which is what I think is required in the question.

Also, while being potentially sub-optimal, this query is generic in a sense that it will work with most SQL implementation, not only the more modern ones, where the support of CTE, subqueries and related constructs are supported.

   SELECT DISTINCT Parent.ID
    FROM Parent
    JOIN Child C1 ON Parent.ID = C1.parent_Id
    JOIN Child C2 ON Parent.ID = C2.parent_id
    WHERE C1.foo = 'fizz'
      AND C2.foo = 'buzz'

Edit:
Now that Joel Potter has fixed the query in his answer, we probably agree that his approach has several advantages over the query listed above (please give him a few +reps). In particular:

  • the structure of the query doesn't change when we add, or remove, targeted values for the column foo.
  • the query is probably more easily optimized [by the server itself]
  • the structure of the query allows it to handle variations on the definition of the filter. For example we can query all Parents which have children such that say 2 of 5 possible values of foo are present.

Following is Joel's query here, slightly modified, to show how it can be expanded for more than 2 foo values.

SELECT Parent.Id
FROM Parent
INNER JOIN Child on Parent.Id = child.parent_id
WHERE Child.foo IN ('fizz', 'buzz')  -- or say, ... IN ('fizz', 'buzz', 'bang', 'dong')
GROUP BY Parent.Id
HAVING COUNT(DISTINCT Child.foo) = 2  -- or 4 ...  
mjv