views:

29

answers:

1

Currently I have a query that returns parent data for parent records that have a subset of child table records equaling certain values. However, I want to narrow it to only return those parent records with children having certain values, but where those are the only child records belonging to given parent or where the number of child records don't exceed a given total.

Here's an example query that only gets me half way to where I need to be:

SELECT parent.item1, parent.item2, parent.index
FROM parent 
INNER JOIN child on parent.index = child.index 
WHERE child.value IN (11111111, 33333333)
GROUP BY parent.item1, parent.item2, parent.index
HAVING COUNT(child.value) = 2

Unfortunately this query returns ANY parent's data that has a subset of the identified values included with the "IN" statement. I only want the parent's data for parent records whose total child records don't exceed a certain number (or, in my case, don't exceed the number of values in the "IN" statement. Is there an easy way to accomplish this?

+1  A: 

The query as you asked for it is:

SELECT  parent.item1, parent.item2, parent.index
FROM    parent
INNER JOIN
        child
ON      child.index = parent.index
GROUP BY
        parent.item1, parent.item2, parent.index
HAVING  SUM(CASE WHEN child.value IN (1111111, 2222222) THEN 1 ELSE 0 END) = 2
        AND COUNT(*) <= 2

If you just want to ensure that children match all of the values in the IN list and none not in the list, use this (in SQL Server)

SELECT  *
FROM    parent p
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    (
                SELECT  value
                FROM    child c
                WHERE   c.index = p.index
                ) c
        FULL JOIN
                (
                SELECT  11111111 AS value
                UNION ALL
                SELECT  22222222
                ) q
        ON      c.value = q.value
        WHERE   c.value IS NULL OR q.value IS NULL
        )
Quassnoi
Wow... fast response - thx! In this case the parent should only have 2 child records. However, with this solution, I'm still getting data from a parent with 3 child records (having a subset, not total set of child records with values identified).
Don
@Don: in the original response, `4` was just a "certain number" to limit the children (`3` matches this). Try now.
Quassnoi
I found the first solution works with the HAVING clause coded as follows: HAVING SUM(CASE WHEN child.value IN (11111111, 33333333) THEN 1 ELSE 0 END) = 2 AND COUNT(*) = 2 -- this filters any parent out that has more than 2 child records
Don
Can I take this a step further and display the child.value in the final results with the noted parent data?
Don
@Don: which `child.value` of the two children matching? And why bothering the DB with them once they are guaranteed to be `1111111` and `3333333`?
Quassnoi
I'm using the first solution, with the change I noted in my comments for the HAVING clause. The second solution doesn't work (not sure if it is limitations of SQL 2000). I guess it isn't absolutely necessary to display the child values with the returned parent data. If not, that's okay. I do appreciate your help.
Don
@Don: it should work on `2000`, what error does it output? As for the child values: the query is designed so that it always returns two children, one with value `1111111` and another one with value `3333333`. These values are known beforehand. How do you want them to be returned? Please post the sample data and the dataset you'd like to get as an output.
Quassnoi
I get these errors: Server: Msg 156, Level 15, State 1, Line 8Incorrect syntax near the keyword 'FULL'.Server: Msg 170, Level 15, State 1, Line 13Line 13: Incorrect syntax near 'q'.
Don
@Don: try now .
Quassnoi
I can't post the data as it is proprietary and the parent and child table names and field names were represented by aliases in my post for the same proprietary reasons.However, the data I want back would be in the following columns and order:parent.item1, parent.item2, parent.index, [child.value(s)... maybe?]The first three columns of data are required with the fourth optional if even feasible.
Don
Example data would be parent.item1 = John, parent.item2 = cat, parent.index = 1 with Child Record #1 = child.index = 1, child.value = 11111111 and Child Record #3 = child.index = 1, child.value = 33333333.Example John has two pet cats -- the id for cat #1 = 11111111 and id for cat #2 = 22222222
Don
Your fix for the second solution runs without issues, but runs slower than the first solution... not sure why.
Don
Currently as the data tables are prototypes there are not other than the indices establishing the parent to child relationship there are not other keys or indices built that might enhance the speed of Solution #2 posted above. Thanks!
Don