ansaurus

Question

Record that is related to all records of other table in SQL

Answer 1

+3 A:

SELECT 
    p.name,
    COUNT(DISTINCT oo.obstacle_id) AS OBSTACLES_COMPLETED
FROM 
    person p

    JOIN person_overcomes_obstacle oo
    ON oo.person_id = p.person_id

GROUP BY 
    p.name

HAVING 
    COUNT(DISTINCT poo.obstacle_id) = (SELECT COUNT(id) FROM obstacle)

Adam Bernier 2010-07-18 21:25:14

+1 for the query. It's worth noting that this assumes a 1 to 1 relationship between person and each unique obstacle. That is to say, a person does not complete an obstacle twice.

Jason McCreary 2010-07-18 21:29:00

Thanks very much for the comment. The `DISTINCT` in line 3 (and also in the last line) will ensure that each completed obstacle is counted exactly once.

Adam Bernier 2010-07-18 21:35:13

This query is good, but it calculates a lot more than is asked for in the question. Alexandros just wanted to find out "if at least one person has overcome all obstacles" -- he didn't say he wanted to know who that person was.

Tom 2010-07-18 21:35:43

Thanks, Tom. I seem to have overshot the goal :-/

Adam Bernier 2010-07-18 21:36:31

+1: Only concern I'd raise is grouping on the name column - could fall victim to "John Smith"'s...

OMG Ponies 2010-07-18 22:36:06

Thanks for the solution Adam. I agonized over whether to accept yours or martin's. Yours performs best, martins is more general. I got tremendous value out of the relational division link so I had to accept that one. If I could accept both I would, please accept an upvote and my apologies. cheers.

Alexandros Marinos 2010-07-26 08:03:24

@Alexandros: cheers, and no need to apologize. Thanks for the information about the performance of the two.

Adam Bernier 2010-07-26 15:29:11

Answer 2

+1 A:

This SELECT should return the number of obstacles that nobody has overcome. If the count is zero then all the obstacles have been overcome by at least one person.

SELECT count(*)
  FROM (SELECT po.person_id, o.obstacle_id
          FROM Obstacle o
          LEFT OUTER JOIN Person_overcomes_obstacle po
            ON (o.obstacle_id = po.obstacle_id)) t
WHERE t.person_id IS NULL

You could also do this, for the same effect (and possibly better performance):

SELECT count(*)
  FROM Obstacle o
 WHERE NOT EXISTS (SELECT 1
                     FROM Person_overcomes_obstacle po
                    WHERE po.obstacle_id = o.obstacle_id)

EDIT: As pointed out in the comments, the above two queries only prove that there are no obstacles that nobody has overcome, and not that a single individual has overcome all obstacles.

That aside, it should still be possible to prove that a single user has overcome all obstacles without querying the Person table:

SELECT t.personid, count(*)
  FROM (SELECT DISTINCT po.person_id, o.obstacle_id
          FROM Obstacle o
          JOIN Person_overcomes_obstacle po
            ON (o.obstacle_id = po.obstacle_id)) t
 GROUP BY t.persion_id
 HAVING count(*) = (SELECT count(*)
                      FROM obstacle)

Tom 2010-07-18 21:31:44

+1 for a far better reading of the question.

Adam Bernier 2010-07-18 21:38:33

if we have 2, and 2 obstacles, and each person completes a different obstacle, this will still return 0, even though there is no one single person that has completed all obstacles.

mdma 2010-07-18 21:57:49

Are you sure these work? I'm not sure you're checking that it is the same person that overcame the obstacles? i.e. if all obstacles were overcome by 10 different people.

Martin Smith 2010-07-18 22:00:58

You guys are absolutely right! My queries are just proving that every obstacle has been overcome -- not that a single person has overcome them all.

Tom 2010-07-18 22:11:40

OMG Ponies 2010-07-18 22:34:39

Answer 3

+3 A:

I notice that I was the only person to use the natural table alias for Person_overcomes_obstacle!

You need a relational division query for this.

You can either count up the obstacles and the matching records in Person_overcomes_obstacle and return ones where the 2 numbers match or look at it another way as finding people for which there is no obstacle that they haven't overcome.

SELECT p.id, p.name /*Or use COUNT(*) or wrap in Exists 
                     if you don't care about ids and names*/
FROM Person p
WHERE NOT EXISTS
   (SELECT * FROM Obstacle o
    WHERE NOT EXISTS 
    (
       SELECT * FROM Person_overcomes_obstacle poo
       WHERE poo.person_id = p.id and o.id = poo.obstacle_id
    )
)

Martin Smith 2010-07-18 21:38:16

This SELECT will work, but you have to query the Person table. If we don't care who has overcome all the obstacles, and only that someone has, then we shouldn't need to access the Person table at all, and can get better performance by doing so.

Tom 2010-07-18 21:52:43

@Tom - But I think there might be an issue with your approach at least as it stands as it doesn't check that it is the *same* person.

Martin Smith 2010-07-18 22:04:56

That'll return an invalid reference error for `p.id` because you can access table aliases one level deep in subquery syntax.

OMG Ponies 2010-07-18 22:31:09

@OMG - Just realised that I hadn't even given it an alias! What RDBMS does that alias rule apply to? Definitely not SQL Server.

Martin Smith 2010-07-18 22:34:50

MySQL and SQL Server require that a table alias are defined for dervied tables/inline views and joining to the same table. My comment overlooked your typo, but the subquery issue remains a show stopper.

OMG Ponies 2010-07-18 22:38:57

@OMG - No SQL Server is definitely not bothered http://odata.stackexchange.com/stackoverflow/q/7611/

Martin Smith 2010-07-18 22:40:05

That's Azure (SQL Server 2008) - this isn't tagged as SQL Server related.

OMG Ponies 2010-07-18 22:43:51

Ah I just read your comment again. It is a 2 level correlated sub query so none of the options you said. Good enough for Celko good enough for me! http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division/

Martin Smith 2010-07-18 22:45:17

...although actually he attributes it to Chris Date in that article.

Martin Smith 2010-07-18 22:55:17

Thanks a lot, this was exactly what I needed.

Alexandros Marinos 2010-07-26 08:01:14

Answer 4

+1 A:

Use:

SELECT poo.person_id
    FROM PERSON_OVERCOMES_OBSTACLE poo
GROUP BY poo.person_id
    HAVING COUNT(DISTINCT poo.obstacle_id) = (SELECT COUNT(*) 
                                                                        FROM OBSTACLE)

This is an alternative that is less likely to perform well:

SELECT x.person_id
   FROM (SELECT poo.person_id,
                         COUNT(DISTINCT poo.obstacle_id) AS obs_overcome,
                         (SELECT COUNT(*) 
                             FROM OBSTACLE) AS obs_total
                 FROM PERSON_OVERCOMES_OBSTALCE poo
          GROUP BY poo.person_id) x
 WHERE x.obs_overcome = x.obs_total

In either case, you can join to the PERSON table to get more information if you want - or you could run a count on person_id from either query to know how many people completed all the obstacles recorded.

OMG Ponies 2010-07-18 22:25:22

ansaurus

tags:

views:

answers:

Record that is related to all records of other table in SQL

related questions