views:

137

answers:

2

Hi,

I'm working on a employee booking application. I've got two different entities Projects and Users that are both assigned a variable number of Skills.

I've got a Skills table with the various skills (columns: id, name) I register the user skills in a table called UserSkills (with two foreign key columns: fk_user and fk_skill) I register the project skills in another table called ProjectSkills (with two foreign key columns: fk_project and fk_skill).

A project can require maybe 6 different skills and users when registering sets up their Skills aswell.

The tricky part is when I have to find users for my Projects based on their skills. I'm only interested in users that meet that have ALL the skills required by the project. Users are ofcause allowed to have more skilled then required.

The following code will not work, (and even if it did, would not be very performance friendly), but it illustrates my idea:

SELECT * FROM Users u WHERE 
    ( SELECT us.fk_skill FROM UserSkills us WHERE us.fk_user = u.id ) 
     >= 
    ( SELECT ps.fk_skill FROM ProjectSkills ps WHERE ps.fk_project = [some_id] )

I'm thinking about making my own function that takes two TABLE-variables, and then working out the comparisson in that (kind of a modified IN-function), but I'd rather find a solution that's more performance friendly.

I'm developing on SQL Server 2008.

I really appreciate any ideas or suggestions on this. Thanks!

+3  A: 
SELECT  *
FROM    Users u
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    ProjectSkill ps
        WHERE   ps.pk_project = @someid
                AND NOT EXISTS
                (
                SELECT  NULL
                FROM    UserSkills us
                WHERE   us.fk_user = u.id
                        AND us.fk_skill = ps.fk_skill
                )
        )
Quassnoi
OMG! You replied with the correct answer after just 2½ minutes! You are my hero! ;) This was my first question on stackoverflow, but surely not my last... Thank you, your help is much appreciated!
Zounds. Is there a website somewhere that spells out and clarifies the hows and whys behind this use of correlated subqueries?
Philip Kelley
`@Philip Kelley`: I'm currently writing a series of articles on `NOT IN` vs `NOT EXISTS` vs `LEFT JOIN / IS NULL` in different `RDBMS`'s.
Quassnoi
This answer worked great, but spawned another question: http://stackoverflow.com/questions/1423440/sql-how-to-keep-track-rows-already-matched-in-a-correlated-subquery Any suggestions are welcome! ;)
A: 
--  Assumes existance of variable @ProjectId, specifying
--  which project to analyze
SELECT us.UserId
 from UserSkills us
  inner join ProjectSkills ps
   on ps.SkillId = us.SkillId
    and ps.ProjectId = @ProjectId
 group by us.UserId
 having count(*) = (select count(*)
                     from ProjectSkills
                     where ProjectId = @ProjectId)

You'd want to test an debug this, as I have no test data to run it through. Ditto for indexing to optimize it.

(Now to post, and see if someone's come up with a better way--there should be something more subtle and effective than this.)

Philip Kelley