Hi,

Here's another challenge for you SQL-gurus out there...

I asked an earlier question here about matching two result sets (http://stackoverflow.com/questions/1421842/how-to-match-compare-values-in-two-resultsets-in-sql-server-2008). This question is a sequel to that question and its answer, but since it's a different topic I'm creating it as a new question.

Quassnoi suggests the following solution to selecting all the users that meet the skills required by a given project:

SELECT *
FROM   Users u
WHERE  NOT EXISTS
       (
       SELECT NULL
       FROM   ProjectSkill ps
       WHERE  ps.pk_project = @someid
              AND NOT EXISTS
              (
              SELECT NULL
              FROM   UserSkills us
              WHERE  us.fk_user = u.id
                     AND us.fk_skill = ps.fk_skill
              )
       )

This works great. But what if I wanted a single query that lists all the projects on a specific day, each with its best-matching user? I imagine the following format:

projectid        userid
----------       -----------
1                1234
2                5678
3                4321
4                8765

It is important that each user is suggested only once in the list, because the projects are on the same day and users must not be double-booked.

Now, I can imagine some SQL like:

SELECT p.id AS projectid,
       (
       SELECT TOP 1 u.id
       FROM   Users u
       WHERE  NOT EXISTS
              (
              SELECT NULL
              FROM   ProjectSkill ps
              WHERE  ps.pk_project = p.id
                     AND NOT EXISTS
                     (
                     SELECT NULL
                     FROM   UserSkills us
                     WHERE  us.fk_user = u.id
                            AND us.fk_skill = ps.fk_skill
                     )
              )
       ORDER BY rating DESC
       ) AS userid
FROM   Projects p
WHERE  startdate > @beginningofday
       AND startdate < @endofday

But this (obviously) happily suggests the same user again and again, as long as he has the required skills for the projects.

Does anybody have a suggestion on how to keep track of which rows in the Users table have already been matched earlier in the query? Maybe by using a variable? Or is there another smart way around this that I'm missing?

The query must run on SQL Server 2008.

Any help will be greatly appreciated! Thanks.

Regards Alex

A: 

It has been a long time, but I actually think a cursor could help here (I hate cursors; they are usually used wrongly).

Another thing to consider is that a user actually has to be booked for a day. That way a "suggested user" could appear on several projects, since nothing is final yet. I would also suggest more than one user per project, so the person assigning them can pick whoever makes the most sense.

Actually booking a user on a project could also eliminate some confusion in the future: if a user changes his skills, the query results could change.

If you have an assigned user, your use cases get a lot clearer.

Heiko Hatzfeld
A: 

This contains shades of the Traveling Salesman problem -- how to optimally fit N users across X projects (particularly if you are looking for the best/most appropriate user for each project). With large enough values of N and X, the problem can become intractable (i.e. you won't solve it in your lifetime).

This would admittedly be an extreme case. Even so, I could see the requirements ballooning out as you try to add more and more functionality to the process. My point is, the problem you are trying to solve might not reasonably be solvable in T-SQL, let alone a single query. Producing a list of recommended users for each project and letting someone make the final decision might be wise here.
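
As a rough illustration of the "list of recommended users" idea, here is a minimal Python sketch. It assumes the skill rows and ratings have already been read out of the database into plain dictionaries; the names `user_skills`, `user_rating`, `project_skills` and `shortlist` are made up for this example and are not part of the original schema.

```python
# Hypothetical in-memory copies of the UserSkills / ProjectSkill rows
# and of each user's rating (all sample values are invented).
user_skills = {
    1234: {"sql", "python"},
    5678: {"sql"},
    4321: {"sql", "python", "java"},
}
user_rating = {1234: 4.5, 5678: 3.0, 4321: 4.0}
project_skills = {
    1: {"sql"},
    2: {"sql", "python"},
    3: {"java"},
}


def shortlist(project_skills, user_skills, user_rating, limit=3):
    """Return {project_id: [user_id, ...]}, best-rated qualified users first."""
    result = {}
    for project_id, required in project_skills.items():
        # A user qualifies when the required skills are a subset of theirs.
        qualified = [u for u, skills in user_skills.items() if required <= skills]
        qualified.sort(key=lambda u: user_rating[u], reverse=True)
        result[project_id] = qualified[:limit]
    return result


suggestions = shortlist(project_skills, user_skills, user_rating)
for project_id, users in suggestions.items():
    print(project_id, users)
```

The same user may show up on several shortlists; nothing is booked here, so the final choice (and the double-booking check) stays with whoever assigns people.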

Philip Kelley
A: 

Your task is a classic example of the rooks problem.

It cannot be efficiently solved in SQL.

There are some simple algorithms that work well if your workers are likely to have the required skills (i.e. an unskilled worker is a rare exception rather than the rule).
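
The answer does not say which simple algorithms are meant. One possibility, shown here purely as an assumption, is a greedy pass in Python: handle the projects with the fewest qualified users first and give each one the best-rated user who is still free. The function name `greedy_assign` and the sample data are hypothetical.

```python
def greedy_assign(candidates, rating):
    """Greedy sketch: candidates is {project_id: set of qualified user_ids}.

    Returns {project_id: user_id or None}; no user is assigned twice.
    """
    assignment = {}
    used = set()
    # Most constrained projects first, so scarce users are not wasted.
    for project_id in sorted(candidates, key=lambda p: (len(candidates[p]), p)):
        free = [u for u in candidates[project_id] if u not in used]
        # Best-rated free user, or None when nobody is left.
        best = max(free, key=rating.get, default=None)
        assignment[project_id] = best
        if best is not None:
            used.add(best)
    return assignment


# Invented sample data: which users fit which project, and their ratings.
candidates = {1: {1234, 5678, 4321}, 2: {1234, 4321}, 3: {4321}}
rating = {1234: 4.5, 5678: 3.0, 4321: 4.0}

assignment = greedy_assign(candidates, rating)
print(assignment)  # {3: 4321, 2: 1234, 1: 5678}
```

A greedy pass like this can leave a project without a user even when a full assignment exists, so it is only reasonable when most users qualify for most projects.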

However, you would do better to use SQL to retrieve the constraints, i.e. which users fit (or don't fit) which projects, and feed them into a heuristic algorithm.
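
To make the hand-off concrete, here is a small Python sketch that takes (project, user) "fits" rows, as a query might return them, and assigns each user at most once. The algorithm is an assumption of this example, not something named in the answer: it is the standard augmenting-path method for bipartite matching, which covers as many projects as possible but ignores ratings. The function name `max_matching` and the sample rows are hypothetical.

```python
from collections import defaultdict


def max_matching(pairs):
    """pairs: iterable of (project_id, user_id) rows meaning 'user fits project'.

    Returns {project_id: user_id} covering as many projects as possible,
    with every user assigned to at most one project.
    """
    fits = defaultdict(list)
    for project_id, user_id in pairs:
        fits[project_id].append(user_id)

    booked = {}  # user_id -> project_id currently holding that user

    def try_assign(project_id, seen):
        for user_id in fits[project_id]:
            if user_id in seen:
                continue
            seen.add(user_id)
            # Take a free user, or move the current holder to someone else.
            if user_id not in booked or try_assign(booked[user_id], seen):
                booked[user_id] = project_id
                return True
        return False

    for project_id in list(fits):
        try_assign(project_id, set())

    return {project_id: user_id for user_id, project_id in booked.items()}


# Invented rows: user 1234 fits projects 1 and 2, user 5678 only project 1.
rows = [(1, 1234), (1, 5678), (2, 1234), (3, 4321)]
matching = max_matching(rows)
print(matching)
```

In this sample, project 1 first takes user 1234, then gives him up to project 2 and falls back to 5678, which a first-come-first-served query could not do.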

Quassnoi