tags:

views:

88

answers:

4

I have a search form used to query service provisions, which I will call 'provisions'. Provisions may have certain eligibility requirements in a has and belongs to many relationship, but they may also have no requirements at all. My issue is that when I exclude certain provisions on the basis of a particular requirement, those provisions that have no requirements at all are also excluded.

Table structure:

provisions
* id
* title

requirements
* id
* title

provisions_requirements
* provision_id
* requirement_id

Suppose a requirement is like:

Canadian Citizenship (id 4)

This would be presented on the form like: "Exclude services that have the requirement of: Canadian Citizenship".

The exclusionary portion of the query would thus be:

requirements.id NOT IN (4)

However, I then only get provisions that have at least one requirement, because those provisions that have no requirements are not included in the join.

The actual query is like:

SELECT DISTINCT `provisions`.id 
FROM `provisions` 
LEFT OUTER JOIN `provisions_requirements` 
  ON `provisions_requirements`.provision_id = `provisions`.id 
LEFT OUTER JOIN `requirements` 
  ON `requirements`.id = `provisions_requirements`.requirement_id 
WHERE (requirements.id NOT IN ('1')) 
LIMIT 0, 10
A: 

try this:

SELECT DISTINCT `provisions`.id 
FROM `provisions` 
LEFT OUTER JOIN `provisions_requirements` ON `provisions_requirements`.provision_id = `provisions`.id 
LEFT OUTER JOIN `requirements` ON `requirements`.id = `provisions_requirements`.requirement_id and (requirements.id NOT IN ('1'))
LIMIT 0, 10
KM
-1 This will find all requirements. The question is to find the ones that do not require Canadian citizenship.
Andomar
@Andomar, this will NOT find all requirements, only the requirements where requirements.id NOT IN ('1'), what ever that means. This is right from his example query, he talks about id 4 in the text, but his query example is for ID 1, I used that
KM
If the provision has both requirement 1 and 4, it should not be in the result set I think.
Andomar
@Andomar, see my previous comment to you, he asks for "1" in his example query, but "4" in his text. My answer uses "NOT IN (1)", if they really wanted both, then code it "NOT IN (1,4)", big deal
KM
+3  A: 

So you want all services except those that have the requirement of Canadian Citizenship (req_id 4)?

Try this: it tries to match against the requirement_id of 4, but the condition in the WHERE clause makes it so only provisions that found no such match satisfy the condition.

SELECT p.id
FROM provisions p LEFT OUTER JOIN provisions_requirements r
 ON (p.id = r.provision_id AND r.requirement_id IN (4))
WHERE r.requirement_id IS NULL
LIMIT 0, 10;

No DISTINCT is necessary. No join to the requirements table is necessary.

Bill Karwin
+1 from me - I tend to favour this sort of clause on the join condition, rather than a provision such as 'WHERE x = y OR x IS NULL'
BrynJ
+1 though the formatting is icky (must restrain use of edit power)
Andomar
@Andomar: Thanks for the +1. FWIW, this isn't the formatting I'd use in actual code. And I support you using your own preferred code formatting standards. To each their own!
Bill Karwin
Brilliant, elegant solution, I love it (and it works, I implemented it).My final query is not as easy to read as yours, but that's because these queries are built with ActiveRecord - which insists on SELECT DISTINCT, etc.Thanks very much for the solution.
adriandz
@adriandz: I'm curious which ActiveRecord implementation you're using. Is this Ruby on Rails? Why does it insist on DISTINCT? Sounds like a good way to sort in temporary tables unnecessarily, which kills performance.
Bill Karwin
@BillKarwin yes, Ruby on Rails, 2.3.2 (latest version). Curiously enough, when I just reviewed the query running on my Macbook, it lacked the DISTINCT, and also queried provisions.* instead of provisions.id. It could just be different behaviour depending on the version of MySQL, or it could even be because I am also running the query through the will_paginate plugin. I have no clue.
adriandz
+1  A: 
SELECT
     P.id
FROM
     Provisions P
WHERE
     NOT EXISTS
     (
          SELECT
               *
          FROM
               Provision_Requirements PR
          WHERE
               PR.provision_id = P.id AND
               PR.requirement_id = 4
     )
Tom H.
Why is R joined on PR? R appears unused, or am I missing something?
Andomar
Heh... whoops... I originally wrote it with a join to Requirements until I realized that he was using the ID, which is in the many-to-many table anyway. I'll fix it
Tom H.
A: 

your query in 99% there, just also include a test for NULL

SELECT DISTINCT `provisions`.id 
FROM `provisions` 
LEFT OUTER JOIN `provisions_requirements` 
  ON `provisions_requirements`.provision_id = `provisions`.id 
LEFT OUTER JOIN `requirements` 
  ON `requirements`.id = `provisions_requirements`.requirement_id 
WHERE (requirements.id NOT IN ('1')) 
or requirements.id is NULL -- test for null
LIMIT 0, 10

this would include those rows with no entry in the provisions_requirements table

Patrick Taylor
-1 A provision with requirements '1' and '4' will match your query, though it shouldn't
Andomar