views:

77

answers:

4

I have two tables, House and Person. For any row in House, there can be 0, 1 or many corresponding rows in Person. But, of those people, a maximum of one will have a status of "ACTIVE", the others will all have a status of "CANCELLED".

e.g.

SELECT * FROM House LEFT JOIN Person ON House.ID = Person.HouseID

House.ID | Person.ID | Person.Status
       1 |         1 |     CANCELLED
       1 |         2 |     CANCELLED
       1 |         3 |        ACTIVE
       2 |         1 |        ACTIVE
       3 |      NULL |          NULL
       4 |         4 |     CANCELLED

I want to filter out the cancelled rows, and get something like this:

House.ID | Person.ID | Person.Status
       1 |         3 |        ACTIVE
       2 |         1 |        ACTIVE
       3 |      NULL |          NULL
       4 |      NULL |          NULL

I've achieved this with the following sub select:

SELECT *
FROM House
LEFT JOIN 
(
    SELECT *
    FROM Person
    WHERE Person.Status != "CANCELLED"
) Person
ON House.ID = Person.HouseID

...which works, but breaks all the indexes. Is there a better solution that doesn't?

I'm using MySQL and all relevant columns are indexed. EXPLAIN lists nothing in possible_keys.

Thanks.

+4  A: 

How about:

SELECT *
FROM House
LEFT JOIN Person
ON House.ID = Person.HouseID 
AND Person.Status != "CANCELLED"
Chris Pebble
In my example, that would remove the row where House.ID = 4
aidan
The query above should return all four rows and does in my test db. Give it a shot.
Chris Pebble
Oops, sorry, I misread your answer. I think that'll be perfect!
aidan
Yep, I'm a double idiot for not knowing that, and for not reading your answer properly the first time. Thanks for that!
aidan
+1  A: 

Use:

   SELECT * 
     FROM HOUSE h 
LEFT JOIN PERSON p ON p.houseid = h.id
                  AND p.status = 'ACTIVE'
OMG Ponies
@Frank Shearar: Edit to correct inaccuracies, not for your style preference that tables names only have the initial letter being capitalized.
OMG Ponies
+1  A: 

Do you have control of the database structure? If so, I think you could better represent your data by removing the column Status from the Person table and instead adding a column ActivePersonID to the House table. This way you remove all the redundant CANCELLED values from Person and eliminate application or stored procedure code to ensure only one person per household is active.

In addition, you could then represent your query as

 SELECT * FROM House LEFT JOIN Person ON House.ActivePersonID = Person.ID
Larry Lustig
That's a novel approach. I often forget to "think outside the box" like this. Thanks for that. Even so, I think I'll stick with the solutions mentioned above (I need to keep the link between cancelled people and the house).
aidan
You do not lose the link between the cancelled people and the house, since you maintain the columns Person.ID and Person.HouseID. You simply know that the single Person ID specified in House.ActivePersonID is the active one, all others are cancelled. You have a net decrease in database writes, and a net decrease in storage space, but no net loss in data values stored.
Larry Lustig
Oh, I see! That's even better! Thanks Larry.
aidan
A: 

This is in SQL Server, but the logic seems to work, echoing Chris above:

declare @house table
(
    houseid int
)

declare @person table
(
    personid int,
    houseid int,
    personstatus varchar(20)
)

insert into @house (houseid) VALUES (1)
insert into @house (houseid) VALUES (2)
insert into @house (houseid) VALUES (3)
insert into @house (houseid) VALUES (4)

insert into @person (personid, houseid, personstatus) VALUES (1, 1, 'CANCELLED')
insert into @person (personid, houseid, personstatus) VALUES (2, 1, 'CANCELLED')
insert into @person (personid, houseid, personstatus) VALUES (3, 1, 'ACTIVE')
insert into @person (personid, houseid, personstatus) VALUES (1, 2, 'ACTIVE')
insert into @person (personid, houseid, personstatus) VALUES (4, 4, 'CANCELLED')

select * from @house
select * from @person

select *
from @house h LEFT OUTER JOIN @person p ON h.houseid = p.houseid 
    AND p.personstatus <> 'CANCELLED'
edmicman