tags:

views:

146

answers:

6

I have 2 related tables:

messages
--------

mid subject
--- -----------------
1   Hello world
2   Bye world
3   The third message
4   Last one


properties
----------

pid mid name             value
--- --- ---------------- ----------- 
1   1   read             false
2   1   importance       high
3   2   read             false
4   2   importance       low
5   3   read             true
6   3   importance       low
7   4   read             false
8   4   importance       high

And I need to get from messages using the criteria on the properties table. Eg: if I have a criteria like return unread (read=false) high prio (importance=high) messages it should return

mid subject
--- -----------------
1   Hello world
4   Last one

How could I get this with a SELECT clause (MySQL dialect)?

+5  A: 

I believe the query below will work.
UPDATE: @Gratzy is right, this query won't work, take a look at the structure changes I suggested.

SELECT DISTINCT m.id as mid, m.subject
FROM message as m
INNER JOIN properties as p
ON m.mid = p.mid
where (p.name = 'read' and p.value = 'false') or (p.name = 'importance' AND p.value = 'high')

The structure of your properties table seems a little off to me though...

Would it be possible to structure the table like this:

messages
--------

mid subject           Read      Importance
--- ----------------- --------- ------------
1   Hello world       false     3
2   Bye world         false     1
3   The third message true      1
4   Last one          false     3

importance
----------

iid importanceName
--- --------------
1   low
2   medium
3   high

and use this query:

SELECT m.id as mid, m.subject
FROM message as m
where m.read = false AND m.importance = 3
Abe Miessler
Added DISTINCT, or you'll get duplicates. Also, you can move some of the WHERE stuff up into the JOIN
OMG Ponies
This would also get you MID 2 (read false but imporance low) which I don't think he wants
Gratzy
A: 

Clearly, you are using an EAV (Entity-Attribute-Value) schema. One of the many reasons for avoiding such a structure is that it makes queries more difficult. However, for the example you gave, you could do something like:

Select ...
From messages As M
Where Exists    (
                Select 1
                From Properties As P1
                Where P1.mid = M.mid
                    And P1.name = 'unread' And P1.value = 'false'
                )
    And Exists  (
                Select 1
                From Properties As P2
                Where P2.mid = M.mid
                    And P2.name = 'importance' And P2.value = 'high'
                )

A more succinct solution would be:

Select ...
From messages As M
Where Exists    (
                Select 1
                From Properties As P1
                Where P1.mid = M.mid
                    And ((P1.name = 'unread' And P1.value = 'false')
                            Or (P1.name = 'importance' And P1.value = 'high'))
                Having Count(*) = 2
                )
Thomas
It also can kill performance. EAV is to be avoided as much as poosible.
HLGEM
This query returns mid 2, which is not wanted.
Bill Karwin
@Bill Karwin - Misread the OP request which did not specify whether he want (A and B) or (A or B). However, the sample results do make this clear and I've adjusted my solution.
Thomas
+5  A: 

In SQL, any expression in a WHERE clause can only reference one row at a time. So you need some way of getting multiple rows from your properties table onto one row of result. You do this with self-joins:

SELECT ...
FROM messages AS m
JOIN properties AS pRead 
    ON m.mid = pRead.mid AND pRead.name = 'read'
JOIN properties AS pImportance 
    ON m.mid = pImportance.mid AND pImportance.name = 'importance'
WHERE pRead.value = 'false' AND pImportance.value = 'high';

This shows how awkward it is to use the EAV antipattern. Compare with using conventional attributes, where one attribute belongs in one column:

SELECT ...
FROM messages AS m
WHERE m.read = 'false' AND m.importance = 'high';

By the way, both answers from @Abe Miessler and @Thomas match more mid's than you want. They match all mid's where read=false OR where importance=high. You need to combine these properties with the equivalent of AND.

Bill Karwin
I managed to make it work with: SELECT DISTINCT m.mid, m.subject FROM messages AS m INNER JOIN properties AS p1 ON m.mid = p1.mid AND p1.value = 'importance' AND p1.value = 'high' INNER JOIN properties AS p2 ON m.mid = p2.mid AND p2.name = 'read' AND p2.value = 'false'And thank you all for the information on EAV.
josuegomes
A: 
Select m.mid, m.subject
from properties p 
inner join properties p1 on p.mid = p1.mid
inner join messages m on p.mid = m.mid
where
p.name = 'read' 
and p.value = 'false'
and p1.name = 'importance'
and p2.value = 'high'

I prefer to put my filter criteria in the where clause and leave my join's to elements that are in both tables and are the actual criteria for the join.

Gratzy
A: 

Another way might be (untested) to use a derived table to hold the criteria that all messages must meet then use the standard relational division technique of double NOT EXISTS

SELECT mid,
       subject
FROM   messages m
WHERE  NOT EXISTS
       ( SELECT *
       FROM    ( SELECT 'read' AS name,
                       'false' AS value

               UNION ALL

               SELECT 'importance' AS name,
                      'high'       AS value
               )
               c
       WHERE   NOT EXISTS
               (SELECT *
               FROM    properties P
               WHERE   p.mid  = m.mid
               AND     p.name =c.name
               AND     p.value=c.value
               )
       )
Martin Smith
A: 

If you want to keep your existing data model, then go with Bill Karwin's first suggestion. Run it with this select clause to understand what it's doing:

select m.*, r.value as read, i.value as importance
from message m
join properties r
    on r.mid = m.mid and r.name = 'read'
join properties i
    on i.mid = m.mid and i.name = 'importance'
where r.value = 'false' and i.value = 'high';

But if you go this way, there are a few constraints you should put in place to avoid storing and retrieving bad data:

  1. A unique index on message(mid) and a unique index on properties(pid), both of which I'm sure you have already.
  2. A unique index on properties(mid, name) so that each property can only be defined once for a message -- otherwise you may get duplicate results from your query. This will also help your query performance by allowing an index access for both joins.
Mike M. Lin