Simplified table structures, all INT columns and no PKs outside of the identity columns:

Nodes (n) table: id

Attributes (a) table: id, node_id, type_id

Type (t) table: id, priority

I'm trying to select a set of attributes, each of which has the lowest type.priority for its respective node. Though there are multiple attributes per node_id, I only want to select the one with the lowest priority value:

a1 n1 t1 p0 *
a2 n1 t2 p1 
a3 n2 t2 p1 *
a4 n2 t3 p2

This is the basic query that I'm working from, at which point I'm also getting stuck:

   SELECT * 
     FROM a 
LEFT JOIN t ON a.type_id = t.id 
 GROUP BY node_id

My first thought was to use an aggregate, MIN, but I'm then having problems matching up the lowest priority for a node_id with the correct attribute.

+1  A: 

Use a tie-breaker query (not tested):

SELECT      n.*, a.*
FROM        Nodes n
LEFT JOIN   Attributes a
        ON  a.id = (SELECT      x.id -- (would be TOP 1 x.id in SQL Server)
                    FROM        Attributes x
                    INNER JOIN  Type t
                            ON  x.type_id = t.id
                    WHERE       x.node_id = n.id
                    ORDER BY    t.priority ASC,
                                -- just in case there are 2 attributes
                                -- with the same priority, order also on x.id
                                x.id ASC
                    LIMIT 1
                    )
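
Since the query above is marked as not tested, here is a minimal sketch that runs it against the sample rows from the question. It is an assumption-laden check, not a MySQL test: it uses Python's sqlite3 module with an in-memory SQLite database standing in for MySQL, and node 3 (a node with no attributes) is added purely for illustration.

```python
import sqlite3

# Sketch only: in-memory SQLite stands in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Nodes (id INTEGER PRIMARY KEY);
    CREATE TABLE Type (id INTEGER PRIMARY KEY, priority INTEGER);
    CREATE TABLE Attributes (id INTEGER PRIMARY KEY, node_id INTEGER, type_id INTEGER);
    -- Nodes 1 and 2 come from the question; node 3 is hypothetical.
    INSERT INTO Nodes VALUES (1), (2), (3);
    INSERT INTO Type VALUES (1, 0), (2, 1), (3, 2);
    -- a1 n1 t1, a2 n1 t2, a3 n2 t2, a4 n2 t3
    INSERT INTO Attributes VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 3);
""")

# For each node, the correlated subquery picks the single attribute id
# with the lowest priority, using the attribute id as a tie-breaker.
rows = conn.execute("""
    SELECT n.id, a.id
    FROM Nodes n
    LEFT JOIN Attributes a
           ON a.id = (SELECT x.id
                      FROM Attributes x
                      INNER JOIN Type t ON x.type_id = t.id
                      WHERE x.node_id = n.id
                      ORDER BY t.priority ASC, x.id ASC
                      LIMIT 1)
    ORDER BY n.id
""").fetchall()

print(rows)  # (node id, chosen attribute id) pairs
```

Because of the LEFT JOIN, a node without any attributes still appears, paired with NULL (None in Python).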
van
@Van: This is MySQL - shouldn't "`TOP 1`" be "`LIMIT 1`"?
OMG Ponies
@rexem: thanks, changed - hope it is correct
van
+1  A: 

This question is a variation of the "greatest-n-per-group" problem, but you're looking for the least instead of the greatest, and your criteria are in a lookup table (Type) instead of the principal table (Attributes).

So you want the rows (a1) from Attributes such that no other row with the same node_id is associated with a lower priority.

SELECT a1.*
FROM Attributes a1 INNER JOIN Type t1 ON (a1.type_id = t1.id)
LEFT OUTER JOIN
  (Attributes a2 INNER JOIN Type t2 ON (a2.type_id = t2.id))
  ON (a1.node_id = a2.node_id AND t1.priority > t2.priority)
WHERE a2.node_id IS NULL;
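
To see the exclusion join at work on the question's sample rows, here is a rough sketch using Python's sqlite3 module. SQLite is an assumption made for the sake of a self-contained example (the question is about MySQL), and the joined pair (a2, t2) is written as a derived table aliased `o`, a name chosen here, so the statement does not depend on nested-join support.

```python
import sqlite3

# Sketch only: in-memory SQLite stands in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Type (id INTEGER PRIMARY KEY, priority INTEGER);
    CREATE TABLE Attributes (id INTEGER PRIMARY KEY, node_id INTEGER, type_id INTEGER);
    INSERT INTO Type VALUES (1, 0), (2, 1), (3, 2);
    -- a1 n1 t1, a2 n1 t2, a3 n2 t2, a4 n2 t3
    INSERT INTO Attributes VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 3);
""")

# Keep a row from a1 only if no row for the same node has a lower priority:
# the outer join finds "better" rows, and IS NULL keeps rows that have none.
query = """
    SELECT a1.id
    FROM Attributes a1
    INNER JOIN Type t1 ON a1.type_id = t1.id
    LEFT OUTER JOIN (
        SELECT a2.node_id, t2.priority
        FROM Attributes a2
        INNER JOIN Type t2 ON a2.type_id = t2.id
    ) AS o
      ON a1.node_id = o.node_id AND t1.priority > o.priority
    WHERE o.node_id IS NULL
    ORDER BY a1.id
"""
ids = [row[0] for row in conn.execute(query)]

print(ids)  # attribute ids that survive the exclusion join
```

On this data the surviving ids are 1 and 3, which are the two rows starred in the question.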

Note that this can result in ties. You haven't described how you would resolve ties if two Attributes referenced Types with the same priority. In other words, in the following examples, which attributes should be selected?

a1 n1 t1 p0 
a2 n1 t1 p0 
a3 n2 t2 p1 
a4 n2 t3 p1
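
As a rough illustration of the tie case, the sketch below loads these four rows and runs the exclusion join twice: once with the strict priority comparison, and once with an extra condition that prefers the lower attribute id among equal priorities. The id-based tie-breaker is only one possible rule and is an assumption here, as are the use of Python's sqlite3 module (SQLite instead of MySQL) and the derived-table alias `o`.

```python
import sqlite3

# Sketch only: in-memory SQLite stands in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Type (id INTEGER PRIMARY KEY, priority INTEGER);
    CREATE TABLE Attributes (id INTEGER PRIMARY KEY, node_id INTEGER, type_id INTEGER);
    INSERT INTO Type VALUES (1, 0), (2, 1), (3, 1);
    -- a1 n1 t1, a2 n1 t1, a3 n2 t2, a4 n2 t3 (both nodes have tied priorities)
    INSERT INTO Attributes VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2), (4, 2, 3);
""")

# {cond} decides when another row of the same node "beats" the a1 row.
TEMPLATE = """
    SELECT a1.id
    FROM Attributes a1
    INNER JOIN Type t1 ON a1.type_id = t1.id
    LEFT OUTER JOIN (
        SELECT a2.id, a2.node_id, t2.priority
        FROM Attributes a2
        INNER JOIN Type t2 ON a2.type_id = t2.id
    ) AS o
      ON a1.node_id = o.node_id AND ({cond})
    WHERE o.node_id IS NULL
    ORDER BY a1.id
"""

def run(cond):
    """Run the exclusion join with the given 'beats' condition."""
    return [row[0] for row in conn.execute(TEMPLATE.format(cond=cond))]

# Strict comparison only: tied rows never beat each other, so all survive.
with_ties = run("t1.priority > o.priority")

# Hypothetical tie-breaker: among equal priorities, the lower id wins.
tie_broken = run(
    "t1.priority > o.priority"
    " OR (t1.priority = o.priority AND a1.id > o.id)"
)

print(with_ties, tie_broken)
```

With the strict comparison all four attributes come back. With the id tie-breaker only one attribute per node remains.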

PS: I hope you don't mind I added the "greatest-n-per-group" tag to your question. Click that tag to see other questions on SO that I have tagged similarly.

Bill Karwin
Thanks Bill, this was precisely what I was looking for! I figured this particular problem had a name, but I couldn't figure out what it was. Using your example query, I built a query that seems to be selecting exactly what I want (my actual tables are more complex than the example ones that I provided). Your observation about ties is also correct. Fortunately, it isn't a lapse in design, but something I plan to prevent from happening, though I've yet to do so. For now, picking either of the tied rows will suffice.
Stiggler
Okay, great, I'm glad I could help. Just so you know, this solution returns *both* rows in cases where there are ties.
Bill Karwin