views:

389

answers:

5

In Mysql, I want to select the bottom 2 items from each category

Category Value
1        1.3
1        4.8
1        3.7
1        1.6
2        9.5
2        9.9
2        9.2
2        10.3
3        4
3        8
3        16

Giving me:

Category Value
1        1.3
1        1.6
2        9.5
2        9.2
3        4
3        8

Before I migrated from sqlite3 I had to first select a lowest from each category, then excluding anything that joined to that, I had to again select the lowest from each category. Then anything equal to that new lowest or less in a category won. This would also pick more than 2 in case of a tie, which was annoying... It also had a really long runtime.

My ultimate goal is to count the number of times an individual is in one of the lowest 2 of a category (there is also a name field) and this is the one part I don't know how to do. Thanks

+2  A: 

You could try this:

SELECT * FROM (
  SELECT c.*,
        (SELECT COUNT(*)
         FROM user_category c2
         WHERE c2.category = c.category
         AND c2.value < c.value) cnt
  FROM user_category c ) uc
WHERE cnt < 2

It should give you the desired results, but check if performance is ok.

Peter Lang
That doesn't work. It's returning 9.2 and 10.3 for category 2.
Peter Bailey
Sorry to hear that. I tried it and it works for me. Could you please check if your test-data is correct? Thanks!
Peter Lang
Yes, I entered it exactly as it appears above, in the same order and all. The values for category 1 come back correct, (1.3 and 1.6) but for category 2 it's wrong, and for 2 as well (returns 4 and 16). Also, this query doesn't even execute until you give an alias to the first sub-select.
Peter Bailey
I added the alias, thanks. But I can not reproduce the result you describe, and I do not see how that could happen with my query... Do you seen any mistake in the query?
Peter Lang
My fault - this is a problem with my data. I had defined `value` as `varchar`. With a non-string column-type, it works. Apologies.
Peter Bailey
I should've explained Ties better, but this is exactly how I wanted to handle it... Only count them if there is at most 1 person of strictly less value. Also, your method lets me use the other fields in my table, where as some of the other methods are just group bys leaving me one more join to get at the name field. Thanks!
Dan
+1  A: 

A union should work. I'm not sure of the performance compared to Peter's solution.

SELECT smallest.category, MIN(smallest.value)
    FROM categories smallest
GROUP BY smallest.category
UNION
SELECT second_smallest.category, MIN(second_smallest.value)
    FROM categories second_smallest
    WHERE second_smallest.value  > (SELECT MIN(smallest.value) FROM categories smallest WHERE second.category = second_smallest.category)
GROUP BY second_smallest.category
sig11
There's a typo in the where clause of the sub-select, should be "WHERE smallest.category = second_smallest.category".
Ike Walker
Also, this will not give the correct results if there is a tie for the smallest value in a given category.
Ike Walker
To get rid of ties, just add DISTINCT?
Sander Rijken
Good call on the failure with ties. I misread his explanation of how to handle that. Sander: On a tie the query should return two identical rows in the case of this really simple example.
sig11
+4  A: 
SELECT c1.category, c1.value
FROM catvals c1
LEFT OUTER JOIN catvals c2
  ON (c1.category = c2.category AND c1.value > c2.value)
GROUP BY c1.category, c1.value
HAVING COUNT(*) < 2;

Tested on MySQL 5.1.41 with your test data. Output:

+----------+-------+
| category | value |
+----------+-------+
|        1 |  1.30 |
|        1 |  1.60 |
|        2 |  9.20 |
|        2 |  9.50 |
|        3 |  4.00 |
|        3 |  8.00 |
+----------+-------+

(The extra decimal places are because I declared the value column as NUMERIC(9,2).)

Like other solutions, this produces more than 2 rows per category if there are ties. There are ways to construct the join condition to resolve that, but we'd need to use a primary key or unique key in your table, and we'd also have to know how you intend ties to be resolved.

Bill Karwin
this is great! exactly what i was looking for! thank's!
aSeptik
+1  A: 

Here is a very generalized solution, that would work for selecting first n rows for each Category. This will work even if there are duplicates in value.

/* creating temporary variables */
mysql> set @cnt = 0;
mysql> set @trk = 0;

/* query */
mysql> select Category, Value 
       from (select *, 
                @cnt:=if(@trk = Category, @cnt+1, 0) cnt, 
                @trk:=Category 
                from user_categories 
                order by Category, Value ) c1 
       where c1.cnt < 2;

Here is the result.

+----------+-------+
| Category | Value |
+----------+-------+
|        1 |   1.3 |
|        1 |   1.6 |
|        2 |   9.2 |
|        2 |   9.5 |
|        3 |     4 |
|        3 |     8 |
+----------+-------+

This is tested on MySQL 5.0.88 Note that initial value of @trk variable should be not the least value of Category field.

Harun Prasad
+1  A: 

Here's a solution that handles duplicates properly. Table name is 'zzz' and columns are int and float

select
    smallest.category category, min(smallest.value) value
from 
    zzz smallest
group by smallest.category

union

select
    second_smallest.category category, min(second_smallest.value) value
from
    zzz second_smallest
where
    concat(second_smallest.category,'x',second_smallest.value)
    not in ( -- recreate the results from the first half of the union
     select concat(c.category,'x',min(c.value))
     from zzz c
     group by c.category
    )
group by second_smallest.category

order by category

Caveats:

  • If there is only one value for a given category, then only that single entry is returned.
  • If there was a unique recordID for each row you wouldn't need all the concats to simulate a unique key.

Your mileage may vary,

--Mark

Mark Moline