views:

164

answers:

2

Problem: I have a list of names and addresses. Some names (persons) have the same address (street, zip code, town) like others. I want to select all those names with addresses with no more than three occurrences and from the rest the first three names each of a bunch pointing to the same address. Example:

Albert | Adr1
Berta  | Adr1
Cesar  | Adr1
Donald | Adr1
Eric   | Adr2
Fritz  | Adr2
Gerd   | Adr2
Henry  | Adr3

The result set should be

Albert | Adr1 
Berta  | Adr1
Cesar  | Adr1
Eric   | Adr2
Fritz  | Adr2
Gerd   | Adr2
Henry  | Adr3

Donald is missing because he is the 4th of a group with the same address. Can this result be achieved with UNIONs and subqueries? Something like

select * from addresses where address in 
(select address from addresses group by address having count(address) <= 3)
UNION
select * from addresses where address in 
(select address from addresses group by address having count(address) > 3 limit 3)

I know that this query is wrong because it limits the complete result set of addresses with more than 3 occurences. I wonder if this can be done in a single SELECT with UNIONs and subqueries. I will do it now procedurally with PHP/MySQL, but just for fun would be interested in an SQL only solution.

I had a look at http://stackoverflow.com/questions/2084893, but this does not reflect my situation - or does it?

+2  A: 
select name, address
from
 ( select *, row_number() over (partition by address order by name) as namenum
   from yourTable
 ) t
where namenum <= 3;
Rob Farley
Is this SQL standard?
Adam Matan
This works in SQL Server, and Oracle and DB2 both support the OVER clause as well. I don't think mysql has caught up with this part of the standard yet, just like how SQL Server doesn't require semi-colons yet.
Rob Farley
This is not supported by MySQL. The solution pointed me to http://explainextended.com/2009/09/14/mysql-emulating-row_number-with-multiple-order-by-conditions/, though, where a similar problem is treated and the functionality of Robs solution is emulated.
Alex Monthy
+1  A: 

You can try something like

SELECT  PersonName,
        Address
FROM    (
            SELECT  *,
                    (SELECT COUNT(1) FROM addresses WHERE Address = a.Address AND PersonName < a.PersonName) CountLess
            FROM    addresses a
        ) sub
WHERE   sub.CountLess <= 2
astander
Very ingenious. Thanks a lot.
Alex Monthy