I have a table of users containing the following columns:

| User_ID  (int) | Name (varchar) |   Age (int)  |  Experience_Level (int) |

I would like to write an SQL query that outputs the IDs of all people who are not unique in their combination of age and experience level.

My code so far:

SELECT Count(*), User_ID FROM Users 
GROUP BY Age,Experience_Level
HAVING Count(*) > 1

Obviously this is incomplete: it groups the non-unique users, but it does not tell me all of their User_IDs.

Thanks in advance for your help!

+2  A: 

Since you want information about groups of multiple users, how do you want this data returned? As a string containing a comma-separated list of User_ID values?

You didn't tag your question with the brand of SQL database you use.

If you use MySQL or SQLite, you can use the built-in GROUP_CONCAT() function:

SELECT Count(*), GROUP_CONCAT(User_ID) AS User_List FROM Users 
GROUP BY Age,Experience_Level
HAVING Count(*) > 1

By default, GROUP_CONCAT() separates the values with commas. Refer to the manual if you want it formatted another way.
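
As a rough illustration of changing the separator, here is a minimal sketch that runs the query through Python's built-in sqlite3 module. The sample rows are made up for the example and are not from the question. SQLite takes the separator as a second argument to GROUP_CONCAT(); in MySQL the equivalent is written GROUP_CONCAT(User_ID SEPARATOR ' | ').

```python
import sqlite3

# Hypothetical sample rows: (User_ID, Name, Age, Experience_Level)
rows = [
    (1, "A", 33, 1),
    (2, "B", 37, 1),
    (3, "C", 33, 1),
    (4, "D", 35, 2),
    (5, "E", 35, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (User_ID INT, Name VARCHAR(50), Age INT, Experience_Level INT)"
)
conn.executemany("INSERT INTO Users VALUES (?, ?, ?, ?)", rows)

# SQLite syntax: the second argument to GROUP_CONCAT() is the separator.
query = """
    SELECT Age, Experience_Level, COUNT(*), GROUP_CONCAT(User_ID, ' | ') AS User_List
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING COUNT(*) > 1
"""
result = conn.execute(query).fetchall()
for age, level, count, user_list in result:
    print(age, level, count, user_list)
```

Each output row describes one non-unique (Age, Experience_Level) group. The order of the IDs inside User_List is not guaranteed, so split and sort the string if the order matters.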

There are other solutions for other SQL vendors; this question has come up many times on Stack Overflow.

Bill Karwin
OMG Ponies
@OMG Ponies: Assuming `GROUP_CONCAT()` is an aggregate (and it wouldn't make much sense otherwise), there's no reason it wouldn't work on other RDBMSs (assuming it, or something similar, exists).
Adam Robinson
Sorry, I should have specified: I am using MySQL. I had never heard of GROUP_CONCAT, but the query works. I ended up using David B's answer for separation purposes, but I learned something from this too. Thanks!
pws5068
OMG Ponies
+1  A: 

SELECT t.User_ID, t.Age, t.Experience_Level
FROM Users t
INNER JOIN
    (SELECT Age, Experience_Level
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING Count(*) > 1) d
    ON t.Age = d.Age
    AND t.Experience_Level = d.Experience_Level

Test script:

create table Users (
    User_ID int,
    Name varchar(50),
    Age int,
    Experience_Level int
);

insert into Users (User_ID, Name, Age, Experience_Level) values (1, 'A', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (2, 'B', 37, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (3, 'C', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (4, 'D', 35, 2);
insert into Users (User_ID, Name, Age, Experience_Level) values (5, 'E', 33, 1);
insert into Users (User_ID, Name, Age, Experience_Level) values (6, 'F', 35, 2);
insert into Users (User_ID, Name, Age, Experience_Level) values (7, 'G', 18, 1);
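
To check the join query against this test data without a database server, one option is a small harness around Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for the real database here, and the ORDER BY is added purely so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (User_ID INT, Name VARCHAR(50), Age INT, Experience_Level INT)"
)
# Same seven rows as the test script above.
conn.executemany(
    "INSERT INTO Users (User_ID, Name, Age, Experience_Level) VALUES (?, ?, ?, ?)",
    [
        (1, "A", 33, 1),
        (2, "B", 37, 1),
        (3, "C", 33, 1),
        (4, "D", 35, 2),
        (5, "E", 33, 1),
        (6, "F", 35, 2),
        (7, "G", 18, 1),
    ],
)

# Join each user to the (Age, Experience_Level) groups that occur more than once.
duplicates = conn.execute(
    """
    SELECT t.User_ID, t.Age, t.Experience_Level
    FROM Users t
    INNER JOIN
        (SELECT Age, Experience_Level
         FROM Users
         GROUP BY Age, Experience_Level
         HAVING COUNT(*) > 1) d
        ON t.Age = d.Age AND t.Experience_Level = d.Experience_Level
    ORDER BY t.User_ID
    """
).fetchall()
print(duplicates)
# [(1, 33, 1), (3, 33, 1), (4, 35, 2), (5, 33, 1), (6, 35, 2)]
```

Users 2 and 7 are left out because nobody else shares their age and experience level.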
Tahbaza
+1  A: 

In theory, what you want is something like this, but unfortunately SQL Server doesn't allow it:

SELECT * FROM Users 
WHERE (Age, Experience_Level) IN
(
    SELECT Age, Experience_Level
    FROM Users  
    GROUP BY Age,Experience_Level 
    HAVING Count(*) > 1 
)

So instead, you have to settle for joining to a subquery:

SELECT Users.* FROM Users 
INNER JOIN 
(
    SELECT Age, Experience_Level
    FROM Users  
    GROUP BY Age,Experience_Level 
    HAVING Count(*) > 1 
) subq 
    ON Users.Age = subq.Age
    AND Users.Experience_Level = subq.Experience_Level
Mike Mooney
+3  A: 

Here's a negated logic query:

SELECT *
FROM Users
WHERE User_ID NOT IN
(
    -- groups of size 1 contain exactly the unique users
    SELECT MIN(User_ID)
    FROM Users
    GROUP BY Age, Experience_Level
    HAVING COUNT(*) = 1
)
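
A quick way to see what the negated query returns is to run it on the test data from the earlier answer. The sketch below uses Python's built-in sqlite3 module as a stand-in database and spells the column User_ID, as in the question's table definition. It selects only the ID and adds an ORDER BY so the result is easy to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (User_ID INT, Name VARCHAR(50), Age INT, Experience_Level INT)"
)
# Test rows borrowed from the test script in the earlier answer.
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?, ?)",
    [
        (1, "A", 33, 1),
        (2, "B", 37, 1),
        (3, "C", 33, 1),
        (4, "D", 35, 2),
        (5, "E", 33, 1),
        (6, "F", 35, 2),
        (7, "G", 18, 1),
    ],
)

# The subquery lists the users who are alone in their group;
# NOT IN keeps everyone else.
cursor = conn.execute(
    """
    SELECT User_ID
    FROM Users
    WHERE User_ID NOT IN
    (
        SELECT MIN(User_ID)
        FROM Users
        GROUP BY Age, Experience_Level
        HAVING COUNT(*) = 1
    )
    ORDER BY User_ID
    """
)
non_unique_ids = [row[0] for row in cursor]
print(non_unique_ids)
# [1, 3, 4, 5, 6]
```

This yields one row per non-unique user, which is the shape the question asked for.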
David B
Exactly what I was looking for. Many thanks to you, sir.
pws5068