views:

232

answers:

2

Have a table users and there is a field invited_by_id showing user id of the person who invited this user. Need to make a MySQL query returning rows with all the fields from users plus a invites_count field showing how many people were invited by each user. Something like this:

SELECT
    User.*, Count.count
FROM
    users AS User,
    (
        SELECT COUNT(*) AS count FROM users WHERE users.invited_by_id=User.id
    ) AS Count;

This one is not working so I need a working one.

+4  A: 

Ok, first of all, count is a reserved word in sql so you can't use it as a table alias (unless you quote it in some way but don't do that). Secondly, the real way to solve this problem is to introduce a GROUP BY clause in your subquery.

Try this:

SELECT user3.*, subquery.theCount FROM
    users AS user3
INNER JOIN ( 
    SELECT
        user1.id, count(user2.id) AS theCount
    FROM
        users AS user1
    LEFT OUTER JOIN
        users AS user2 ON user2.invited_by_id=user1.id
    GROUP BY user1.id
) AS subquery ON subquery.id=user3.id;

Here is a dirty little secret about MySQL: It lets you cheat with the GROUP BY statement and select columns that are not in the GROUP BY list and also not in aggregate functions. Other RMDMSes don't let you do this.

SELECT
    user1.*, count(user2.id) AS theCount
FROM
    users AS user1
LEFT OUTER JOIN
    users AS user2 ON user2.invited_by_id=user1.id
GROUP BY user1.id;
Asaph
+1, nice answer
Peter Lang
Thank you. Seems like magic to me. I don't quite understand the needs and usage of GROUP BY statement.
vian
+3  A: 
SELECT  u.*,
        (
        SELECT  COUNT(*)
        FROM    users ui
        WHERE   ui.invited_by_id = u.id
        ) AS cnt
FROM    users u    
Quassnoi
+1: Yes, subquery in the column list. That certainly works too. Nice answer.
Asaph
This also handles `ORDER BY` and `LIMIT` much more nicely: won't evaluate the subquery until it really needs it.
Quassnoi
@Quassnoi: But will it evaluate the subquery once for each row? If so, this could lead to performance issues in the cases where you don't use `ORDER BY` and `LIMIT`.
Asaph
@Asaph: Yes, it will evaluate the subquery for each row. But `MySQL` is only capable of doing nested loops, so a `LEFT JOIN` will have to do the same plus the overhead of sorting the `GROUP BY` columns prior to aggregating.
Quassnoi
@Quassnoi: Really? I would think that in the case of a `JOIN` on a subquery, MySQL would run the subquery once, store results in a temp table, and then perform the `JOIN` on the temp table. In the `EXPLAIN`, I would expect to see a `using tmp`. Is this not the case?
Asaph
@Asaph: `MySQL` cannot execute the subquery once, since the subquery is correlated. What do you mean by a "`JOIN` on a subquery"?
Quassnoi
@Quassnoi: An example of "a `JOIN` on a subquery" is the first query in my answer to this question. So in that (not correlated) subquery, I would expect the subquery to be executed once and results to be stored in a temp table.
Asaph
@Asaph: Ah. This is not a subquery, this is a nested query. In you case, the nested query will be executed once indeed. When executing it, `MySQL` will need to make the left joins (using the same nested loops algorithm as for executing the scalar subqueries), then sort this resultset, then do the aggregations, then store the results in a temp table.
Quassnoi
@Quassnoi: Thanks for clarifying the terminology. I'm curious, given your query, how would you handle `ORDER BY cnt`?
Asaph
@Asaph: I'd just append `ORDER BY cnt` to the query. `MySQL` allows this.
Quassnoi
@Quassnoi: Cool. Was this always true in MySQL or is it new?
Asaph
@Asaph: ordering by aliases was supported at least since `4.0`
Quassnoi