Hi everyone,
I'm looking for a way to select the most occurring value, e.g. the person who posted most for each thread;
SELECT MOST_OCCURRING(user_id) FROM thread_posts GROUP BY thread_id
Is there a good way to do this?
Hi everyone,
I'm looking for a way to select the most occurring value, e.g. the person who posted most for each thread;
SELECT MOST_OCCURRING(user_id) FROM thread_posts GROUP BY thread_id
Is there a good way to do this?
If you want a count on a per thread basis, I think you can use a nested query; grouping by thread first and then by user:
SELECT thread_id AS tid,
(SELECT user_id FROM thread_posts
WHERE thread_id = tid
GROUP BY user_id
ORDER BY COUNT(*) DESC
LIMIT 0,1) AS topUser
FROM thread_posts
GROUP BY thread_id
There's numerous examples if you check the questions under the "greatest n per group" tag. But in this case, you don't define how you want to handle ties - what if two or more users have the same count value?
SELECT DISTINCT
tp.thread_id,
tp.user_id
FROM THREAD_POSTS tp
JOIN (SELECT t.thread_id,
t.user_id,
COUNT(t.user_id) AS occurrence,
CASE
WHEN @thread != t.thread_id THEN @rownum := 1
ELSE @rownum := @rownum + 1
END AS rank,
@thread := t.thread_id
FROM THREAD_POSTS t
JOIN (SELECT @rownum := 0, @thread := -1) r
GROUP BY t.thread_id, t.user_id
ORDER BY t.thread_id, occurrence DESC) x ON x.thread_id = tp.thread_id
AND x.user_id = tp.user_id
AND x.rank = 1
This will tabulate the occurrences of user_id per thread
SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id
But you only wish to select the top user for each thread
SELECT thread_id, user_id, postings
FROM (
SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id
)
HAVING postings = max(postings)
which is equivalent to
SELECT thread_id, user_id, COUNT(*) as postings
FROM thread_posts
GROUP BY thread_id, user_id
HAVING postings = max(postings)
The HAVING keyword is usually used with an aggregation operation to cherry-pick the aggregated output lines that satisfy the conditions in the HAVING clause.
The HAVING clause is different from the the WHERE clause, wherein the HAVING clause filters resultant output of a query. Whereas, the WHERE clause filters on the input data of a query. Since theHAVING clause filters the resultant output of a query, it must appear after the ORDER BY and GROUP BY clauses.