Continuing on my last question...
Let me try to explain my schema. I have three tables we'll call users (with columns id and name), parties (with columns id, partydate, and user_id) and questions (with columns id, createdate, and user_id). My requirement is to show for every user the number of parties within the last year and questions created within the last year.
My query looks like:
SELECT users.id, users.name,
SUM(CASE WHEN (parties.party> NOW() - interval '1 year') THEN 1 ELSE 0 END)
AS numparties,
SUM(CASE WHEN (questions.createdate> NOW() - interval '1 year') THEN 1 ELSE 0 END)
AS numquestions
FROM users
LEFT JOIN parties ON users.id=parties.user_id
LEFT JOIN questions ON users.id=questions.user_id
GROUP BY users.id, users.name;
This works almost 100%. I am getting a result with all users that exist. The problem is, for some users (a very small few) I'm counting either a party or a question twice. For example, if I change the above query to just show parties.id and questions.id instead of summing them as well as remove the GROUP BY, I might get something like:
user.id | user.name | parties.id | questions.id
-----------------------------------------------
0 John 15 2
0 John 15 7
You can see it shows the parties.id twice.
When I was using COUNT() I could rely on DISTINCT but with SUM I'm not sure how I can. I want something like:
SUM(CASE WHEN (parties.party> NOW() - interval '1 year' AND parties.id IS DISTINCT) THEN 1 ELSE 0 END)
AS numparties,
But of course this isn't valid. Can this small problem be corrected easily?