tags:

views:

29

answers:

2

Hello,

I'm working on my first assignment using SQL on our class' PostgreSQL server. A sample database has the (partial here) schema:

CREATE TABLE users (
  id int PRIMARY KEY,
  userStatus varchar(100),
  userType varchar(100),
  userName varchar(100),
  email varchar(100),
  age int,
  street varchar(100),
  city varchar(100),
  state varchar(100),
  zip varchar(100),
  CONSTRAINT users_status_fk FOREIGN KEY (userStatus) REFERENCES userStatus(name),
  CONSTRAINT users_types_fk FOREIGN KEY (userType)  REFERENCES userTypes(name)
);

CREATE TABLE events (
  id int primary key,
  title varchar(100),
  edate date,
  etime time,
  location varchar(100),
  user_id int,    -- creator of the event
  CONSTRAINT events_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);

CREATE TABLE polls (
  id int PRIMARY KEY,
  question varchar(100),
  creationDate date,
  user_id int,  --creator of the poll
  CONSTRAINT polls_user_fk FOREIGN KEY (user_id) REFERENCES users(id)
);

and a bunch of sample data (in particular, 127 sample users).

I have to write a query to find the number of polls created by a user within the past year, as well as the number of events created by a user that occurred in the past year. The trick is, I should have rows with 0s for both columns if the user had no such polls/events.

I have a query which seems to return the correct data, but only for 116 of the 127 users, and I cannot understand why the query is trimming these 11 users, when the WHERE clause only checks attributes of the poll/event. Following is my query:

SELECT u.id, u.userStatus, u.userType, u.email, -- Return user details
       COUNT(DISTINCT e.id) AS NumEvents, -- Count number of events
       COUNT(DISTINCT p.id) AS NumPolls -- Count number of polls
FROM (users AS u LEFT JOIN events AS e ON u.id = e.user_id) LEFT JOIN polls AS p ON u.id = p.user_id
WHERE (p.creationDate IS NULL OR ((now() - p.creationDate) < INTERVAL '1' YEAR) OR -- Only get polls created within last year
      e.edate IS NULL OR ((now() - e.edate) < INTERVAL '1' YEAR)) -- Only get events that happened during last year
GROUP BY u.id, u.userStatus, u.userType, u.email;

Any help would be much appreciated.

A: 

Try to avoid using DISTINCT with sub-queries for example.

pukipuki
Removing the DISTINCT keyword from the COUNT aggregate did not change anything, it seems (except that the output was sorted by id, and now is unsorted).
Ganon11
A: 

Using a different query seemed to work. Here's what I ended up with:

SELECT u.id, u.userStatus, u.userType, u.email, COUNT(DISTINCT e.id) AS numevents, COUNT(DISTINCT p.id) AS numpolls
FROM users AS u LEFT OUTER JOIN (SELECT * FROM events WHERE ((now() - edate) < INTERVAL '1' YEAR)) AS e ON u.id = e.user_id
     LEFT OUTER JOIN (SELECT * FROM polls WHERE ((now() - creationDate) < INTERVAL '1' YEAR)) AS p ON u.id = p.user_id
GROUP BY u.id, u.userStatus, u.userType, u.email
;
Ganon11