views:

827

answers:

4

I have a table that contains a StudyId, a PatientId, and a StudyStartDateTime. I'd like to graph the totals of the Studies and Patients between two dates specified by the user. The problem is with counting distinct values. Here is the query:

SELECT
    s.StudyStartDateTime,
    COUNT(s.StudyId),
    COUNT(s.PatientId)
FROM
    dbo_Study_ViewX211_Rpt AS s
WHERE
    s.StudyStartDateTime>=Forms![StudiesPatientsByDate]!txtStartDate,
    s.StudyStartDateTime<=Forms![StudiesPatientsByDate]!txtEndDate
GROUP BY s.StudyStartDateTime
ORDER BY s.StudyStartDateTime;

This query works almost as it should, except that it counts duplicates of rows with the same StudyId or the same PatientId. I know that Access doesn't support COUNT(DISTINCT...), but I'm having a lot of trouble working around this. Any help would be very appreciated.

+1  A: 

You could try doing it with subqueries for the counts, but correlated sub-queries tend to bite when it comes to performance.

If you are open to doing this in two queries instead of one, these would work:

SELECT
    s.StudyStartDateTime,
    COUNT(s.PatientId)
FROM
    dbo_Study_ViewX211_Rpt AS s
WHERE
    s.StudyStartDateTime>=Forms![StudiesPatientsByDate]!txtStartDate,
    s.StudyStartDateTime<=Forms![StudiesPatientsByDate]!txtEndDate
GROUP BY s.StudyStartDateTime, s.PatientId
ORDER BY s.StudyStartDateTime;


SELECT
    s.StudyStartDateTime,
    COUNT(s.StudyId),
FROM
    dbo_Study_ViewX211_Rpt AS s
WHERE
    s.StudyStartDateTime>=Forms![StudiesPatientsByDate]!txtStartDate,
    s.StudyStartDateTime<=Forms![StudiesPatientsByDate]!txtEndDate
GROUP BY s.StudyStartDateTime, s.StudyId
ORDER BY s.StudyStartDateTime;

Note that I added the counted fields to the GROUP BY expressions in each.

If you want to make it more "compact" you could create a view for each of these queries and join them to a distinct query on StudyStartDateTime to get the results all in one resultset.

JohnFx
But how does this count distinct rows?
Richard
Distinct rows? Your question asks for distinct values in PatiendID and StudyID. Distinct rows is a whole other issue.
JohnFx
+1  A: 

Noticed the dbo_ prefix - is this linked to a SQL Server database?

If so you can use a pass-through query and use the COUNT(DISTINCT ...) syntax as it will passed directly to SQL Server.

DJ
It is, but unfortunately it also needs to work on Access-only copies, as well.
Richard
A: 

Putting this in a separate answer so it can be voted on independently, but this blog article talks about doing this with the subquery approach:

Writing a Count(Distinct) Query in Access

JohnFx
This was the first article I found, as well. So far I've been unable to incorporate this approach. I think the trouble arises from the fact that while he only has one value that has to be distinct, I have two.
Richard
To add to my previous comment, the problem is that since he only has the one column that he needs to be distinct, he doesn't have to worry about joining that two a separate SELECT DISTINCT query.
Richard
The trick is going to be to join in two derrived tables (subqueries) instead of one. One for each count (essentially the two queries I put in my other answer)
JohnFx
A: 

I've taken JohnFx's suggestion, and I've created these two subqueries:

numStudiesByDate:

SELECT
    t.StudyStartDateTime,
    COUNT(s.StudyId) AS numStudies
FROM
    (SELECT DISTINCT
        StudyId
    FROM
        dbo_Study_ViewX211_Rpt
    GROUP BY StudyId) AS s
INNER JOIN
    dbo_Study_ViewX211_Rpt AS t
ON t.StudyId=s.StudyId
WHERE
    t.StudyStartDateTime>=Forms![StudiesPatientsByDate]!txtStartDate,
    t.StudyStartDateTime<=Forms![StudiesPatientsByDate]!txtEndDate
GROUP BY t.StudyStartDateTime
ORDER BY t.StudyStartDateTime;

numPatientsByDate:

SELECT
    t.StudyStartDateTime,
    COUNT(s.PatientId) AS numPatients
FROM
    (SELECT DISTINCT
        PatientId
    FROM
        dbo_Study_ViewX211_Rpt
    GROUP BY PatientId) AS s
INNER JOIN
    dbo_Study_ViewX211_Rpt AS t
ON t.PatientId=s.PatientId
WHERE
    t.StudyStartDateTime>=Forms![StudiesPatientsByDate]!txtStartDate,
    t.StudyStartDateTime<=Forms![StudiesPatientsByDate]!txtEndDate
GROUP BY t.StudyStartDateTime
ORDER BY t.StudyStartDateTime;

And the final query:

numStudiesPatientsByDate:

SELECT
    s.StudyStartDateTime,
    s.numStudies,
    p.numPatients
FROM
    numStudiesByDate AS s
INNER JOIN
    numPatientsByDate AS p
ON
    s.StudyStartDateTime = p.StudyStartDateTime;

Thanks for all the help, and hopefully someone else finds this useful!

Richard
Even though you've altered his suggestion, shouldn't you give him credit for the answer?
David-W-Fenton