views:

74

answers:

3

I have the following query:

select r.people_code_id [People Code ID], r.resident_commuter [Campus6],
c1.udormcom [AG], aR.RESIDENT_COMMUTER [AG Bridge], ar.ACADEMIC_SESSION,
ar.ACADEMIC_TERM, ar.academic_year, ar.revision_date
from RESIDENCY r
left join AG_Common..CONTACT1 c1 on r.PEOPLE_CODE_ID=c1.key4
left join AG_Common..CONTACT2 c2 on c1.ACCOUNTNO=c2.accountno
left join AGPCBridge..ArchiveRESIDENCY aR on r.PEOPLE_CODE_ID=aR.PEOPLE_CODE_ID
where r.ACADEMIC_YEAR='2010'
and r.ACADEMIC_TERM='Fall' 
and SUBSTRING(c1.udormcom,1,1)<>r.resident_commuter
and r.ACADEMIC_SESSION='Und 01'
and aR.ACADEMIC_SESSION='Und 01'
and aR.ACADEMIC_TERM='Fall'
and aR.ACADEMIC_YEAR='2010'
and SUBSTRING(c1.udormcom,1,1)=aR.RESIDENT_COMMUTER

I need to add another clause to the where segment. I have this query:

 select DISTINCT * from RESIDENCY where ACADEMIC_YEAR='2010' and
 ACADEMIC_TERM='Fall' and ACADEMIC_SESSION='Und 01' ORDER BY revision_date DESC

This gets only the latest row for each individual. I want to do something like (pseudo-code):

WHERE r.people_code_id and r.revision_date are in (select DISTINCT * from
RESIDENCY where ACADEMIC_YEAR='2010' and ACADEMIC_TERM='Fall' and
ACADEMIC_SESSION='Und 01' ORDER BY revision_date DESC)

I am running in SQL 2000 Compatibility mode (though it is actually running SQL 2008).

+1  A: 

You could use an EXISTS with a subquery

select 
    r.people_code_id [People Code ID], 
    r.resident_commuter [Campus6],
    udormcom [AG], 
    aR.RESIDENT_COMMUTER [AG Bridge], 
    ar.ACADEMIC_SESSION,
    ar.ACADEMIC_TERM, 
    ar.academic_year, 
    ar.revision_date
from RESIDENCY r
    left join AG_Common..CONTACT1 c1 
        on r.PEOPLE_CODE_ID=c1.key4
    left join AG_Common..CONTACT2 c2 
        on c1.ACCOUNTNO=c2.accountno
    left join AGPCBridge..ArchiveRESIDENCY aR 
        on r.PEOPLE_CODE_ID=aR.PEOPLE_CODE_ID
where r.ACADEMIC_YEAR='2010'
    and r.ACADEMIC_TERM='Fall' 
    and SUBSTRING(udormcom,1,1)<>r.resident_commuter
    and r.ACADEMIC_SESSION='Und 01'
    and aR.ACADEMIC_SESSION='Und 01'
    and aR.ACADEMIC_TERM='Fall'
    and aR.ACADEMIC_YEAR='2010'
    and SUBSTRING(udormcom,1,1)=aR.RESIDENT_COMMUTER
    and EXISTS 
    (
        select 1
    FROM RESIDENCY r2
    where 1=1
        and r2.revision_date  = ar.revision_date  /* note the join here */
            and ACADEMIC_YEAR='2010' 
            and ACADEMIC_TERM='Fall' 
        and ACADEMIC_SESSION='Und 01'

        /* the order by has been removed */
    )
Raj More
Don't I need the order by? For example, I might have three records in residency for the same individual - with the same year, term, and session. The only way to tell which one is the newest is by revision_date - that is why I had the ORDER BY DESC (to order the records from newest to oldest) and why I had the DISTINCT (to only pull the latest).
davemackey
A: 
WHERE r.people_code_id in (select DISTINCT people_code_id
                           from RESIDENCY 
                           where ACADEMIC_YEAR='2010' 
                             and ACADEMIC_TERM='Fall'
                             and ACADEMIC_SESSION='Und 01' 
                             and revision_date = r.revision_date
                           ORDER BY revision_date DESC)

I don't think you need the 'distinct' or the 'order by'. Removing those should improve performance.

Kenneth
I need to match on both people_code_id and the revision_date, thus the ORDER BY DESC and DISTINCT. It ensures that I have only the latest record for each individual included.
davemackey
+2  A: 

I re-wrote your query, based on what you wanted to add:

WITH residency_cte AS (
     SELECT TOP (1)
            r.people_code_id, 
            r.resident_commuter,
            r.academic_year,
            r.academic_term,
            r.academic_session
       FROM RESIDENCY r
      WHERE r.academic_year = '2010'
        AND r.academic_term = 'Fall' 
        AND r.academic_session = 'Und 01'
   ORDER BY revision_date DESC)
   SELECT r.people_code_id, 
          r.resident_commuter [Campus6],
          c1.udormcom [AG], 
          aR.RESIDENT_COMMUTER,
          ar.ACADEMIC_SESSION,
          ar.ACADEMIC_TERM, 
          ar.academic_year, 
          ar.revision_date
     FROM residency_cte r
LEFT JOIN AG_Common..CONTACT1 c1 ON c1.key4 = r.PEOPLE_CODE_ID
                                AND SUBSTRING(c1.udormcom, 1, 1) != r.resident_commuter
LEFT JOIN AG_Common..CONTACT2 c2 ON c2.accountno = c1.ACCOUNTNO
LEFT JOIN AGPCBridge..ArchiveRESIDENCY aR ON aR.PEOPLE_CODE_ID = r.PEOPLE_CODE_ID
                                         AND aR.ACADEMIC_SESSION = r.academic_session
                                         AND aR.ACADEMIC_TERM = r.academic_term
                                         AND aR.ACADEMIC_YEAR = r.academic_year
                                         AND SUBSTRING(c1.udormcom, 1, 1) = aR.RESIDENT_COMMUTER

Only thing is the udormcom column location - once I know what table it's from, I'd move the clause up into the joins. I also updated the joins to the ArchiveRESIDENCY table, so you only need to tweak the dates in one place.

But be aware that using a substring to match on another column will never perform well - until the data model changes to correct that, this will never be truly optimized.

OMG Ponies