I have the following query. Strangely, it returns multiple rows for the same individual, when it should return just one row per individual. Every join is a LEFT JOIN off CONTACT1 C, which has only one row per individual, unlike the other tables, which sometimes have multiple rows for the same individual.

    select
        C.ACCOUNTNO as 'AdmitGold Account',
        C2.UNAMEFIRST as 'First Name',
        C2.UNAMELAST as 'Last Name',
        C.KEY1 as 'Status',
        C.KEY4 as 'People ID',
        C.KEY3 as 'Type',
        C.KEY5 as 'Counselor',
        C.CITY as 'City',
        C.STATE as 'State',
        C.SOURCE as 'Source',
        C.DEPARTMENT as 'Major',
        C2.UGENDER as 'Gender',
        C2.UETHNICBG as 'Ethnicity',
        C2.UFULLPART as 'Full/Part',
        SLF_CLG_CS.EXT as 'College - GPA',
        OFF_CLG_CS.EXT as 'College - GPA Official',
        HS_OFF_CS.LINKACCT as 'HS GPA - Official',
        OFF_SAT_COMP.LINKACCT as 'SAT - Verbal',
        OFF_SAT_COMP.COUNTRY as 'SAT - Math',
        (Cast(OFF_SAT_COMP.LINKACCT as float) + Cast(OFF_SAT_COMP.COUNTRY as float)) as 'SAT - Composite',
        OFF_SAT_COMP.EXT as 'SAT - Essay',
        OFF_ACT_COMP.LINKACCT as 'ACT - English',
        OFF_ACT_COMP.COUNTRY as 'ACT - Math',
        OFF_ACT_COMP.ZIP as 'ACT - Reading',
        OFF_ACT_COMP.EXT as 'ACT - ScRe',
        (Cast(OFF_ACT_COMP.LINKACCT as float) + Cast(OFF_ACT_COMP.COUNTRY as float) + Cast(OFF_ACT_COMP.ZIP as float) + Cast(OFF_ACT_COMP.EXT as float)) as 'ACT - Official'
    from contact1 C
    left join CONTACT2 C2 on C.ACCOUNTNO=C2.ACCOUNTNO
    left join CONTSUPP HS_OFF_CS on C.ACCOUNTNO=HS_OFF_CS.ACCOUNTNO
        AND HS_OFF_CS.STATE='O' AND HS_OFF_CS.CONTACT='High School'
    left join CONTSUPP SLF_CLG_CS on C.ACCOUNTNO=SLF_CLG_CS.ACCOUNTNO
        AND SLF_CLG_CS.CONTACT = 'Transfer College' AND SLF_CLG_CS.STATE='S'
    left join CONTSUPP OFF_CLG_CS on C.ACCOUNTNO=OFF_CLG_CS.ACCOUNTNO
        AND OFF_CLG_CS.CONTACT = 'Transfer College' AND OFF_CLG_CS.STATE='O'
    left join CONTSUPP OFF_SAT_COMP on C.ACCOUNTNO=OFF_SAT_COMP.ACCOUNTNO
        AND OFF_SAT_COMP.CONTACT='Test/SAT' AND OFF_SAT_COMP.ZIP='O'
    left join CONTSUPP OFF_ACT_COMP on C.ACCOUNTNO=OFF_ACT_COMP.ACCOUNTNO
        AND OFF_ACT_COMP.CONTACT='Test/ACT' AND OFF_ACT_COMP.STATE='O'
    where
        C.KEY1!='00PRSP'
        AND C.U_KEY2='2010 FALL'

A: 

A left join produces multiple rows wherever the relationship is one-to-many. Regardless of how many rows are in your first table, if you left join to a table that holds several matching rows for a single record in the first table, you get one output row per match. SELECT DISTINCT removes rows only when they are identical in every column; it will not eliminate 'duplicates' that have a different value in any column.
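
To see which join is fanning out, one option is to count the matching CONTSUPP rows per account for a single join condition at a time. The sketch below uses the 'Transfer College' / 'O' filter from the question purely as an example; it assumes SQL Server syntax and would need to be repeated with the filters of the other joins (and for CONTACT2).

```sql
-- Count CONTSUPP rows per account for ONE join condition at a time.
-- Every account returned here matches more than once, so each of its
-- matches becomes a separate row in the result of the left join.
SELECT CS.ACCOUNTNO,
       COUNT(*) AS MatchingRows
FROM CONTSUPP CS
WHERE CS.CONTACT = 'Transfer College'
  AND CS.STATE = 'O'
GROUP BY CS.ACCOUNTNO
HAVING COUNT(*) > 1
ORDER BY MatchingRows DESC;
```

If this returns no rows, that particular join is not the source of the extra rows and the next join condition is worth checking.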

g.d.d.c
Thanks. That was helpful...but doing a Select Distinct doesn't resolve the issue. Is there a way for me to make it not do this duplication? I'm pulling the exact info. from the other rows I want...
davemackey
@davemackey - If Select Distinct does not remove your duplicates then they're not technically duplicates. If you look through your result set there are bound to be rows that differ in one column or another. If you're selecting the identity columns from any of your auxiliary tables that would be enough. Otherwise, it indicates that in some of your auxiliary tables' columns there are different data values.
g.d.d.c
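
If the goal is exactly one row per individual even when an auxiliary table holds several matches, one possible approach (not suggested in the thread itself) is to pick a single matching row per account with OUTER APPLY and TOP 1. This is only a sketch for one of the joins, assuming SQL Server 2005 or later; the RECID column used to decide which row wins is an assumption and should be replaced with whatever column identifies the row you actually want.

```sql
SELECT
    C.ACCOUNTNO    AS 'AdmitGold Account',
    OFF_CLG_CS.EXT AS 'College - GPA Official'
FROM CONTACT1 C
OUTER APPLY (
    -- Return at most one CONTSUPP row per account, so this lookup
    -- can never multiply the rows coming from CONTACT1.
    SELECT TOP 1 CS.EXT
    FROM CONTSUPP CS
    WHERE CS.ACCOUNTNO = C.ACCOUNTNO
      AND CS.CONTACT = 'Transfer College'
      AND CS.STATE = 'O'
    ORDER BY CS.RECID DESC  -- ASSUMPTION: RECID exists and "latest wins"
) OFF_CLG_CS
WHERE C.KEY1 != '00PRSP'
  AND C.U_KEY2 = '2010 FALL';
```

Like a left join, OUTER APPLY keeps the CONTACT1 row when there is no match and returns NULL for the looked-up columns. Whether silently discarding the other matching rows is acceptable depends on what those extra rows represent.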
A: 

A quick way of identifying where the duplicates are coming from, if you have SHOWPLAN permission on the server: add a WHERE condition (e.g. WHERE C.ACCOUNTNO='some value') that you would expect to bring back a single row, but which you have seen actually brings back more than one. Enable "Include Actual Execution Plan", run the query, and hover over the links between the stages of the plan. At some point you'll find that more than one row is emanating from a particular stage, and looking at that stage's details can shed light on the cause of the duplication.
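
Outside the Management Studio menu, the actual plan can also be requested from T-SQL with SET STATISTICS XML, which likewise requires SHOWPLAN permission. The following is a sketch that narrows the question's query to one account and one of its joins; 'some value' is a placeholder for an account known to come back more than once.

```sql
-- Return the actual execution plan as XML alongside the query results.
SET STATISTICS XML ON;

SELECT C.ACCOUNTNO, OFF_CLG_CS.EXT
FROM CONTACT1 C
LEFT JOIN CONTSUPP OFF_CLG_CS ON C.ACCOUNTNO = OFF_CLG_CS.ACCOUNTNO
    AND OFF_CLG_CS.CONTACT = 'Transfer College' AND OFF_CLG_CS.STATE = 'O'
WHERE C.ACCOUNTNO = 'some value';  -- placeholder account number

SET STATISTICS XML OFF;
```

The actual row counts recorded for each operator in the returned plan show the point at which one row turns into several.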

Will A