We have the following query to give us a left outer join:
(from t0 in context.accounts
join t1 in context.addresses
on new { New_AccountCode = t0.new_accountcode, New_SourceSystem = t0.new_sourcesystem, New_Mailing = t0.new_MailingAddressString }
equals new { New_AccountCode = t1.new_AccountCode, New_SourceSystem = t1.new_SourceSystem, New_Mailing = t1.new_MailingAddressString } into t1_join
from t1 in t1_join.DefaultIfEmpty()
where
t0.statecode != 1 &&
t0.statuscode != 2 &&
t1.new_AccountCode == null &&
t1.new_SourceSystem == null &&
t1.new_MailingAddressString == null
select t0)
.OrderBy(o => o.new_accountcode)
.ThenBy(o2=>o2.new_sourcesystem)
.Skip(recordsProcessed)
.Take(recordBatchSize).ToList();
The issue is that if the left table (accounts) contains multiple rows with the same accountcode value, the result set contains the first row duplicated - so the second row with it's unique combination of accountcode, sourcesystem and mailingaddressstring is "overwritten".
Given:
accounts
accountcode sourcesystem mailingaddressstring
10025 ss1 12345
10025 ss2 67891
addresses
accountcode sourcesystem mailingaddressstring
10025 ss1 12345
10025 ss2 67891
we get:
accountcode sourcesystem mailingaddressstring
10025 ss1 12345
10025 ss1 12345
Are we doing something wrong with the select statement?
Thanks