views:

193

answers:

2

Situation:

I have some persons with certain skills and they can/might belong to more than one area. The skills are linked in a seperate table, so are the areas.

I get the people list from selecting all persons with a match on each skill and add them to a list where I can use Distinct() to make sure that they dont show up twice.

Resulting list of the persons:

List<Person> peopleWithRightSkills

On each [Person] object I have at least 1 address linked, but they can have more as its in a relation to [Person]

I have another list:

List<PostalCode> acceptedPostalcodes

Now I need to compare and filter those peopleWithRightSkills who have an address where the postal code of the address is within the acceptedPostalcodes

I have been investigating the Lambda expressions, the SelectMany among other solutions, but right now, I have only one option which I believe is the "old style" of doing things, namely to run through each Person and for each person match her/his address list against the list of postalcodes. and for each match then add this to the:

List<Person> matchedPeople

Table overview (shortend down the needed details)

[Table:Person]
int:ID (primary)
string:FirstName
string:LastName

[Table:Address]
int:Person_ID (foreign key to Person)
int:PostalCode_ID (foreing key to PostalCode)
string:StreetName

[Table:PostalCode]
int:ID
string:CityName

As I see the problem, its just a "short list pr. person"(minimum 1, perhaps up to 10 addresses) and I need to compare this address list against a "valid postalcode list" for each person.

In hope of a great answer to this, as I've been stuck for some hours now, trying to figure out what syntax to use to solve this more beautiefull and less performance hard.

+2  A: 

Assuming that there are a lot of people, acceptedPostalCodes shouldn't be a list; it should either be a sorted list / binary tree or a hash table, depending on how many codes there are. That in itself should be sufficient to give you an order-of-magnitude performance increase. Then, yeah, just check each person and accept that person if their address is in acceptedPostalCodes.

There's not really any better way to do this, unless you have strange data (i.e. if the same addresses appear over and over, you could cache the results for those addresses in some kind of ancillary structure.)

I'm afraid I don't really understand what you're getting at in the rest of the question by presenting the table structure, so I hope I didn't miss some subtlety to what you're doing.

EDIT: Since you seem interested in doing things with LINQ, here's how you'd pick out the elements from peopleWithRightSkills which have acceptedPostalCodes:

var matchedPeople =
peopleWithRightSkills.Where(p => acceptablePostalCodes.Contains(p.Address));
mquander
I was hoping for something with .Contains or .SelectMany ? really no solution to this approach?
BerggreenDK
Sure, you can use acceptedPostalCodes.Contains(address) instead of doing a loop. That isn't going to have any effect on performance, though.
mquander
Thanks, going to check if this works. Stay tuned!
BerggreenDK
Thanks for your answers and support - and beings so fast to reply!
BerggreenDK
+4  A: 
List<int> peopleIDs = peopleWithRightSkills.Select(p => p.ID).ToList();
List<int> postalIDs = acceptedPostalCodes.Select(c => c.ID).ToList();

var query = db.Persons
  .Where(p => peopleIDs.Contains(p.ID)
  .Where(p => p.Addresses.Any(a => postalIDs.Contains(a.PostalCode_ID))
  );

LinqToSql will translate each element in the List<int> to a parameter. Then it will translate the Contains method calls into TSql IN clauses.

Be aware that LinqToSql will happily translate as many elements as you want into parameters (I've seen 50k myself), however SqlServer will only accept ~2000 parameters. If your list contains more elements than that you need to break up those lists. If you have 1500 people with right skills and 1000 acceptedPostalCodes, you would need to send 2500 parameters into SQL Server, which is 400 too many and it will give you a SqlException.

David B
This is sure the right method if you're actually using LINQ-to-SQL and you don't already have everything sitting on the client working on it.
mquander
Yup, so the point of this is to get away with never loading the addresses.
David B
I feel totally newbie here. :o) But thanks, I will have a go at this one too then. I am not sure what you meant by the last sentence with parameters. Can you elaborate on that?
BerggreenDK
It's a pure webapplication all data comes from SQL-server. So the less data I need to pull to make the compares, the better indeed!
BerggreenDK
just checked my records, currently as test we have approx 1300 postalcodes and at least 1200 persons and this number will grow. Hmmm. I guess I better have to stick with my first attempt of getting both the persons and the postalcodes out once and then try to "filter/match" them in memory instead.I will post when I reach something.
BerggreenDK
Last update, I forgot to mention that the "people with skills" returns less than 100 people out of the list, so I guess that we can relax for a while. I am VERY happy about both your help here. I still am a bit confused about WHY/HOW those Lambda expressions work, but somethings seems to make sense. I guess its like regular expressions and recursive. You just have to trust it. :o)
BerggreenDK