views:

82

answers:

1

I'm attempting to write a linq query which uses several tables of related data and have gotten stuck.

The expected result: I need to return the three most populous metropolitan areas per region by population descending.

tables w/sample data:

MetroAreas -- ID, Name
2, Greater New York

Cities -- ID, Name, StateID
1293912, New York City, 10

CityPopulations -- ID, CityID, CensusYear, Population
20, 1293912, 2008, 123456789
21, 1293912, 2007, 123454321

MetroAreaCities -- ID, CityID, MetroAreaID
1, 1293912, 2

States -- ID, Name, RegionID
10, New York, 5

Regions -- ID, Name
5, Northeast

I start with the metro areas. Join the MetroAreaCities to get city IDs. Join Cities to get state IDs. Join States to get the region ID. Join regions so I can filter with a where. I get stuck when I try to include CityPopulations. I only want the three most populous metro areas for a given region. Doing a simple join on the cityPopulations returns a record per year.

(Here's what I have so far, this query was written for SubSonic 3):

return from p in GeoMetroArea.All()  
       join q in GeoMetroAreaCity.All() on p.ID equals q.MetroAreaID  
       join r in GeoCity.All() on q.CityID equals r.ID  
       join s in GeoState.All() on r.StateID equals s.ID  
       join t in GeoRegion.All() on s.RegionID equals t.ID  
       where t.ID == regionObjectPassedToMethod.ID  
       select p;

Can anyone help me with this query or point me in the right direction? Thank you very very much.

+1  A: 

I haven't compiled it, but this should get you close:

var regionID = 5;

var year = (from c in GeoCityPopulation.All()
            select c.CensusYear
           ).Max();

var metros =
    // States in Region
    from s in GeoStateAll()
    where s.RegionID == regionID
    // Cities in State
    join c in GeoCity.All() on s.CityID equals c.ID
    // Metro Area for City
    join mc in GeoMetroAreaCity.All() on c.ID equals mc.CityID
    // Population for City
    join cp in GeoCityPopulation.All() on c.ID equals cp.CityID
    where cp.CensusYear = year
    // Group the population values by Metro Area
    group cp.Population by mc.MetroAreaID into g
    select new
    {
        MetroID = g.Key,      // Key = mc.MetroAreaID
        Population = g.Sum()  // g = seq. of Population values
    } into mg
    // Metro for MetroID
    join m in GeoMetroArea.All() on mg.MetroID equals m.ID
    select new { m.Name, mg.Population };
dahlbyk
Brilliant!!! Worked with very minor changes... Kudos!I'm playing serious catchup with linq and don't pretend to completely understand everything you did. Any chance you could show me how to alter the where cp.CensusYear == year line to just pull the maximum/greatest year that exists in the CityPopulation table?
jas
I've added a query to grab the greatest CensusYear. As written, that query will execute separately; if the SubSonic LINQ provider is smart enough, you can remove the 'year' variable altogether and put that query inline instead to avoid a second trip to the database.
dahlbyk