In a previous question I asked how to group XML elements logically, and the answer was to nest the LINQ query.

Problem is, this has the effect of left-joining the nested queries. For example, let's say I want to list all the cities in the USA that begin with the letter "Y", grouped by State and County:

XElement xml = new XElement("States",
  from s in LinqUtils.GetTable<State>()
  orderby s.Code 
  select new XElement("State",
    new XAttribute("Code", s.Code),
    new XAttribute("Name", s.Name),
    from cy in s.Counties
    orderby cy.Name
    select new XElement("County",
      new XAttribute("Name", cy.Name),
      from c in cy.Cities
      where c.Name.StartsWith("Y")
      orderby c.Name
      select new XElement("City",
        new XAttribute("Name", c.Name)
      )
    )
  )
);

Console.WriteLine(xml);

This outputs:

<States>
  <State Code="AK" Name="Alaska ">
    <County Name="ALEUTIANS EAST" />
    <County Name="ALEUTIANS WEST" />
    <County Name="ANCHORAGE" />
    <County Name="BETHEL" />
    ...
    <County Name="YAKUTAT">
      <City Name="YAKUTAT" />
    </County>
    <County Name="YUKON KOYUKUK" />
  </State>
  <State Code="AL" Name="Alabama ">
    <County Name="AUTAUGA" />
    ...
    etc.

I don't want the left-join effect; I only want to see the states and counties that actually contain cities beginning with the letter "Y".

I can think of a few ways to do this, but they all seem kludgy and inelegant. What is the neatest way you can think of to achieve the desired effect?

A: 

Here's one approach: first build the query with all the correct inner joins, then derive the outer groupings from it with a Distinct() filter, then create the XML from those groupings, using where clauses to join them. Thus:

var Cities = from s in LinqUtils.GetTable<State>()
             from cy in s.Counties
             from c in cy.Cities
             where c.Name.StartsWith("Y")
             select c;

var States = Cities.Select(c => c.County.State).Distinct();
var Counties = Cities.Select(c => c.County).Distinct();

XElement xml = new XElement("States",
  from s in States
  orderby s.Code
  select new XElement("State",
    new XAttribute("Code", s.Code),
    new XAttribute("Name", s.Name),
    from cy in Counties
    where cy.StateCode == s.Code
    orderby cy.Name
    select new XElement("County",
      new XAttribute("Name", cy.Name),
      from c in Cities
      where c.CountyID == cy.ID
      orderby c.Name
      select new XElement("City",
        new XAttribute("Name", c.Name)
      )
    )
  )
);

It works, but somehow I have the feeling that there's a better way...

Shaul
+1  A: 

I think you have a good start. You can add the county and state information to your Cities list and then group by it, which avoids the second join and filter.
You can even do this in one big LINQ query. It's hard to write exactly what you need because you have your own classes, but here's something similar with files and folders (you'll need to add another level):

var dirs = new List<DirectoryInfo>();
dirs.Add(new DirectoryInfo("c:\\"));
dirs.Add(new DirectoryInfo("c:\\windows\\"));

var a = from directory in dirs
        from file in directory.GetFiles()
        where file.Name.StartsWith("a")
        group file by directory.Name into fileGroup
        select new XElement("Directory", new XAttribute("path", fileGroup.Key),
            from f in fileGroup
            select new XElement("File", f.Name)
            );

XDocument doc = new XDocument(new XElement("Folders", a));

Resulting in the XML:

<Folders>
  <Directory path="c:\">
    <File>ActiveDirectoryService.cs</File>
    <File>ApplicationTemplateCore.wsp</File>
    <File>AUTOEXEC.BAT</File>
  </Directory>
  <Directory path="windows">
    <File>adfs.msp</File>
    <File>adminscript2nd.exe</File>
    <File>aspnetocm.log</File>
  </Directory>
</Folders>

Again, the key here is to use group by on the results.

Kobi
Thank you very much... but I'm having difficulty with the double-nesting of the group-by. Could you please give an example of how to do the double-nesting, such as with State->County->City? Thanks!
Shaul
+2  A: 

There are several ways to solve this problem, but none is particularly elegant. A few options:

Option 1: Use let to capture the subqueries and filter out the empty values:

XElement xml = new XElement("States",
  from s in LinqUtils.GetTable<State>()
  let counties = from cy in s.Counties
                 let cities = from c in cy.Cities
                              where c.Name.StartsWith("Y")
                              orderby c.Name
                              select new XElement("City",
                                new XAttribute("Name", c.Name)
                              )
                 where cities.Any()
                 orderby cy.Name
                 select new XElement("County",
                   new XAttribute("Name", cy.Name),
                   cities          
                 )
  where counties.Any()
  orderby s.Code 
  select new XElement("State",
    new XAttribute("Code", s.Code),
    new XAttribute("Name", s.Name),
    counties
  )
);

Option 2: Use your inner join approach with group by instead of distinct:

XElement xml = new XElement("States",
  from s in LinqUtils.GetTable<State>()
  from cy in s.Counties
  from c in cy.Cities
  where c.Name.StartsWith("Y")
  group new { cy, c } by s into gs
  let s = gs.Key
  orderby s.Code 
  select new XElement("State",
    new XAttribute("Code", s.Code),
    new XAttribute("Name", s.Name),

    from g in gs
    group g.c by g.cy into gcy
    let cy = gcy.Key
    orderby cy.Name
    select new XElement("County",
      new XAttribute("Name", cy.Name),

      from c in gcy
      orderby c.Name
      select new XElement("City",
        new XAttribute("Name", c.Name)
      )
    )
  )
);
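
To try the nested grouping without a database, here is a minimal self-contained sketch that runs against LINQ to Objects rather than LINQ to SQL. The class name NestedGroupingSketch and the flat Rows array are hypothetical stand-ins for the State, County and City tables, and the groups are keyed by plain strings instead of entities; how the same query translates to SQL is not covered here.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NestedGroupingSketch
{
    // Hypothetical flat rows standing in for the State -> County -> City tables.
    public static readonly (string StateCode, string County, string City)[] Rows =
    {
        ("AK", "YAKUTAT", "YAKUTAT"),
        ("AK", "ANCHORAGE", "ANCHORAGE"),
        ("AZ", "YUMA", "YUMA"),
        ("AZ", "YAVAPAI", "YARNELL"),
        ("AZ", "YAVAPAI", "PRESCOTT"),
        ("AL", "AUTAUGA", "PRATTVILLE"),
    };

    public static XElement Build()
    {
        return new XElement("States",
            from row in Rows
            // Filter first, so only matching cities reach the groupings.
            where row.City.StartsWith("Y", StringComparison.Ordinal)
            // Outer grouping: one group per state that still has rows.
            group row by row.StateCode into gs
            orderby gs.Key
            select new XElement("State",
                new XAttribute("Code", gs.Key),
                // Nested grouping: one group per county within this state.
                from r in gs
                group r.City by r.County into gcy
                orderby gcy.Key
                select new XElement("County",
                    new XAttribute("Name", gcy.Key),
                    from city in gcy
                    orderby city
                    select new XElement("City",
                        new XAttribute("Name", city)))));
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

Because the where clause runs before either group clause, a state or county with no matching city never forms a group, so with this sample data AL and the ANCHORAGE county should not appear in the output.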
dahlbyk
Excellent! Thanks for showing me how to do the nested grouping using the anonymous type - that gets you the answer credit! :)
Shaul