views:

57

answers:

3

I have an example database, it contains tables for Movies, People and Credits. The Movie table contains a Title and an Id. The People table contains a Name and an Id. The Credits table relates Movies to the People that worked on those Movies, in a particular role. The table looks like this:

CREATE TABLE [dbo].[Credits] (
    [Id] [int] IDENTITY (1, 1) NOT NULL PRIMARY KEY,
    [PersonId]  [int] NOT NULL FOREIGN KEY REFERENCES People(Id),
    [MovieId]  [int] NOT NULL  FOREIGN KEY REFERENCES Movies(Id),
    [Role]  [char] (1) NULL

In this simple example, the [Role] column is a single character, by my convention either 'A' to indicate the person was an actor on that particular movie, or 'D' for director.

I'd like to perform a query on a particular person that returns the person's name, plus a list of all the movies the person has worked on, and the roles in those movies.

If I were to serialize it to json, it might look like this:

{
  "name" : "Clint Eastwood",
  "movies" : [
     { "title": "Unforgiven",        "roles": ["actor", "director"] },
     { "title": "Sands of Iwo Jima", "roles": ["director"] },
     { "title": "Dirty Harry",       "roles": ["actor"] },
     ...
  ]
}

How can I write a LINQ-to-SQL query that shapes the output like that?

I'm having trouble doing it efficiently.


Try #1

if I use this query:

  int personId = 10007;
  var persons =
      from p in db.People
      where p.Id == personId
      select new
      {
          name   = p.Name,
          movies =
                (from m in db.Movies
                 join c in db.Credits on m.Id equals c.MovieId
                 where (c.PersonId == personId)
                 select new {
                         title = m.Title,
                         role = (c.Role=="D"?"director":"actor")
                 })
      };

I get something like this:

{
  "name" : "Clint Eastwood",
  "movies" : [
     { "title": "Unforgiven",        "role": "actor" },
     { "title": "Unforgiven",        "role": "director" },
     { "title": "Sands of Iwo Jima", "role": "director" },
     { "title": "Dirty Harry",       "role": "actor" },
     ...
  ]
}

That's not quite right. As you can see there's a duplicate of each movie for which Eastwood played multiple roles. I would expect that because there are multiple rows in the credits table for that movie+person combination, one for each role.


Try #2

I thought I'd use a group by, like this:

  var persons =
      from p in db.People
      where p.Id == personId
      select new
      {
          name   = p.Name,
          movies =
                (from m in db.Movies
                 join c in db.Credits  on m.Id equals c.MovieId
                 where (c.PersonId == personId)
                 orderby m.Year
                 group ((c.Role == "A")? "actor":"director")
                 by m.Id
                 into g
                 select new {roles = g })
      };

The output is pretty close to what I want. It looks like this:

{
  "name" : "Clint Eastwood",
  "movies" : [
     { "roles": ["actor", "director"]}, 
     { "roles": ["director"]}, 
     { "roles": ["actor"]},
     ...
  ]
}

That's close, but of course I don't have the movie titles.


Try #3

If I use a group by and include the movie title, like this:

  var persons =
      from p in db.People
      where p.Id == personId
      select new
      {
          name   = p.Name,
          movies =
                (from m in db.Movies
                 join c in db.Credits  on m.Id equals c.MovieId
                 where (c.PersonId == personId)
                 orderby m.Year
                 group ((c.Role == "A")? "actor":"director")
                 by m.Id
                 into g
                 select new { title = m.Title, roles = g })
      };

...then it won't compile, due to

error CS0103: The name 'm' does not exist in the current context


How can I shape the output the way I want?

+1  A: 

I believe that you'll need to materialize the query, then group by name and title and use string.Join to collate the roles.

  int personId = 10007;
  var persons = db.People.Where( p => p.Id == personId );
  var movies = db.Movies
                 .Join( db.Credits.Where( c => c.PersonId == personId),
                        m => m.Id,
                        c => c.MovieId,
                       (m,c) => new {
                   personid = c.PersonId,
                   title = m.title,
                   role = c.Role == "D" : "director", "actor"
                  })
                 .GroupBy( g => new { g.personid, g.title } )
                 .ToList()
                 .Select( g => new {
                     personid = g.Key.personid,
                     title = g.Key.title
                     roles = string.Join( ",", g.Select( g => g.role ).ToArray() )
                  });

  var personsWithMovies = people.Join( movies, p => p.PersonId, m => m.personid, (p,m) => new {
                            name = p.Name,
                            movies = m 
                          });
tvanfosson
I don't think that works - it won't compile for me, tells me `'System.Linq.IQueryable<AnonymousType#1>' does not contain a definition for 'title' and no extension method 'title' etc etc` .
Cheeso
A little more complicated than I thought. I think this should work.
tvanfosson
+2  A: 

It's a lot easier to reason about if you start from the relation table (credits):

var query =
    from c in context.Credits
    where c.PersonId == 1
    group c by c.Person into g
    select new
    {
        PersonName = g.Key.Name,
        Credits = from cr in g
                  group cr by cr.Movie into g2
                  select new
                  {
                      MovieTitle = g2.Key.Name,
                      Roles = g2.Select(ci =>
                          (ci.Role == 'A') ? "Actor" : "Director")
                  }
    };

Here's code that will display the results:

foreach (var result in query)
{
    Console.WriteLine(result.PersonName);
    foreach (var credit in result.Credits)
    {
        string roles = string.Join(",", credit.Roles.ToArray());
        Console.WriteLine("  " + credit.MovieTitle + ": " + roles);
    }
}
Aaronaught
I had a feeling that I Should be flipping it around, but I didn't quite get to that. . . I still there there are problems with it, though. The part that reads `group c by c.Person` -- how does that get satisfied? I'm not clear because the Credits table has a PersonId column, but not a Person column. It's important to get the person's name, but without joining on the Person table I won't have it. maybe you just left that part out.
Cheeso
@Cheeso: It doesn't have a `Person` column but it should have a `Person` association, and when you group by this it knows how to use the association column (`PersonId`) when translating to SQL. Associations get created automatically if you have FKs in the database and you drag the tables into the designer (which is exactly what I did here). I've actually copied-and-pasted this code verbatim from a test project. If you're missing the association then you're making life a lot more difficult for yourself...
Aaronaught
Yep, I don't have the association; I was guessing that was from some sort of generated code. I didn't use VS to generate the ORM code; I used sqlmetal.exe from the command line. I suppose that I'll have to explictly do the joins to make those things happen, with code from sqlmetal. That brings up a new question - is there a way to automatically create those associations using sqlmetal? I'll research that separately. Anyway Thx for the help with the syntax. The actual query I used is here: http://stackoverflow.com/questions/2481403/linq-to-sql-how-to-shape-the-data-with-group-by/2481546#2481546
Cheeso
@Cheeso: Bizarre, I've used sqlmetal before and it is supposed to generate associations automatically, there's no special switch for it or anything. The only reason I can think of why it wouldn't is if you don't actually have a foreign key like the one in your original question. Could the relationship have been missing when you first ran sqlmetal?
Aaronaught
YES. I was messing with the table defn and had inadvertently removed one of the FK's. I've re-created the table and the code and it now works (entity names are Movies and People, not Movie and Person). Thank you for that bit of instruction. very helpful.
Cheeso
No problem, good to hear you got it up and running!
Aaronaught
A: 

Thanks to the hint from tvanfosson, I was able to come up with this, which works for me!

var persons =
     from p in db.People
     where p.Id == personId
     select new
     {
         name   = p.Name,
         movies =
               (from m in db.Movies
                join c in db.Credits on m.Id equals c.MovieId
                where (c.PersonId == personId)
                group ((c.Role =="A")?"actor":"director") by m into sg
                orderby sg.Key.year
                select new { title = sg.Key.Title, roles = sg } )
     };

I also took some hints from Aaronaught and tried starting with the Credits table, and using the generated associations. That made things simpler. This code also works:

var persons =
    from c in db.Credits
    where c.PersonId == arg
    group c by c.People into g
    select new
    {
        name = g.Key.Name,
        credits = from cr in g
            group ((cr.Role == "A") ? "actor" : "director")
            by cr.Movies into g2
            orderby g2.Key.Year
            select new { title = g2.Key.Title, roles = g2 }
    };

...and it produces the same (or equivalent) output when serialized the the JavaScriptSerializer.


The key realization for me, that allowed me to get this done, was that I could use a compound key for the group, and that I could select on the fields within the key. The second key realization was that I should use the generated associations.

Cheeso