tags:

views:

1634

answers:

10

Hi. I have the following (pretty standard) table structure:

Post <-> PostTag <-> Tag

Suppose I have the following records:

PostID Title
1,     'Foo'
2,     'Bar'
3,     'Baz'

TagID Name
1,    'Foo'
2,    'Bar'

PostID TagID
1      1
1      2
2      2

In other words, the first post has two tags, the second has one and the third one doesn't have any.

I'd like to load all posts and it's tags in one query but haven't been able to find the right combination of operators. I've been able to load either posts with tags only or repeated posts when more than one tag.

Given the database above, I'd like to receive three posts and their tags (if any) in a collection property of the Post objects. Is it possible at all?

Thanks

A: 

I've answered this in another post : About eager loading. In your case it would probably be something like :

DataLoadOptions options = new DataLoadOptions();    
options.LoadWith<Post>(p => p.PostTag);
options.LoadWith<PostTag>(pt => pt.Tag);

Though be careful - the DataLoadOptions must be set BEFORE ANY query is sent to the database - if not, an exception is thrown (no idea why it's like this in Linq2Sql - probably will be fixed in a later version).

sirrocco
A: 

I'm sorry no, Eager Loading will execute one extra query per tag per post.

Tested with this code:

var options = new DataLoadOptions();
options.LoadWith<Post>(p => p.PostTags);
options.LoadWith<PostTag>(pt => pt.Tag);
using (var db = new BlogDataContext())
{
    db.LoadOptions = options;
    return (from p in db.Posts
            where p.Status != PostStatus.Closed
            orderby p.PublishDateGmt descending
            select p);
}

In the example database it would execute 4 queries which is not acceptable in production. Can anyone suggest another solution?

Thanks

A: 

Anyone has an answer?

+1  A: 

It's a bit strange because

DataLoadOptions o = new DataLoadOptions ( );
o.LoadWith<Listing> ( l => l.ListingStaffs );
o.LoadWith<ListingStaff> ( ls => ls.MerchantStaff );
ctx.LoadOptions = o;

IQueryable<Listing> listings = (from a in ctx.Listings
   where a.IsActive == false 
                            select a);
List<Listing> list = listings.ToList ( );

results in a query like :

SELECT [t0].*, [t1].*, [t2].*, (
SELECT COUNT(*)
FROM [dbo].[LStaff] AS [t3]
INNER JOIN [dbo].[MStaff] AS [t4] ON [t4].[MStaffId] = [t3].[MStaffId]
WHERE [t3].[ListingId] = [t0].[ListingId]
) AS [value]
FROM [dbo].[Listing] AS [t0]
LEFT OUTER JOIN ([dbo].[LStaff] AS [t1]
INNER JOIN [dbo].[MStaff] AS [t2] ON [t2].[MStaffId] = [t1].[MStaffId]) ON 
[t1].[LId] = [t0].[LId] WHERE NOT ([t0].[IsActive] = 1) 
ORDER BY [t0].[LId], [t1].[LStaffId], [t2].[MStaffId]

(I've shortened the names and added the * on the select).

So it seems to do the select ok.

sirrocco
A: 

I'm sorry. The solution you give works, but I found out that it breaks when paginating with Take(N). The complete method I'm using is the following:

public IList<Post> GetPosts(int page, int records)
{
    var options = new DataLoadOptions();
    options.LoadWith<Post>(p => p.PostTags);
    options.LoadWith<PostTag>(pt => pt.Tag);
    using (var db = new BlogDataContext())
    {
        db.LoadOptions = options;
        return (from p in db.Posts
                where p.Status != PostStatus.Closed
                orderby p.PublishDateGmt descending
                select p)
                .Skip(page * records)
                //.Take(records)
                .ToList();
    }
}

With the Take() method commented it generates a query similar to to what you posted but if I add the Take() again it generates 1 + N x M queries.

So, I guess my question now is: Is there a replacement to the Take() method to paginate records?

Thanks

A: 

Mmmm, I understand why now. The left join will bring an indeterminate number of rows per post so Take(10) can't simply return 10 rows.

Maybe if I could bypass Take() (keep the TOP 10 in the SQL query) I could make a COUNT() query before and limit the query to COUNT() rows.

This way it would just make 2 queries instead 1 + N x M queries:

Any ideas?

+1  A: 

Yay! It worked.

If anyone is having the same problem here's what I did:

public IList<Post> GetPosts(int page, int record)
{
 var options = new DataLoadOptions();
 options.LoadWith<Post>(p => p.PostTags);
 options.LoadWith<PostTag>(pt => pt.Tag);
 using (var db = new DatabaseDataContext(m_connectionString))
 {
  var publishDateGmt = (from p in db.Posts
         where p.Status != PostStatus.Hidden
         orderby p.PublishDateGmt descending
         select p.PublishDateGmt)
         .Skip(page * record)
         .Take(record)
         .ToList()
         .Last();
  db.LoadOptions = options;
  return (from p in db.Posts
    where p.Status != PostStatus.Closed 
     && p.PublishDateGmt >= publishDateGmt
    orderby p.PublishDateGmt descending
    select p)
    .Skip(page * record)
    .ToList();
 }
}

This executes only two queries and loads all tags for each post.

The idea is to get some value to limit the query at the last post that we need (in this case the PublishDateGmt column will suffice) and then limit the second query with that value instead of Take().

Thanks for your help sirrocco.

A: 

Take a look at the solution at http://agilepractice.blogspot.com/2009/10/sql-to-linq-performance-retrieving.html

It allows pre-fetching of any entities you want to load.

Mr B
A: 

Thank you Mr B. Your tip worked out just great! I was able to reduce the number of database hits from about 6000 to 2 in one of my applications! Great!

Dr X
Why was this rated down?
Mike Pateras
Because it's not an answer to the original question - this isn't a forum.
tomfanning
A: 

I know this is an old post, but I have discovered a way to use Take() while only performing one query. The trick is to perform the Take() inside of a nested query.

var q = from p in db.Posts
        where db.Posts.Take(10).Contains(p)
        select p;

Using DataLoadOptions with the query above will give you the first ten posts, including their associated tags, all in one query. The resulting SQL will be a much less concise version of the following:

SELECT p.PostID, p.Title, pt.PostID, pt.TagID, t.TagID, t.Name FROM Posts p
JOIN PostsTags pt ON p.PostID = pt.PostID
JOIN Tags t ON pt.TagID = t.TagID
WHERE p.PostID IN (SELECT TOP 10 PostID FROM Posts)
Jordan