Hi,

I'm using LINQ to SQL (in fact Dynamic LINQ to SQL, which lets you pass strings at runtime for where clauses, orderby, etc.), but I'm getting inconsistent results, and it seems to depend on whether the underlying T-SQL uses the TOP keyword or BETWEEN.

I've tried to break the problem down into a small example. Here's the scenario:

I'm using a repository pattern and the following method that simply joins 2 tables with a left outer join.

    public IQueryable<TestGalleryViewModel> FetchGalleryItems()
    {
        var galleryItems = from painting in Gallery
                           join artist in Artists
                               on painting.ArtistID equals artist.ArtistID
                               into paintingArtists
                           from artist in paintingArtists.DefaultIfEmpty()
                           select new TestGalleryViewModel
                           {
                               Id = painting.PaintingID,
                               ArtistName = artist == default(Artist) ? "" : artist.Surname + " " + artist.Forenames,
                           };

        return galleryItems;
    }

I then have a little test method that uses the FetchGalleryItems method:

        var query = repository.FetchGalleryItems().Where("ArtistName.Contains(\"Adams Charles James\")");

        var orderedlist = query.OrderBy("ArtistName asc");
        var page1 = orderedlist.Skip(0).Take(5);
        var page2 = orderedlist.Skip(5).Take(5);

orderedlist contains the following rows:

176 ADAMS Charles James
620 ADAMS Charles James
621 ADAMS Charles James
660 ADAMS Charles James
683 ADAMS Charles James
707 ADAMS Charles James
735 ADAMS Charles James
739 ADAMS Charles James
740 ADAMS Charles James
741 ADAMS Charles James

That is what I would expect. But page1 contains:

707 ADAMS Charles James
683 ADAMS Charles James
660 ADAMS Charles James
621 ADAMS Charles James
620 ADAMS Charles James

Which as you can see is NOT the first 5 items. Page2 contains

707 ADAMS Charles James
735 ADAMS Charles James
739 ADAMS Charles James
740 ADAMS Charles James
741 ADAMS Charles James

Which is what I would expect: items 6 to 10.

The underlying T-SQL for page1 is

SELECT TOP (5) [t3].[PaintingID] AS [Id], [t3].[value] AS [ArtistName]
FROM (
    SELECT [t0].[PaintingID], 
        (CASE 
            WHEN [t2].[test] IS NULL THEN CONVERT(NVarChar(101),'')
            ELSE ([t2].[Surname] + ' ') + [t2].[Forenames]
         END) AS [value]
    FROM [dbo].[Gallery] AS [t0]
    LEFT OUTER JOIN (
        SELECT 1 AS [test], [t1].[ArtistID], [t1].[Surname], [t1].[Forenames]
        FROM [dbo].[Artists] AS [t1]
        ) AS [t2] ON [t0].[ArtistID] = ([t2].[ArtistID])
    ) AS [t3]
WHERE [t3].[value] LIKE '%Adams Charles James%'
ORDER BY [t3].[value]

Notice it's using TOP(5)

The underlying T-SQL for page2 is

SELECT [t4].[PaintingID] AS [Id], [t4].[value] AS [ArtistName]
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [t3].[value], [t3].[Surname], [t3].[Forenames]) AS [ROW_NUMBER], [t3].[PaintingID], [t3].[value]
    FROM (
        SELECT [t0].[PaintingID], 
            (CASE 
                WHEN [t2].[test] IS NULL THEN CONVERT(NVarChar(101),'')
                ELSE ([t2].[Surname] + ' ') + [t2].[Forenames]
             END) AS [value], [t2].[Surname], [t2].[Forenames]
        FROM [dbo].[Gallery] AS [t0]
        LEFT OUTER JOIN (
            SELECT 1 AS [test], [t1].[ArtistID], [t1].[Surname], [t1].[Forenames]
            FROM [dbo].[Artists] AS [t1]
            ) AS [t2] ON [t0].[ArtistID] = ([t2].[ArtistID])
        ) AS [t3]
    WHERE [t3].[value] LIKE '%Adams Charles James%'
    ) AS [t4]
WHERE [t4].[ROW_NUMBER] BETWEEN 5 + 1 AND 5 + 5
ORDER BY [t4].[ROW_NUMBER]

Notice it's using BETWEEN

When I paste the T-SQL commands into SQL Express Management Studio I get the results I've described. If I take the page2 T-SQL and amend the line

 WHERE [t4].[ROW_NUMBER] BETWEEN 5 + 1 AND 5 + 5

to be

WHERE [t4].[ROW_NUMBER] BETWEEN 1 AND 5

I get the results I was expecting for page1, i.e. the first 5 items:

176 ADAMS Charles James
620 ADAMS Charles James
621 ADAMS Charles James
660 ADAMS Charles James
683 ADAMS Charles James

So, in a nutshell, when the T-SQL uses BETWEEN instead of TOP I get the results I expected.

I'm using filtering (where clause), sorting (orderBy) and paging (skip and take) all over my app and need to handle this fairly generically.

  • Is there a way to force LINQ to use the BETWEEN syntax instead of TOP?
  • Should I be approaching this differently?

Apologies for the long post.

Regards, Simon

A: 

Regardless of how the SQL is generated (LINQ or otherwise), if you ORDER BY a column that has duplicate values, you can get different results every time you run the query.

When you ORDER BY [t3].[value] you are sorting on a column containing many duplicate values.

You can test this by running a very simple SQL SELECT from Management Studio that orders by such a column. The order of the tied rows is not guaranteed, so it can change from run to run or whenever the query plan changes.

One way to get consistent results is to use ROW_NUMBER as you have done (strictly, ROW_NUMBER is only deterministic when its own ORDER BY is unique, which is not the case for the columns in your page2 query). Alternatively, adding any other column to the ORDER BY that is unique will cause the results to always be returned in the same order. It doesn't matter whether that other column has anything to do with your query, just that it's unique.
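As a rough illustration of the unique tie-breaker idea, here is a minimal sketch. The `Page` extension method and the in-memory `SampleRows` data are hypothetical and only stand in for the real repository; they are not part of LINQ to SQL or Dynamic LINQ. The point is that the unique key is appended to the ordering before Skip/Take, so the first page and every later page sort on the same unique key, whatever SQL is generated for each.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Stand-in for the view model from the question (same two properties).
public class TestGalleryViewModel
{
    public int Id { get; set; }
    public string ArtistName { get; set; } = "";
}

public static class StablePaging
{
    // Hypothetical helper: appends a unique tie-breaker to an existing
    // ordering, then applies Skip/Take for the requested zero-based page.
    public static IQueryable<T> Page<T, TKey>(
        this IOrderedQueryable<T> ordered,
        Expression<Func<T, TKey>> uniqueKey,
        int pageIndex,
        int pageSize)
    {
        return ordered
            .ThenBy(uniqueKey)
            .Skip(pageIndex * pageSize)
            .Take(pageSize);
    }
}

public static class Demo
{
    // In-memory rows with the ten Ids from the question, deliberately shuffled.
    // Every row has the same ArtistName, so ArtistName alone cannot order them.
    public static IQueryable<TestGalleryViewModel> SampleRows()
    {
        var ids = new[] { 741, 176, 740, 620, 739, 621, 735, 660, 707, 683 };
        return ids
            .Select(id => new TestGalleryViewModel { Id = id, ArtistName = "ADAMS Charles James" })
            .AsQueryable();
    }

    public static void Main()
    {
        var ordered = SampleRows().OrderBy(p => p.ArtistName);

        var page1 = ordered.Page(p => p.Id, 0, 5).Select(p => p.Id).ToList();
        var page2 = ordered.Page(p => p.Id, 1, 5).Select(p => p.Id).ToList();

        Console.WriteLine(string.Join(", ", page1)); // 176, 620, 621, 660, 683
        Console.WriteLine(string.Join(", ", page2)); // 707, 735, 739, 740, 741
    }
}
```

With the string-based Dynamic LINQ calls from the question, the same idea should be expressible as OrderBy("ArtistName asc, Id asc"), assuming the version of the library in use accepts a comma-separated ordering list.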

DOK
Thanks DOK. You say one way to get consistent results is to use "ROW_NUMBER as you have done". In fact I have not used ROW_NUMBER; LINQ to SQL generated that code and I don't appear to have any control over it. That was really my question: can I tell LINQ to SQL to use ROW_NUMBER and BETWEEN instead of just TOP for the first page? In the end I've had to add a unique column to all my entities and tack that on the end of my order by clause, as you suggested. It was something I was already toying with, but I wanted to know if there was a better solution. Thanks for your reply.
Simon Lomax