I am spending a good amount of time trying to figure out why Linq2Sql is changing my SQL query. It is rather difficult to explain, and I can't find any reason why this is happening. The lowdown is that adding multiple Contains clauses around an IQueryable seems to overwrite each previous IQueryable expression. Let me try to explain:

Say you have a Linq2Sql query that provides the basic framework for a query (something that is the base of all your queries).

I am dynamically adding in parts of the where query (shown as "partQuery" in the examples below). The expression generated from the where query is correct, and when I add it to the finalQuery, it is still correct. The problem comes when I add another partQuery to the final query: it seems to overwrite the first query in the final query, but still adds a second query to it (or, as shown below, when adding a third query, it overwrites the first two queries).

Here is some source example:

        foreach (var partQuery in whereStatements)
        {
            finalQuery = finalQuery.Where(
                dataEvent => partQuery.Contains(dataEvent.DataEventID)
                );
        }

partQuery is of type IQueryable<long>; finalQuery is the query that will eventually be executed on the SQL server.

        // the list of the wheres that are sent
        var whereStatements = new List<IQueryable<long>>();

        var query1 = DataEvent.GetQueryBase(context);
        query1 = query1.Where(
            dataEvent =>
            dataEvent.DataEventKeyID == (short)DataEventTypesEnum.TotalDollarAmount && dataEvent.ValueDouble < -50);

        whereStatements.Add(query1.Select(x => x.DataEventID));


        var query2 = DataEvent.GetQueryBase(context);
        query2 = query2.Where(
            dataEvent =>
            dataEvent.DataEventKeyID == (short)DataEventTypesEnum.ObjectNumber && dataEvent.ValueDouble == 6);

        whereStatements.Add(query2.Select(x => x.DataEventID));

The first where Query (query1) has an expression that turns out like this:

{SELECT [t0].[DataEventID] FROM [dbo].[DataEvents] AS [t0] INNER JOIN [dbo].[DataEventAttributes] AS [t1] ON [t0].[DataEventID] = [t1].[DataEventID] WHERE ([t1].[DataEventKeyID] = @p0) AND ([t1].[ValueDouble] < @p1) }

Notice that the where clause has ValueDouble "<" @p1 (less than).

and then when added into the final query, it looks like this:

{SELECT [t0].[DataEventID], [t0].[DataOwnerID], [t0].[DataTimeStamp] FROM [dbo].[DataEvents] AS [t0] WHERE (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t1] INNER JOIN [dbo].[DataEventAttributes] AS [t2] ON [t1].[DataEventID] = [t2].[DataEventID] WHERE ([t1].[DataEventID] = [t0].[DataEventID]) AND ([t2].[DataEventKeyID] = @p0) AND ([t2].[ValueDouble] < @p1) )) AND ([t0].[DataOwnerID] = @p2) }

At this point, the query is still correct. Notice how ValueDouble still has a "<" sign. The problem occurs when I add 2 or more to the query. Here is the expression of the second query in this example:

{SELECT [t0].[DataEventID] FROM [dbo].[DataEvents] AS [t0] INNER JOIN [dbo].[DataEventAttributes] AS [t1] ON [t0].[DataEventID] = [t1].[DataEventID] WHERE ([t1].[DataEventKeyID] = @p0) AND ([t1].[ValueDouble] = @p1) }

and when it is added to the final query, you will notice that the first query is no longer correct (and there is more to come after):

{SELECT [t0].[DataEventID], [t0].[DataOwnerID], [t0].[DataTimeStamp] FROM [dbo].[DataEvents] AS [t0] WHERE (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t1] INNER JOIN [dbo].[DataEventAttributes] AS [t2] ON [t1].[DataEventID] = [t2].[DataEventID] WHERE ([t1].[DataEventID] = [t0].[DataEventID]) AND ([t2].[DataEventKeyID] = @p0) AND ([t2].[ValueDouble] = @p1) )) AND (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t3] INNER JOIN [dbo].[DataEventAttributes] AS [t4] ON [t3].[DataEventID] = [t4].[DataEventID] WHERE ([t3].[DataEventID] = [t0].[DataEventID]) AND ([t4].[DataEventKeyID] = @p2) AND ([t4].[ValueDouble] = @p3) )) AND ([t0].[DataOwnerID] = @p4) }

And a bonus: after looking at this in SQL Profiler, it appears that it totally dropped the first query, and the two EXISTS clauses in the final SQL are actually the same query (query2). None of the parameters for the first query are actually passed to the SQL server.

So, from my research, it appears that it is adding queries to the SQL, but replacing every existing WHERE EXISTS clause with the last one that was added. To double-check this, I ran the exact same code as above but added a third query, and look how it changed:

        var query3 = DataEvent.GetQueryBase(context);
        query3 = query3.Where(
            dataEvent =>
            dataEvent.DataEventKeyID != (short)DataEventTypesEnum.Quantity && dataEvent.ValueDouble != 5);

        whereStatements.Add(query3.Select(x => x.DataEventID));

I threw some "!=" comparisons into this last query, and the final query became:

{SELECT [t0].[DataEventID], [t0].[DataOwnerID], [t0].[DataTimeStamp] FROM [dbo].[DataEvents] AS [t0] WHERE (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t1] INNER JOIN [dbo].[DataEventAttributes] AS [t2] ON [t1].[DataEventID] = [t2].[DataEventID] WHERE ([t1].[DataEventID] = [t0].[DataEventID]) AND ([t2].[DataEventKeyID] <> @p0) AND ([t2].[ValueDouble] <> @p1) )) AND (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t3] INNER JOIN [dbo].[DataEventAttributes] AS [t4] ON [t3].[DataEventID] = [t4].[DataEventID] WHERE ([t3].[DataEventID] = [t0].[DataEventID]) AND ([t4].[DataEventKeyID] <> @p2) AND ([t4].[ValueDouble] <> @p3) )) AND (EXISTS( SELECT NULL AS [EMPTY] FROM [dbo].[DataEvents] AS [t5] INNER JOIN [dbo].[DataEventAttributes] AS [t6] ON [t5].[DataEventID] = [t6].[DataEventID] WHERE ([t5].[DataEventID] = [t0].[DataEventID]) AND ([t6].[DataEventKeyID] <> @p4) AND ([t6].[ValueDouble] <> @p5) )) AND ([t0].[DataOwnerID] = @p6) }

Notice how all three inner queries now use "<>", even though the first two queries above do not.

Am I totally off my rocker here? Am I missing something so simple that when you tell me I am going to want to pull my fingernails off? I am actually hoping you tell me that, instead of telling me that it looks like a bug in the MS framework (well, we know that happens sometimes).

Any help is greatly appreciated. Maybe I should be sending dynamic query parts to the base query a different way. I am open to ideas.

+1  A: 

Without having fully evaluated your examples, the one thing that stands out is this:

foreach (var partQuery in whereStatements)
{
    finalQuery = finalQuery.Where(
        dataEvent => partQuery.Contains(dataEvent.DataEventID)
        );
}

Because of the way this loop is structured, every expression generated in each iteration will eventually use the final value of partQuery--the value that is present when the loop terminates. You probably want this, instead:

foreach (var partQuery in whereStatements)
{
    var part = partQuery;
    finalQuery = finalQuery.Where(
        dataEvent => part.Contains(dataEvent.DataEventID)
        );
}

Now, part is the captured variable, and is unique per iteration, and therefore unique per expression. This odd-at-first behavior is by design: see a related question.
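
To make the capture rule concrete outside of LINQ to SQL, here is a minimal sketch. It uses a plain for loop, whose loop variable is a single variable shared by every iteration; the class and method names (CaptureDemo, SharedVariable, PerIterationCopy) are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original question.
public static class CaptureDemo
{
    // Every lambda captures the same variable i, so each one sees
    // the value i has when the lambda finally runs (after the loop).
    public static List<int> SharedVariable()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            getters.Add(() => i);
        }
        return getters.Select(g => g()).ToList();
    }

    // Copying into a variable declared inside the loop body gives
    // each lambda its own variable, fixed at that iteration's value.
    public static List<int> PerIterationCopy()
    {
        var getters = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            getters.Add(() => copy);
        }
        return getters.Select(g => g()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SharedVariable()));   // 3, 3, 3
        Console.WriteLine(string.Join(", ", PerIterationCopy())); // 0, 1, 2
    }
}
```

The lambdas capture the variable itself rather than a snapshot of its value, which is why the only difference between the two methods is where the captured variable is declared.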

Edited: It looks like this is exactly what's causing your problem; the subqueries in the final query are all of the form x <> y, which is the form of the last query added to your whereStatements collection.
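
The same effect can be reproduced with in-memory sequences instead of LINQ to SQL. The following is only a sketch under a few assumptions: it uses IEnumerable<long> and List<long> rather than IQueryable<long>, the names DeferredFilterDemo and Combine are hypothetical, and partQuery is deliberately declared outside the loop to mimic the shared loop variable described above (C# 5 and later give foreach a fresh variable per iteration, so a plain foreach would no longer show the problem there).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; in-memory lists stand in for the SQL subqueries.
public static class DeferredFilterDemo
{
    public static List<long> Combine(
        IEnumerable<long> source, List<List<long>> filters, bool copyPerIteration)
    {
        IEnumerable<long> finalQuery = source;

        // One variable shared by every lambda created in the loop below.
        List<long> partQuery = new List<long>();

        foreach (var filter in filters)
        {
            partQuery = filter;
            if (copyPerIteration)
            {
                // Fresh variable per iteration: each Where keeps its own filter.
                var part = partQuery;
                finalQuery = finalQuery.Where(id => part.Contains(id));
            }
            else
            {
                // Shared variable: every Where ends up using the last filter.
                finalQuery = finalQuery.Where(id => partQuery.Contains(id));
            }
        }

        // The predicates only run here, after the loop has finished.
        return finalQuery.ToList();
    }

    public static void Main()
    {
        var source = new List<long> { 1, 2, 3, 4, 5, 6 };
        var filters = new List<List<long>>
        {
            new List<long> { 1, 2, 3, 4 },
            new List<long> { 3, 4, 5, 6 },
        };

        Console.WriteLine(string.Join(", ", Combine(source, filters, false))); // 3, 4, 5, 6
        Console.WriteLine(string.Join(", ", Combine(source, filters, true)));  // 3, 4
    }
}
```

With the shared variable, both predicates test against the second list, so the first filter is silently lost. That matches the shape of the generated SQL in the question, where every EXISTS clause took the form of the last query added.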

Ben M
Damn, I will never ignore my warnings ever again. It has been underlined in VS, but I ignored it.. I thought the scope would have been the same. Thanks so much.. a second set of eyes seems to have resolved my issue :)
TravisWhidden