views:

154

answers:

7

I have this one gigantic complex LINQ to SQL query that I need to optimize somehow, because the background C# compiler completely hogs the CPU and I can't type or edit my .cs file normally in Visual Studio 2010 (every letter, especially if IntelliSense wants to pop up, lags horribly).

The culprit is this:

var custFVC = 
    (from cfvc in customer.CustomerFrameVariationCategories
    let lastValue = cfvc.CustomerFrameVariationCategoryValueChanges.Where(cfvcvc => cfvcvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(cfvcvc2 => cfvcvc2.ChangeDateTime).FirstOrDefault() ?? new CustomerFrameVariationCategoryValueChange()
    let lastValue2 = cfvc.FrameVariationCategory.FrameVariation.Frame.FrameValueChanges.Where(fvc => fvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fvc2 => fvc2.ChangeDateTime).FirstOrDefault() ?? new FrameValueChange()
    let lastValue3 = cfvc.FrameVariationCategory.FrameVariationCategoryValueChanges.Where(fvcvc => fvcvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fvcvc2 => fvcvc2.ChangeDateTime).FirstOrDefault() ?? new FrameVariationCategoryValueChange()
    let lastValue4 = cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Any(fvm => (fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleValueChange()).IsActive == false)
    where lastValue.IsActive == true
    orderby cfvc.FrameVariationCategory.FrameVariation.Frame.Name, cfvc.FrameVariationCategory.Category.Name, cfvc.FrameVariationCategory.FrameVariation.Name
    select new
    {
       cfvc.Id,
       cfvc.FrameVariationCategory,
       lastValue.CoverCoefficient,
       lastValue.NeiserNet,
       PlywoodName = lastValue2.Plywood.Name,
       FrameIsActive = lastValue2.IsActive,
       OwnCost = cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Sum(fvm => // sum all frame variation modules
          (lastValue4 ? 0 : fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime) // if module not active then 0
          .OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault().Porolone) + // otherwise get Porolone
           fvm.FrameModule.FrameModuleComponents.Sum(fmc => // add to Porolone sum of all module components
           (fmc.Article.ArticleDetails.Any() ? fmc.Article.ArticleDetails.Sum(ad => // if any article details then use A*L*W*T instead of Amount
           WindowExcel.MultiplyArticleDetailValues(ad.ArticleDetailValueChanges.Where(advc => advc.ChangeDateTime <= this.SelectedDateTime)
          .OrderByDescending(advc2 => advc2.ChangeDateTime).FirstOrDefault() ?? new ArticleDetailValueChange())) : 
           WindowExcel.GetModuleComponentAmount(fmc.FrameModuleComponentValueChanges.Where(fmcvc => fmcvc.ChangeDateTime <= this.SelectedDateTime) // no details = get amount
             .OrderByDescending(fmcvc2 => fmcvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleComponentValueChange())) * // times article values
           WindowExcel.MultiplyArticleValues(fmc.Article.ArticleValueChanges.Where(avc => avc.ChangeDateTime <= this.SelectedDateTime)
           .OrderByDescending(avc2 => avc2.ChangeDateTime).FirstOrDefault() ?? new ArticleValueChange()))),
       Cubes = cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Sum(fvm => (fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime && fmvc.IsActive == true).OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleValueChange()).Cubes),
       lastValue3.CoverNet,
       lastValue3.CoverGarbage,
       lastValue3.CoverGross,
       lastValue3.CoverPrice,
       lastValue3.BackgroundNet,
       lastValue3.BackgroundGarbage,
       lastValue3.BackgroundGross,
       lastValue3.BackgroundPrice,
       FVCIsActive = lastValue3.IsActive,
       FrameModuleAnyNonActive = lastValue4
    }).ToList();

The biggest problem here is OwnCost, everything up to and after that Visual Studio can handle. I don't want to turn off background compiling (a feature that checks for compile time errors before actually compiling), I don't want to create a stored procedure. I can't off this code into a separate class/method, because the LINQ DataContext can't be passed around (as far as I know - also take into consideration that the context variable is inside a using statement).

The only vague idea that I have, is some sort of an extension method, or a method that returns a LINQ query or something like that. Because I don't know what exactly it is that I can do here to rectify the problem, I don't know how to formulate the wording, thus I can't google it...

How can I move (or optimize) OwnCost or the entire query out of the current .cs file, or perhaps split it into a method within the same file (might help the background compiler), or "something"...?

A: 

Instead of writing LINQ to SQL you can write a stored procedure for this to do all these things.

kumar
We'll first of all, I've never written stored procedures, so I'd have to start learning that. Secondly I'd have to convert the above LINQ query to T-SQL statements, which is next to impossible (unless I take a look at the compiled LINQ query and work from there...). Thirdly I don't know how well a stored procedure result (an anonymous class in this case) will integrate with the rest of the code. All in all, too much work, there must be a simpler and faster way. The code itself works and is fine, I just need to optimize it for Visual Studio...
Marko
@kumar, this answer doesn't attempt to help the OP with their problem
JaredPar
I would argue that if you see someone who appears to be struggling because he's approaching a problem the wrong way, suggesting the right way to approach the problem is certainly an attempt to help.
Robert Rossney
@Marko With such a complex query it might be best to move it to a Stored Procedure. Also you can "inspect" the query in the visual studio debugger to see the generated T-SQL. You could use that to build your stored procedure. I personally love LINQ but there is still a time and a place for stored procedures.
antonlavey
+1  A: 

I don't have any inherent insight into this problem in terms of what is expensive in the C# compiler. However the two things that jump out when I look at your query are the following

  1. The number and complexity of the let bindings
  2. The complexity of the initializer of the OwnCost member inside the select clause

The best advice I can give is to try and break up the query in order to get these into separate statements and hopefully that will ease the pressure on the compiler.

JaredPar
The complexity is neccessary because of the complexity of the business logic. That's unavoidable. Dthorpe also suggested braking the query into seperate statements, you can read my comment that I left under his answer about the results.
Marko
+1  A: 

Split it up. That is one massive expression tree VS is trying to deal with. You could break it up so that some of the SELECT clause transformation happens in LINQ-to-object. This would be a lot easier for the background compiler to deal with. Just get:

var custFVC = (from cfvc in customer.CustomerFrameVariationCategories
               let lastValue = cfvc.CustomerFrameVariationCategoryValueChanges.Where(cfvcvc => cfvcvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(cfvcvc2 => cfvcvc2.ChangeDateTime).FirstOrDefault() ?? new CustomerFrameVariationCategoryValueChange()
               let lastValue2 = cfvc.FrameVariationCategory.FrameVariation.Frame.FrameValueChanges.Where(fvc => fvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fvc2 => fvc2.ChangeDateTime).FirstOrDefault() ?? new FrameValueChange()
               let lastValue3 = cfvc.FrameVariationCategory.FrameVariationCategoryValueChanges.Where(fvcvc => fvcvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fvcvc2 => fvcvc2.ChangeDateTime).FirstOrDefault() ?? new FrameVariationCategoryValueChange()
               let lastValue4 = cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Any(fvm => (fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime).OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleValueChange()).IsActive == false)
               where lastValue.IsActive == true
               orderby cfvc.FrameVariationCategory.FrameVariation.Frame.Name, cfvc.FrameVariationCategory.Category.Name, cfvc.FrameVariationCategory.FrameVariation.Name
               select new
               { cfvc, lastValue, lastValue1, lastValue2, lastValue3}).ToList();

And then do the rest of your manipulation from there. If the result set is small, this might be more efficient anyway, and certainly easier on your db. If the result set is small, doing this will have very little performance cost.

If you have a bored, underworked db and there's a large result set and the machine where this code is running is strained, then you might need to keep the workload on the db.

Keep in mind that just breaking up into several steps the building of one massive expression tree to be run against IQuerable will not do you any good. That last variable will be as complicated (under the hood) as yours is now, and the compiler will still choke. The bottom line is you need to run .ToList() earlier in the life of this manipulation. A series of LINQ-to-object queries against IEnumerable won't be difficult for the background compiler to handle.

Patrick Karcher
1) The result set will eventualy be very large. 2) What you suggested (.ToList() earliler) will never ever be easier for the db, because inorder for me to get OwnCost I would now have to make multiple roundtrips to the db. What dthorpe and JaredPar suggested is better, because that only splits the sql statement in half not it's execution.
Marko
3) I tried your method of having ToList earlier, like you showed it, and then calculating OwnCost on top of that, but VS still choked. Maybe what you say is true, that queries against IEnumerable won't be difficult for the compiler, but the OwnCost expression tree is still to complex for VS. Check out my comment that I left under dthorpe answer - it seems certain that I have to split the query or OwnCost into a seperate file alltogether, but how to accomplish that? I can't pass around the context variable since it is in the using statement, I cant return anonymous types from a method either...
Marko
+1  A: 

VS is probably choking on this because it's such a large, complex single statement.

Since the only linkages between the OwnCost and the LINQ context are the reference to cfvc and lastValue4, it seems to me that you could calculate OwnCost in a separate step after the initial LINQ query statement. Store the lastValue4 in the anonymous type constructed by the LINQ statement, remove OwnCost, and remove .ToList() from the end. You don't need to store the cfvc value since the only thing you're using it for is to access .FrameVariationCategory, which you've already captured in the 2nd field of the anonymous type.

In a separate statement, select from the CustFVC resultset to construct the OwnCost for each item to produce a new result set that contains all the data bits you're looking for. Call .ToList() on that second result set. This should produce equivalent results to the monster statement in similar time.

If this is a large result set, be careful about iterating over the data multiple times. If you use a foreach to calculate OwnCost for each item in the original result set, you will be running through the data twice - twice as much work as the single monster LINQ query.

If you use a LINQ query for the second operation, it shouldn't cause any additional passes over the data beyond what you already have - LINQ is lazy evaluated, so the next row is not actually retrieved until asked for. ToList() forces all rows to be retrieved. A foreach loop forces all rows to be retrieved. LINQ query using a LINQ query as an input does not iterate any rows of the input result set, it just piles on more conditions to be evaluated when somebody eventually asks for the next row of the 2nd result set.

dthorpe
I perfectly understand what you are saying and a very good idea in itself, but as it turns out even if I split OwnCost into a second statement, VS still chokes. The only option seems to be to get the entire query or OwnCost into a seperate file. I don't mind if VS chockes when I modify this one query when that "special" file is open, but I would like to work normally with the rest of my project... The only question now is what are my options or how to accomplish this?
Marko
+2  A: 

My first instinct as that you're trying to make LINQ to SQL do the work of a stored procedure. But that may be incorrect; it's pretty difficult to tell if it would even be possible for a stored procedure to do this.

My second instinct is that it should be possible to split the OwnCost calculation into a function, so that this query just contains

OwnCost = cfvc.Select(CalculateOwnCost)

My third instinct, on seeing that the calculation includes a WindowExcel object, is to flee screaming, but I'm going to take a couple of deep breaths and ask, are you in fact interoperating with Excel in the context of this query, and might that possibly be a source of problems?

Edit

To break out the OwnCost calculation into its own function, do something like this:

public decimal CalculateOwnCost(CustomerFrameVariationCategory cvfc)
{
   return cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Sum(fvm => // sum all frame variation modules
                 (lastValue4 ? 0 : fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime) // if module not active then 0
                 .OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault().Porolone) + // otherwise get Porolone
                  fvm.FrameModule.FrameModuleComponents.Sum(fmc => // add to Porolone sum of all module components
                  (fmc.Article.ArticleDetails.Any() ? fmc.Article.ArticleDetails.Sum(ad => // if any article details then use A*L*W*T instead of Amount
                  WindowExcel.MultiplyArticleDetailValues(ad.ArticleDetailValueChanges.Where(advc => advc.ChangeDateTime <= this.SelectedDateTime)
                 .OrderByDescending(advc2 => advc2.ChangeDateTime).FirstOrDefault() ?? new ArticleDetailValueChange())) : 
                  WindowExcel.GetModuleComponentAmount(fmc.FrameModuleComponentValueChanges.Where(fmcvc => fmcvc.ChangeDateTime <= this.SelectedDateTime) // no details = get amount
                 .OrderByDescending(fmcvc2 => fmcvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleComponentValueChange())) * // times article values
                  WindowExcel.MultiplyArticleValues(fmc.Article.ArticleValueChanges.Where(avc => avc.ChangeDateTime <= this.SelectedDateTime)
                  .OrderByDescending(avc2 => avc2.ChangeDateTime).FirstOrDefault() ?? new ArticleValueChange()))),
              Cubes = cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Sum(fvm => (fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= this.SelectedDateTime && fmvc.IsActive == true).OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleValueChange()).Cubes)
}

That assumes that CustomerFrameVariationCategories is a colleciton of CustomerFrameVariationCategory objects, and that OwnCost is a decimal.

Once you do this, your original query can just include the Select that I showed above - you can also write it as

OwnCost = cfvc.Select(x => CalculateOwnCost(x))

if it makes you more comfortable (me, I've gotten scolded by Resharper enough on this point that I've come to accept it, but it's a matter of taste).

There's no reason you can't further decompose some of the intermediate expressions in that query into their own functions. A lambda function is just a function, after all.

Robert Rossney
We'll your first instinct is debatable, because what is a stored procedure? - a method like any other. Though, if LINQ can do it, so then must a stored procedure be able to...Your second instinct = how would I go about creating such a function? Examples or links please! Third instinct = you are correct, I am outputing the result to Excel. But the problem is not in the execution of the code, that works fine (at the moment), the problem is Visual Studio can't handle the expression tree of OwnCost, so I can't write code normally in that file...
Marko
+1  A: 

I may be wrong here and it's just a guess, but have you tried splitting your class (and essentially your file) with the partial keyword?

redoced
I have thought about it, but haven't tried it. The reason being that if I go down that route and move the entire query into the second file with a partial class, obviouslly put it in a method, I have to create it's own DataContext as well (double traffic for db) since I can't pass it around. I don't need object tracking so in general I could destroy the using statement that houses the context and manually write a try-finally block, then I could ref the context and thus get this scenario to work. But this is not elegant at all, more of a last resort technique...
Marko
... I'm looking more for a linq extension method (not sure if this terminology applies here) or "something" like that. Best scenario for me would be to move the OwnCost expression tree elsewehere, yet keep the DataContext intact in it's current form. I'll keep your answer in mind, because currently it's the only viable option, not the best, but unless someone has a better idea... (forgot to add to the previous comment, I'd also have to manually create a wrapper class for the query result, because anonymous classes can't be returned. That's another big minus.)
Marko
A: 

Hah, I found a solution myself :)

Roberts instinct about a LINQ function got me googling. The results were not relevant to the matter at hand, but the little code I did stumble upon, got me thinking about a brute force attack method. Using redoced's idea of a partial class I finally wrote this piece of code in a seperate .cs file:

public partial class WindowExcel
{
    private static decimal GetOwnCost(CustomerFrameVariationCategory cfvc, bool frameModuleAnyNonActive, DateTime selectedDateTime)
    {
        return cfvc.FrameVariationCategory.FrameVariation.FrameVariationModules.Sum(fvm => // sum all frame variation modules
            (frameModuleAnyNonActive ? 0 : fvm.FrameModule.FrameModuleValueChanges.Where(fmvc => fmvc.ChangeDateTime <= selectedDateTime) // if module not active then 0
            .OrderByDescending(fmvc2 => fmvc2.ChangeDateTime).FirstOrDefault().Porolone) + // otherwise get Porolone
            fvm.FrameModule.FrameModuleComponents.Sum(fmc => // add to Porolone sum of all module components
                (fmc.Article.ArticleDetails.Any() ? fmc.Article.ArticleDetails.Sum(ad => // if any article details then use A*L*W*T instead of Amount
                    WindowExcel.MultiplyArticleDetailValues(ad.ArticleDetailValueChanges.Where(advc => advc.ChangeDateTime <= selectedDateTime)
                    .OrderByDescending(advc2 => advc2.ChangeDateTime).FirstOrDefault() ?? new ArticleDetailValueChange())) :
                    WindowExcel.GetModuleComponentAmount(fmc.FrameModuleComponentValueChanges.Where(fmcvc => fmcvc.ChangeDateTime <= selectedDateTime) // no details = get amount
                    .OrderByDescending(fmcvc2 => fmcvc2.ChangeDateTime).FirstOrDefault() ?? new FrameModuleComponentValueChange())) * // times article values
                    WindowExcel.MultiplyArticleValues(fmc.Article.ArticleValueChanges.Where(avc => avc.ChangeDateTime <= selectedDateTime)
                    .OrderByDescending(avc2 => avc2.ChangeDateTime).FirstOrDefault() ?? new ArticleValueChange())));
    }
}

And in my gigantic LINQ query I rewrote OwnCost as such:

OwnCost = WindowExcel.GetOwnCost(cfvc, lastValue4, this.SelectedDateTime)

Editing the GetOwnCost method is still painfully slow, as was excepted, but at least the rest of my project is now usable. I'm not sure what this brute force seperation does to performance. The fact that I can't ref the CustomerFrameVariationCategory and that the OwnCost expression tree is inside of a method not in a LINQ query itself, raises questions. Guess I'll have to profile it at some point, but that's a hole other issue.

Now to the delicate issue of what to mark as the answer. Though I do appreciate all the input, none of the answers so far were correct (no concrete solution), thus I'll have to mark my own post as the answer. But I will vote for redoced's and Robert's answers for pointing me in the right direction.

I would appreciate, if anyone can comment about possible code execution performance impacts for my solution vs the original code.

PS! Writing this in Internet Explorer 8 is again painfully slow because of the constant CPU hogging (has something to do with coloring the code). So it's not only a VS issue....

Edit:

It seems Robert has managed to post the exact same solution I came up with. Would have probably got my answer posted earlier if not for the constant CPU hogging...

In all fairness I marked Robert's post as the answer :)

Marko