views: 2004
answers: 17

I've been told that since .NET LINQ is so slow we shouldn't use it, and I was wondering whether anyone else has come to the same conclusion. An example:

Took 1443ms to do 1000000000 compares non-LINQ.
Took 4944ms to do 1000000000 compares with LINQ.
(243% slower)

the non-LINQ code:

for (int i = 0; i < 10000; i++)
{
    foreach (MyLinqTestClass1 item in lst1) //100000 items in the list
    {
        if (item.Name == "9999")
        {
            isInGroup = true;
            break;
        }
    }
}

Took 1443ms to do 1000000000 compares non-LINQ.

LINQ code:

for (int i = 0; i < 10000; i++)  
    isInGroup = lst1.Cast<MyLinqTestClass1>().Any(item => item.Name == "9999");  

Took 4944ms to do 1000000000 compares with LINQ.

I guess it's possible to optimize the LINQ code, but the thought was that it's easy to end up with really slow LINQ code, and given that, it shouldn't be used. If LINQ is slow, then it would also follow that PLINQ is slow and NHibernate LINQ would be slow, so no kind of LINQ statement should be used.

Has anyone else found LINQ so slow that they wished they had never used it, or are we making too general a conclusion based on benchmarks like this?

+10  A: 

Is LINQ a real-world bottleneck (either affecting the overall or perceived performance of the application)?

Will your application be performing this operation on 1,000,000,000+ records in the real-world? If so--then you might want to consider alternatives--if not then it's like saying "we can't buy this family sedan because it doesn't drive well at 180+ MPH".

If it's "just slow" then that's not a very good reason... by that reasoning you should be writing everything in asm/C/C++, and C# should be off the table for being "too slow".

STW
There isn't a place in the application that has a performance issue, but it's more the idea that there could be if, for whatever reason, the code were executed a billion times like the sample, and given that, it shouldn't be used.
What's the deal with the wishy-washiness? You're the one that knows what your code is doing and how it's doing it. If LINQ is slow at doing it, then don't use LINQ. If there's no performance issue, then it doesn't matter, does it?
mquander
@Fredrik, based on the first sentence in his question I'd say this isn't his decision and he's trying to find some valid arguments for using LINQ. Doesn't seem wishy washy at all...
Abe Miessler
@user455095: Does the code have a performance issue? If not, then everything in it is running fast enough, and it's not worthwhile to change something just because it's slow. If the code does have a performance issue, profile to see if the LINQ call is having a significant effect. If it is, test it both ways. If LINQ is more readable, faster to write, easier to get right, easier to maintain, or something like that, it's likely worth some performance hit.
David Thornley
There aren't performance issues, but if there were, instead of trying to tune each LINQ statement it was decided we should just not use LINQ, given that a standard foreach works faster, even though it's a few more lines of code, and then you don't have to worry about tuning LINQ code.
+1 for the image of a family sedan driving at 180 miles an hour.
Sean Vieira
My response would be that it will also take 4944ms to do 3426195426 compares with non-LINQ code so on that basis you shouldn't use non-LINQ code either ...
MikeJ-UK
@MikeJ-UK -- or that the *machine* is clearly too slow. Throw some hardware at it and overclock as needed to maintain a consistent throughput
STW
@user455095: I'm sorry to say that your arguments about some way-off future scenario that might cause some negligible performance issue hold no water. Listen to these people; they know what they are talking about. Eric Lippert works on the team that builds your C# compiler, in case you were not aware. Jon Skeet works for Google. Both of these gentlemen and others here are trying to tell you something. L I S T E N.
James Dunne
+2  A: 

Maybe LINQ is slow, but with LINQ I can parallelize my code very simply.

Like this:

lst1.Cast<MyLinqTestClass1>().AsParallel().Any(item => item.Name == "9999");

How would you parallelize a plain loop?
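For comparison, a hand-rolled parallel search without PLINQ might look something like this (my sketch, not the answerer's code; it assumes `lst1` is a `List<MyLinqTestClass1>` and glosses over the memory-visibility details a production version would need):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class MyLinqTestClass1 { public string Name { get; set; } }

static class ManualParallelSearch
{
    // Hand-rolled equivalent of lst1.AsParallel().Any(...): split the list
    // into one chunk per core and search each chunk on its own task.
    public static bool AnyNameEquals(List<MyLinqTestClass1> list, string name)
    {
        int chunks = Environment.ProcessorCount;
        int chunkSize = (list.Count + chunks - 1) / chunks;
        bool found = false; // NOTE: a real version needs proper synchronization
        var tasks = new List<Task>();
        for (int c = 0; c < chunks; c++)
        {
            int start = c * chunkSize;
            int end = Math.Min(start + chunkSize, list.Count);
            tasks.Add(Task.Factory.StartNew(() =>
            {
                for (int i = start; i < end && !found; i++)
                    if (list[i].Name == name)
                        found = true;
            }));
        }
        Task.WaitAll(tasks.ToArray());
        return found;
    }
}
```

The PLINQ one-liner above does all of this (plus work-stealing and early termination) in a single call, which is the answerer's point.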

gandjustas
Actually the decision was to not allow parallel LINQ either, since it's still LINQ and could suffer from the same performance issues.
@user455095 - Except in many circumstances parallel LINQ could actually greatly **speed up** the processing compared to non-LINQ. Sure, you can implement threading yourself and end up with 20 lines of code instead of one.
Nelson
@user455095: Linear execution of fast code is almost certainly **slower** than parallel execution of *slightly* slower code
ck
I wouldn't agree entirely with @ck, as there are times when parallel execution gains little (when something forces it to not really be parallel), but in general it's true. So, @user455095 you aren't allowed to use a more efficient method because one part of it is less efficient than one part of the less efficient method? Sorry, that's not optimisation, that's just superstition.
Jon Hanna
+83  A: 

Why are you using Cast<T>()? You haven't given us enough code to really judge the benchmark, basically.

Yes, you can use LINQ to write slow code. Guess what? You can write slow non-LINQ code, too.

LINQ greatly aids expressiveness of code dealing with data... and it's not that hard to write code which performs well, so long as you take the time to understand LINQ to start with.

If anyone told me not to use LINQ (especially LINQ to Objects) for perceived reasons of speed I would laugh in their face. If they came up with a specific bottleneck and said, "We can make this faster by not using LINQ in this situation, and here's the evidence" then that's a very different matter.

Jon Skeet
Yes, I gave the reply that taking out the cast would help, but was told that it wasn't as slow, but still slow. Our application needs to be fast, so I guess the idea is that anything that could be slow should be avoided, but I'm not sure if that's a valid assumption.
@user455095: No, it's absolutely not a valid assumption. "Slow" is far from a precise term - and I very much doubt that this is a realistic benchmark to start with.
Jon Skeet
The only time I ran into a case where I removed LINQ for performance reasons was for a routine implementing AI in a game. This particular method was executed extremely frequently within a deep inner loop. The main impact I found wasn't actually LINQ, but rather the difference between indexing through an array directly versus indexing through an enumerator (my first attempt at improvement was using foreach directly, which was of less benefit than switching to a classic for loop). I only made this change because profiling identified that the code was spending 40% of its time here.
Dan Bryant
@user, is `lst1` an `ArrayList`, per chance? (Could explain the `Cast<>`)
Anthony Pegram
"If anyone told me not to use LINQ (especially LINQ to Objects) for perceived reasons of speed I would laugh in their face." I've actually laughed in a few faces for the same reason. Also, for when they say, "I don't see the value".
Richard Hein
"Our application need to be fast so I guess the idea is that anything that could be slow should be avoided" - that's the dumbest thing I've ever heard. Strings can be slow. Arrays can be slow. virtual methods can be slow, interfaces can be slow. Anything can be slow, if you use it wrong, but that doesn't mean you should avoid all those things *everywhere* in your code.
nikie
+1 for "If anyone told me not to use LINQ (especially LINQ to Objects) for perceived reasons of speed I would laugh in their face."
eglasius
@Dan - **DONT SAY THAT!!!** The next question will be "I stopped using `foreach` loops because they're slower than `for` loops, I need a `for` loop with basic transactions (break the loop if the collection changes)"
STW
@STW, though I haven't measured it, my guess is that the significant improvement was due to the fact that it was directly accessing an array, which is given special treatment in IL and likely jits to very efficient code when indexing. I still had to do much more aggressive search tree pruning (use a better algorithm), as the sheer number of iterations was the biggest problem. There's little point in improving the efficiency at which you can do unnecessary work.
Dan Bryant
+1 for "If anyone told me not to use LINQ (especially LINQ to Objects) for perceived reasons of speed I would laugh in their face". I wish you worked in my company Jon Skeet, I really do! :-)
DoctaJonez
I'm still laughing at that quote now! I'd give another +1 if I could :-)
DoctaJonez
A: 

As you demonstrated, it is possible to write non-LINQ code that performs better than LINQ code. But the reverse is also possible. Given the maintenance advantage that LINQ can provide, you might consider defaulting to LINQ as it is unlikely you will run into any performance bottlenecks that can be attributed to LINQ.

That said, there are some scenarios where LINQ just won't work. For example, if you are importing a ton of data, you might find that executing individual inserts is slower than sending the data to SQL Server in batches of XML. In this example, the issue isn't whether a LINQ insert is faster or slower than a non-LINQ insert; rather, it's simply not prudent to execute individual SQL inserts for a bulk data import at all.
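For the bulk-import case, the usual alternative to row-by-row inserts (whether LINQ-generated or handwritten) is a bulk copy. A sketch using `SqlBulkCopy`; the table name and batch size here are placeholder assumptions:

```csharp
using System.Data;
using System.Data.SqlClient;

static class BulkImport
{
    // Stream a DataTable to SQL Server in one bulk operation
    // instead of issuing one INSERT statement per row.
    public static void Import(DataTable rows, string connectionString)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ImportTarget"; // placeholder name
            bulk.BatchSize = 5000;                          // rows per round-trip
            bulk.WriteToServer(rows);
        }
    }
}
```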

Mayo
+86  A: 

Should Linq be avoided because its slow?

No. It should be avoided if it is not fast enough. Slow and not fast enough are not at all the same thing!

Slow is irrelevant to your customers, your management and your stakeholders. Not fast enough is extremely relevant. Never measure how fast something is; that tells you nothing that you can use to base a business decision on. Measure how close to being acceptable to the customer it is. If it is acceptable then stop spending money on making it faster; it's already good enough.

Performance optimization is expensive. Writing code so that it can be read and maintained by others is expensive. Those goals are frequently in opposition to each other, so in order to spend your stakeholder's money responsibly you've got to ensure that you're only spending valuable time and effort doing performance optimizations on things that are not fast enough.

You've found an artificial, unrealistic benchmark situation where LINQ code is slower than some other way of writing the code. I assure you that your customers care not a bit about the speed of your unrealistic benchmark. They only care if the program you're shipping to them is too slow for them. And I assure you, your management cares not a bit about that (if they're competent); they care about how much money you're spending needlessly to make stuff that is fast enough unnoticeably faster, and making the code more expensive to read, understand, and maintain in the process.

Eric Lippert
It's more of an issue of "what if", and trying to avoid things that could possibly cause problems in the future. Given that LINQ is another layer whose behavior we don't know, it could cause performance problems we'd have to deal with later, and it's better to avoid them now than pay the cost and time of dealing with them in the future. I think it's hard to predict the future, but I guess some can.
+1 for sage wisdom, and a good one-liner to use as a proverbial slap for anyone insisting on marginal performance taking priority over code sanity
STW
@user455095, it's not about predicting the future. It's about dealing with it when it comes. You can write the app faster with linq. If it runs fast enough, then who cares if linq is a touch slower than an esoteric, convoluted solution. If in the future, you deem a linq query to be slow, you optimize that linq query, or change that query to something faster. You don't remove linq from the entire app because it was slower in one spot.
Chad
@user455095: So what you're saying is that the devil you know is better than the devil you don't. Which is fine, *if you like making business decisions on the basis of old sayings*. I think it is generally a better idea to make business decisions based on *informed opinions* derived from *empirical measurements*. You're going in the right direction here by making empirical measurements, so that's good. But you're comparing that measurement against the wrong thing. The things you should be measuring are customer satisfaction and shareholder cost, not microseconds per comparison.
Eric Lippert
Well, we have been in development for over 3 years and it will probably be another year before customers see it. I guess it's more a question of: given it's a huge product with a massive amount of code, we need to try and eliminate anything that could cause performance problems, since if one does become an issue it could be costly and time-consuming to find. Again, it comes down to predicting and controlling the future.
@user: How about instead of "eliminating anything that could cause performance problems", you start with code that's correct and easy to maintain, and then eliminating *specific* performance problems as they come (which should be very easy if your software is well-architected and well-designed)? I hate to be this blunt, but you're basically announcing to the world here that you don't know how to do your job. Part of your "problem" seems to be that the code is so "massive" - well, guess what, it wouldn't be so massive if you were using LINQ effectively.
Aaronaught
@user, I guarantee you there will be examples when LINQ will perform faster than code you would write yourself. Not because you couldn't write code that's faster than LINQ, it's just that you wouldn't be given the time to do so. You're spending all your time trying to meet the specification, integrate business rules that often contradict one another, find and squash bugs, etc. Meanwhile, Microsoft has a team of people optimizing the heck out of a `Join` operation. With LINQ, you're going to write more expressive code and you will write it faster. And it will perform well, too.
Anthony Pegram
+1 for clear, brief and practical explanation.
tia
@user455095: It sounds like you are having some anxiety about future performance. The way to mitigate that anxiety is to institute a policy of nightly automated performance tests that measure the performance of the *real* product against *real* customer metrics. That way every single day you can track the trend of your performance numbers. If you see that there is a sudden spike in bad performance then you can examine all the change logs for that day and figure out what change caused the bad performance. The expense of fixing a perf problem rises the longer it is there without you knowing!
Eric Lippert
Yeah, I actually forgot about that: we do have performance tests, and if any of those test times change by a certain percentage an alert is raised, so if any performance problems do come up they would be caught there. That's something to think about for anyone with these types of performance worries.
@user455095: Then why all the worry? Try using LINQ today and *tomorrow you will know whether it caused an unacceptably high performance degradation*. You say that the uncertainty of using LINQ is too high; well, you're not going to become more certain until you spend some time using it and learn what works for you and what doesn't. If your question really is "should LINQ be avoided because my team doesn't understand how to use it effectively?" then that's a different question than the one you asked. My answer to that question would be "no; rather, learn how to use it effectively!"
Eric Lippert
Unfortunately, if your team lead says you can't use it then it's kind of hard, but I'll pass him this thread at any rate.
+3  A: 

While premature pessimization is (imho) as bad as premature optimization, you shouldn't rule out an entire technology based on absolute speed without taking usage context into consideration. Yes, if you're doing some really heavy number-crunching and this is a bottleneck, LINQ could be problematic - profile it.

An argument you could use in favour of LINQ is that, while you can probably outperform it with handwritten code, the LINQ version could likely be clearer and easier to maintain - plus, there's the advantage of PLINQ compared to complex manual parallelization.

snemarch
+1 for "premature pessimization"
Lance Roberts
+3  A: 

The problem with this sort of comparison is that it's meaningless in the abstract.

One could beat either of those if one got to start by hashing the MyLinqTestClass1 objects by their Name property. In between the two would be sorting them by Name once and then doing a binary search. Indeed, we don't need to store the MyLinqTestClass1 objects for that; we just need to store the names.

Memory size a problem? Maybe store the names in a DAWG structure, combining suffixes, and then use that for this check?

Does the extra overhead of setting these data structures up make any sense? It's impossible to tell from the benchmark alone.
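To sketch the point about data structures (assuming the names have already been pulled out of the `MyLinqTestClass1` objects):

```csharp
using System.Collections.Generic;
using System.Linq;

static class LookupStrategies
{
    // O(n) per query: what both versions in the question do.
    public static bool LinearScan(List<string> names, string target)
    {
        return names.Any(n => n == target);
    }

    // O(log n) per query, after a one-time O(n log n) sort of the list.
    public static bool SortedSearch(List<string> sortedNames, string target)
    {
        return sortedNames.BinarySearch(target) >= 0;
    }

    // O(1) per query, after a one-time O(n) build of the set.
    public static bool HashLookup(HashSet<string> names, string target)
    {
        return names.Contains(target);
    }
}
```

Whether the one-time cost of sorting or building the set pays off depends entirely on how many lookups follow it; that is exactly what the benchmark doesn't tell us.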

A further matter is a different problem with the concept of LINQ, namely its name. It's great for marketing purposes for MS to be able to say "here's a bunch of cool new stuff that works together", but less good when it comes to people combining things that they should be pulling apart for this sort of analysis. You've got a call to Any, which basically implements the filter-on-enumerable pattern common in .NET 2.0 days (and not unknown in .NET 1.1, though being more awkward to write it was only used where its efficiency benefits in certain cases really mattered); you've got lambda expressions; and you've got query trees, all bunged together in one concept. Which is the slow one?

I'd bet the answer here is the lambda and not the use of Any, but I wouldn't bet a large amount (e.g. the success of a project), I'd test and be sure. Meanwhile, the way lambda expressions work with IQueryable can make for particularly efficient code that it would be extremely difficult to write with equivalent efficiency without the use of lambdas.

Do we not get to be efficient when LINQ is good at efficiency because it failed an artificial benchmark? I don't think so.

Use LINQ where it makes sense.

In bottleneck conditions, move away from (or to) LINQ as an optimisation, regardless of whether it seems appropriate or inappropriate. Don't write hard-to-understand code on the first go; you'll just make real optimisation harder.

Jon Hanna
It reminds me of Juval Lowy's "Every class as a WCF service" talk. He goes on a well-founded tirade about how using this type of scenario (using a raw-loop) to compare performance is not only meaningless, but often produces wrong results when compared to real-world measurements.
STW
A: 

There are a thousand better reasons to avoid LINQ than speed.

The following quotes from a discussion on LINQ name a few of them:

QUOTE1

"For instance this works:

var a = new { x = 1, y = 2 }; a = new { x = 1, y = 3 };

But this does not work:

var a = new { x = 1, y = 2 }; a = new { x = 1, y = 2147483649 };

It returns : Error 1 Cannot implicitly convert type 'AnonymousType#1' to 'AnonymousType#2'

But this works:

var a = new { x = 1, y = 2147483648 }; a = new { x = 1, y = 2147483649 };

When you compile:

var a = new { x = 1, y = 2 };

The type of the x and y components is arbitrarily declared as a 32 bit signed integer, and it is one of the many integer types the platform has, without anything special.

But there is more. For instance this works:

double x = 1.0; x = 1;

But this does not work: var a = new { x = 1.0, y = 0 }; a = new { x = 1, y = 0 }; The numeric conversion rules are not applicable to this kind of types. As you can see, elegance is in every detail."

QUOTE2

"It appears, then, that 'AnonymousType#1' and 'AnonymousType#2' are not synonymous--they name distinct types. And as { x = 1, y = 2 } and { y = 2, x = 1 } are expressions of those two types, respectively, not only do they denote distinct values, but also values of distinct types.

So, I was right to be paranoid. Now my paranoia extends even further and I have to ask what LinQ makes of the following comparison:

new { x = 1, y = 2 } == new { x = 1, y = 2 }

The result is false because this is a pointer comparison.

But the result of:

(new { x = 1, y = 2 }).Equals(new { x = 1, y = 2 })

Is true.

And the result of:

(new { x = 1, y = 2 }).Equals(new { y = 2, x = 1 })

and

(new { x = 1, y = 2 }).Equals(new { a = 1, b = 2 })

Is false."

QUOTE3

"updates are record oriented :-O

This, I agree, is problematic, and derives from LINQ's sequence-oriented nature.

This is a show stopper for me. If I have to use SQL for my updates anyway, why bother with LINQ?

The optimization in LINQ to Objects is nonexistent.

There is no algebraic optimization nor automatic expression rewriting. Many people don't want to use LINQ to Objects because they lose a lot of performance. Queries are executed exactly as you write them."

Erwin Smout
What does this have to do with LINQ? Sure, anonymous types are used in LINQ, but they are not LINQ.
Nelson
Aaronaught
Your reasons are only valid if you don't know what you are doing. Experienced programmers don't / won't experience these problems as they will know how their types work.
ck
Apparently it's not possible to get negative reputation. Does SO have an "overdraft" feature so all these negative votes actually count in the future? :)
Nelson
+1  A: 

To me, this sounds like you're working on a contract and the employer either doesn't understand LINQ, or doesn't understand the performance bottlenecks of the system. If you're writing an application with a GUI, the minor performance impact of using LINQ is negligible. In a typical GUI/web app, in-memory calls make up less than 1% of all wait time. You, or rather your employer, are trying to optimize that 1%. Is that really beneficial?

However, if you are writing an application that is scientific or heavily math oriented, with very little disk or database access, then I'd agree that LINQ is not the way to go.

BTW, the cast is not needed. The following is functionally equivalent to your first test:

    for (int i = 0; i < 10000; i++)
        isInGroup = lst1.Any(item => item.Name == "9999");

When I ran this using a test list containing 10,000 MyLinqTestClass1 objects, the original ran in 2.79 seconds and the revised version in 3.43 seconds. Saving 30% on operations that likely take up less than 1% of CPU time is not a good use of your time.

Jess
It actually is scientific and heavy on math, so given that, maybe it should be avoided in those situations, but should it be avoided everywhere? I guess you could make the case that you may need a billion-iteration loop anywhere.
@user455095 -- you mean, the case could be made that *taking an extra 2.5 seconds to process 1 billion items would be unacceptable*. If they're concerned with this level of misinformation then you might want to pull the rug out from under them; take a look at using F#--if this is scientific/mathematical then F# might offer significant advantages
STW
@user455095 - I agree with STW. If performance is that critical, it would benefit them to ditch C# altogether for the heavy math routines and use the GPU or at least straight C.
Jess
We actually have a lot of our heavy scientific and math routines in C++ and even Fortran.
@user455095 - LINQ sometimes makes it easier to read code, and sometimes makes it faster to write. If the client is willing to pay more for you to write your code, then by all means drop LINQ. In my experience, short LINQ statements are a huge benefit, while long/complex ones make it harder to understand code.
Jess
A: 

Yes, you're right. It's easy to write slow code in LINQ. The others are right, too: it's easy to write slow code in C# without LINQ.

I wrote the same loop as you in C and it ran some number of milliseconds faster. The conclusion I draw from this is that C# itself is slow.

As with your LINQ->loop expansion, in C it will take more than 5 times as many lines of code to do the same thing, making it slower to write, harder to read, more likely to have bugs, and tougher to find and fix them, but if saving a few milliseconds for every billion iterations is important, that's often what it takes.

Ken
A: 

I would rather say you should avoid trying too hard to write the most efficient code, unless it is mandatory.

tia
+42  A: 

Maybe I've missed something, but I'm pretty sure your benchmarks are off.

I tested with the following methods:

  • The Any extension method ("LINQ")
  • A simple foreach loop (your "optimized" method)
  • Using the ICollection.Contains method
  • The Any extension method using an optimized data structure (HashSet<T>)

Here is the test code:

class Program
{
    static void Main(string[] args)
    {
        var names = Enumerable.Range(1, 10000).Select(i => i.ToString()).ToList();
        var namesHash = new HashSet<string>(names);
        string testName = "9999";
        for (int i = 0; i < 10; i++)
        {
            Profiler.ReportRunningTimes(new Dictionary<string, Action>() 
            {
                { "Enumerable.Any", () => ExecuteContains(names, testName, ContainsAny) },
                { "ICollection.Contains", () => ExecuteContains(names, testName, ContainsCollection) },
                { "Foreach Loop", () => ExecuteContains(names, testName, ContainsLoop) },
                { "HashSet", () => ExecuteContains(namesHash, testName, ContainsCollection) }
            },
            (s, ts) => Console.WriteLine("{0, 20}: {1}", s, ts), 10000);
            Console.WriteLine();
        }
        Console.ReadLine();
    }

    static bool ContainsAny(ICollection<string> names, string name)
    {
        return names.Any(s => s == name);
    }

    static bool ContainsCollection(ICollection<string> names, string name)
    {
        return names.Contains(name);
    }

    static bool ContainsLoop(ICollection<string> names, string name)
    {
        foreach (var currentName in names)
        {
            if (currentName == name)
                return true;
        }
        return false;
    }

    static void ExecuteContains(ICollection<string> names, string name,
        Func<ICollection<string>, string, bool> containsFunc)
    {
        if (containsFunc(names, name))
            Trace.WriteLine("Found element in list.");
    }
}

Don't worry about the internals of the Profiler class. It just runs the Action in a loop and uses a Stopwatch to time it. It also makes sure to call GC.Collect() before each test to eliminate as much noise as possible.
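The `Profiler` class isn't shown in the answer; a minimal stand-in matching the call site above might look like this (my sketch, not the answerer's actual implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class Profiler
{
    // Runs each named action 'iterations' times and reports the elapsed
    // time via the supplied callback, collecting garbage first to cut noise.
    public static void ReportRunningTimes(Dictionary<string, Action> tests,
        Action<string, TimeSpan> report, int iterations)
    {
        foreach (var test in tests)
        {
            GC.Collect();
            GC.WaitForPendingFinalizers();
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
                test.Value();
            sw.Stop();
            report(test.Key, sw.Elapsed);
        }
    }
}
```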

Here were the results:

      Enumerable.Any: 00:00:03.4228475
ICollection.Contains: 00:00:01.5884240
        Foreach Loop: 00:00:03.0360391
             HashSet: 00:00:00.0016518

      Enumerable.Any: 00:00:03.4037930
ICollection.Contains: 00:00:01.5918984
        Foreach Loop: 00:00:03.0306881
             HashSet: 00:00:00.0010133

      Enumerable.Any: 00:00:03.4148203
ICollection.Contains: 00:00:01.5855388
        Foreach Loop: 00:00:03.0279685
             HashSet: 00:00:00.0010481

      Enumerable.Any: 00:00:03.4101247
ICollection.Contains: 00:00:01.5842384
        Foreach Loop: 00:00:03.0234608
             HashSet: 00:00:00.0010258

      Enumerable.Any: 00:00:03.4018359
ICollection.Contains: 00:00:01.5902487
        Foreach Loop: 00:00:03.0312421
             HashSet: 00:00:00.0010222

The data is very consistent and tells the following story:

  • Naïvely using the Any extension method is about 9% slower than naïvely using a foreach loop.

  • Using the most appropriate method (ICollection<string>.Contains) with an unoptimized data structure (List<string>) is approximately 50% faster than naïvely using a foreach loop.

  • Using an optimized data structure (HashSet<string>) completely blows any of the other methods out of the water in performance terms.

I have no idea where you got 243% from. My guess is it has something to do with all that casting. If you're using an ArrayList then not only are you using an unoptimized data structure, you're using a largely obsolete data structure.
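For illustration, here is the difference the cast makes (a sketch; the OP's actual declaration of `lst1` isn't shown):

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

class MyLinqTestClass1 { public string Name { get; set; } }

static class CastDemo
{
    // Non-generic ArrayList: every element must be cast
    // (and runtime type-checked) before the comparison.
    public static bool SearchArrayList(ArrayList lst1)
    {
        return lst1.Cast<MyLinqTestClass1>().Any(item => item.Name == "9999");
    }

    // Generic List<T>: statically typed, no Cast<T>() needed at all.
    public static bool SearchGenericList(List<MyLinqTestClass1> lst1)
    {
        return lst1.Any(item => item.Name == "9999");
    }
}
```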

I can predict what comes next. "Yeah, I know you can optimize it better, but this was just an example to compare the performance of LINQ vs. non-LINQ."

Yeah, but if you couldn't even be thorough in your example, how can you possibly expect to be this thorough in production code?

The bottom line is this:

How you architect and design your software is exponentially more important than what specific tools you use and when.

If you run into performance bottlenecks - which is every bit as likely to happen with LINQ vs. without - then solve them. Eric's suggestion of automated performance tests is an excellent one; that will help you to identify the problems early so that you can solve them properly - not by shunning an amazing tool that makes you 80% more productive but happens to incur a < 10% performance penalty, but by actually investigating the issue and coming up with a real solution that can boost your performance by a factor of 2, or 10, or 100 or more.

Creating high-performance applications is not about using the right libraries. It's about profiling, making good design choices, and writing good code.

Aaronaught
I forgot to mention, this was compiled with the `TRACE` flag **off**, so no overhead is being incurred there for this test.
Aaronaught
+1 for actual measurements, including a data structure comparison.
Daniel Pryden
+2  A: 

Here's an interesting observation, since you mention NHibernate being slow as a consequence of LINQ being slow. If you're doing LINQ to SQL (or the NHibernate equivalent), then your LINQ code translates to an EXISTS query on the SQL server, whereas your loop code must first fetch all rows and then iterate over them. Now, you could easily write such a test so that the loop code reads all the data once (a single DB lookup) for all 10K runs while the LINQ code actually performs 10K SQL queries. That's probably going to show a big speed advantage for the loop version which doesn't exist in reality. In reality a single EXISTS query is going to outperform the table scan and loop every time, even if you don't have an index on the column being queried (which you probably would if this query is done very often).

I'm not saying that this is the case with your test (we don't have enough code to see), but it could be. It could also be that there really is a performance difference with LINQ to Objects, but that it may not translate to LINQ to SQL at all. You need to know what you're measuring and how applicable it is to your real-world needs.
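To make that concrete, a sketch (the `Item` class and the queryable source are hypothetical stand-ins; the OP's actual mapping isn't shown):

```csharp
using System.Linq;

// Hypothetical mapped entity standing in for a LINQ to SQL table row.
class Item { public string Name { get; set; } }

static class ExistsDemo
{
    // A LINQ provider translates this into a single EXISTS query;
    // only a boolean comes back over the wire.
    public static bool ExistsOnServer(IQueryable<Item> items)
    {
        return items.Any(item => item.Name == "9999");
    }

    // The "optimized" loop version must first materialize every row
    // on the client before it can scan them.
    public static bool ExistsOnClient(IQueryable<Item> items)
    {
        foreach (var item in items.ToList()) // full table fetch
        {
            if (item.Name == "9999")
                return true;
        }
        return false;
    }
}
```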

tvanfosson
+1, good catch on an important point not addressed in other answers. OP doesn't understand that varying implementations will have varying performance characteristics and considerations.
David B
+1  A: 

"I've had been told [by whom?] that since .net linq is so slow [for what?] we shouldn't use it"

In my experience, basing decisions such as what technique, library or language to use solely on what someone has once told you is a bad idea.

First of all, does the information come from a source you trust? If not, you might be making a huge mistake trusting this (perhaps unknown) person to make your design decisions. Secondly, is this information still relevant today? But okay, based on your simple and not very realistic benchmark, you've concluded that LINQ is slower than manually performing the same operation. The natural question to ask yourself is this: is this code performance critical? Will the performance of this code be limited by other factors than the execution speed of my LINQ query -- think database queries, waiting on I/O, etc?

Here's how I like to work:

  1. Identify the problem to be solved, and write the simplest feature-complete solution given the requirements and limitations you already know of
  2. Determine whether your implementation actually fulfills the requirements (is it fast enough? Is the resource consumption kept at an acceptable level?).
  3. If it does, you're done. If not, look for ways to optimize and refine your solution until it passes the test at #2. This is where you may need to consider giving up on something because it's too slow. Maybe. Chances are, though, that the bottleneck isn't where you expected it to be at all.

To me, this simple method serves a single purpose: maximizing my productivity by minimizing the time I spend improving code that is already perfectly adequate.

Yes, the day might come when you find that your original solution doesn't cut it any more. Or it might not. If it does, deal with it then and there. I suggest you avoid wasting your time trying to solve hypothetical (future) problems.

Martin Törnwall
Well, first, it isn't a recommendation; it's more of a dictated rule. If you can use a foreach rather than a LINQ statement, then why not use it and avoid the risk of executing a LINQ statement 10,000 times, since foreach could be considered less complex (not my words)? I guess this thread is turning into a discussion of good programming practice, and that tends to be subjective. If you get 10 programmers in a room and ask their opinion, what would you get? 10 different opinions... and a lot of arguing. I'm starting to sound a little cynical.
@user: *"If you can use a `foreach` rather than a LINQ statement then why not use [the `foreach` statement] and not take the risk [...]?"* - my reason not to use the `foreach` statement is that the risk of a performance problem is vanishingly small when weighed against the risk that a bug will eventually be introduced due to awkwardly-written imperative code. It's easy to write a fast program that doesn't work. It's also harder to fix the fast-but-wrong program than it is to speed up the slow-but-correct program.
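For what it's worth, a minimal side-by-side sketch of the two styles being debated (reusing the question's `MyLinqTestClass1`, with illustrative data): the imperative version needs a mutable flag and a manual `break`, while the declarative version states the condition directly.

```
using System;
using System.Collections.Generic;
using System.Linq;

class MyLinqTestClass1 { public string Name { get; set; } }

class Demo
{
    static void Main()
    {
        var lst1 = new List<MyLinqTestClass1>
        {
            new MyLinqTestClass1 { Name = "1" },
            new MyLinqTestClass1 { Name = "9999" }
        };

        // Imperative: mutable state plus a manual early exit.
        bool isInGroup = false;
        foreach (MyLinqTestClass1 item in lst1)
        {
            if (item.Name == "9999")
            {
                isInGroup = true;
                break;
            }
        }

        // Declarative: the condition is the whole statement,
        // and Any() still stops at the first match.
        bool isInGroup2 = lst1.Any(item => item.Name == "9999");

        Console.WriteLine(isInGroup == isInGroup2); // prints True
    }
}
```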
Aaronaught
@user455095: I'm not aware of a "dictated rule" that says code should be written with execution speed as the primary point of consideration. There are certainly *specific* cases where execution speed should be favored over clarity and simplicity (think games, a chess engine, compression code, etc). In my experience, though, you'll greatly increase your chances of shipping a quality product by focusing on functionality rather than obsessing over speed.@Aaronaught: very well put -- you've nicely expressed the point I was trying to make.
Martin Törnwall
A: 

Given that LINQ is slow then it would also follow that PLINQ is slow and NHibernate LINQ would be slow so any kind on LINQ statement should not be used.

That's a completely different context. A difference of 1.4 vs. 5 seconds across a billion operations is irrelevant when you're talking about data access operations.

eglasius
A: 

Your test case is a little skewed. The `Any` operator will begin enumerating your results, return true on the first match it finds, and quit. Try this with simple lists of strings to see the effect. To answer your question about avoiding LINQ: you should really transition toward using LINQ. It makes code easier to read when writing complex queries, in addition to providing compile-time checking. Also, you don't need the `Cast` operator in your example.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

string compareMe = "Success";
string notEqual = "Not Success";

List<string> headOfList = new List<string>();
List<string> midOfList = new List<string>();
List<string> endOfList = new List<string>();

// Create a list of 999,999 non-matching items
List<string> masterList = new List<string>();
masterList.AddRange(Enumerable.Repeat(notEqual, 999999));

// Put the true case at the head of the list
headOfList.Add(compareMe);
headOfList.AddRange(masterList);

// Insert the true case in the middle of the list
midOfList.AddRange(masterList);
midOfList.Insert(masterList.Count / 2, compareMe);

// Append the true case at the tail of the list
endOfList.AddRange(masterList);
endOfList.Add(compareMe);


Stopwatch stopWatch = new Stopwatch();

stopWatch.Start();
headOfList.Any(p => p == compareMe);
Console.WriteLine(stopWatch.ElapsedMilliseconds); // Dump() if using LINQPad
stopWatch.Reset();

stopWatch.Start();
midOfList.Any(p => p == compareMe);
Console.WriteLine(stopWatch.ElapsedMilliseconds);
stopWatch.Reset();

stopWatch.Start();
endOfList.Any(p => p == compareMe);
Console.WriteLine(stopWatch.ElapsedMilliseconds);
stopWatch.Stop();
EC182
A: 

The type casting is of course going to slow your code down. If you care that much, at least use a strongly typed `IEnumerable<T>` for the comparison. I myself try to use LINQ wherever possible. It makes your code much more concise, and it's nice not to have to worry about the imperative details of your code. LINQ is a functional concept, which means you spell out what you want to happen, not how.
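As a minimal illustration of that point (the class name is borrowed from the question; the rest of the setup is hypothetical): `Cast<T>()` over a non-generic collection performs a runtime type check on each element it enumerates, while a strongly typed list needs no cast at all.

```
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

class MyLinqTestClass1 { public string Name { get; set; } }

class CastDemo
{
    static void Main()
    {
        // Non-generic list: every LINQ query needs Cast<T>(), which
        // type-checks each element as it is enumerated.
        ArrayList untyped = new ArrayList { new MyLinqTestClass1 { Name = "9999" } };
        bool a = untyped.Cast<MyLinqTestClass1>().Any(i => i.Name == "9999");

        // Strongly typed list: no cast, no per-element check.
        var typed = new List<MyLinqTestClass1> { new MyLinqTestClass1 { Name = "9999" } };
        bool b = typed.Any(i => i.Name == "9999");

        Console.WriteLine(a && b); // prints True
    }
}
```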

Antwan W. A-Dubb