views:

1675

answers:

6

Update - for those of a facetious frame of mind, you can assume that Aggregate still produces the normal result whatever function is passed to it, including in the case being optimized.

I wrote this program to build a long string of integers from 0 to 19999 separate by commas.

using System;
using System.Linq;
using System.Diagnostics;

namespace ConsoleApplication5
{
    class Program
    {
        static void Main(string[] args)
        {
            const int size = 20000;

            Stopwatch stopwatch = new Stopwatch();

            stopwatch.Start();
            Enumerable.Range(0, size).Select(n => n.ToString()).Aggregate((a, b) => a + ", " + b);
            stopwatch.Stop();

            Console.WriteLine(stopwatch.ElapsedMilliseconds + "ms");
        }
    }
}

When I run it, it says:

5116ms

Over five seconds, terrible. Of course it's because the whole string is being copied each time around the loop.

But what if make one very small change indicated by the comment?

using System;
using System.Linq;
using System.Diagnostics;

namespace ConsoleApplication5
{
    using MakeAggregateGoFaster;  // <---- inserted this

    class Program
    {
        static void Main(string[] args)
        {
            const int size = 20000;

            Stopwatch stopwatch = new Stopwatch();

            stopwatch.Start();
            Enumerable.Range(0, size).Select(n => n.ToString()).Aggregate((a, b) => a + ", " + b);
            stopwatch.Stop();

            Console.WriteLine(stopwatch.ElapsedMilliseconds + "ms");
        }
    }
}

Now when I run it, it says:

42ms

Over 100x faster. So my question is, what's in the MakeAggregateGoFaster namespace?

:)

Update 2: Wrote up my answer here.

+2  A: 

Well, that would depend entirely on what code is in the MageAggregateGoFaster namespace now wouldn't it?

This namespace is not part of the .NET runtime, so you've linked in some custom code.

Personally I would think that something that recognizes string concatenation or similar, and builds up a list, or similar, then allocates one big StringBuilder and uses Append.

A dirty solution would be:

namespace MakeAggregateGoFaster
{
    public static class Extensions
    {
        public static String Aggregate(this IEnumerable<String> source, Func<String, String, String> fn)
        {
            StringBuilder sb = new StringBuilder();
            foreach (String s in source)
            {
                if (sb.Length > 0)
                    sb.Append(", ");
                sb.Append(s);
            }

            return sb.ToString();
        }
    }
}

dirty because this code, while doing what you say you experience with your program, does not use the function delegate at all. It will, however, bring down the execution time from around 2800ms to 11ms on my computer, and still produce the same results.

Now, next time, perhaps you should ask a real question instead of just look how clever I am type of chest-beating?

Lasse V. Karlsen
But I have asked real questions here before. Do let me know if there's actually a rule on Stackoverflow against setting a puzzle or reporting something interesting/useful you've found.
Daniel Earwicker
No, but make sure people know it is a puzzle. I notice that you have now posted a link to your blog. This is generally frowned upon, not just for SO: Asking a question just to make people visit your blog. So post puzzles all you want, but tag them as such, and post your answer here as well.
Lasse V. Karlsen
Your outrage about this seems disproportionate. I just wrote the blog post this morning, because I saw someone suggested it would be better as a blog post. Also SO has gained an interesting question with some great answers - much more valuable than all the "What's your favourite coding food" crap.
Daniel Earwicker
Also mackenir's answer already has all the essential details in it anyway.
Daniel Earwicker
A: 

Not answering the question, but I think the standard patterns here are to use StringBuilder or string.Join:

string.join(", ",Enumerable.Range(0, size).Select(n => n.ToString()).ToArray())
Jimmy
I just timed that, and it looks like if (size > 1,000,000) my version is faster (with the using statement added of course).
Daniel Earwicker
+5  A: 

You are 'overriding' System.Linq.Aggregate with your own extension method in namespace MakeAggregateGoFaster.

Perhaps specialised on IEnumerable<string> and making use of a StringBuilder?

Maybe taking an Expression<Func<string, string, string>> instead of a Func<string, string, string> so it can analyse the expression tree and compile some code that uses StringBuilder instead of calling the function directly?

Just guessing.

mackenir
Close enough although I haven't bother to compile anything yet.
Daniel Earwicker
+1  A: 

The reason I asked if it was a puzzle was because a puzzle is allowed to sacrifice robustness in varying degrees as long as it satisfies the letter of the stated problem. With that in mind, here goes:

Solution 1 (runs instantly, problem doesn't validate):

public static string Aggregate(this IEnumerable<string> l, Func<string, string, string> f) {
     return "";
}

Solution 2 (runs about as fast as the problem requires, but ignores the delegate completely):

public static string Aggregate(this IEnumerable<string> l, Func<string, string, string> f) {
    StringBuilder sb = new StringBuilder();
    foreach (string item in l)
        sb.Append(", ").Append(item);
    return sb.Remove(0,2).ToString();
}
Jimmy
A: 

I like puzzles, just communicate more clearly that it is a puzzle.

tuinstoel
+4  A: 

Why not use one of the other forms of Aggregate?

                Enumerable.Range(0, size ).Aggregate(new StringBuilder(),
                        (a, b) => a.Append(", " + b.ToString()),
                        (a) => a.Remove(0,2).ToString());

You can specify any type for your seed, perform whatever formatting or custom calls in the first lambda function and then customer your output type with the second lambda function. The built in features already provide the flexibility you need. My runs went from 1444ms to 6ms.