tags:

views:

181

answers:

5

First off, I know my title can be formulated better, but my math classes are so far gone I can't remember the correct words anymore..

I need to do something like this (pseudo c#)

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int result = digits1*digits2

This would be the sum of the product of element[i] of each array.

This obviously doesn't work. Any suggestions towards either a better title or the solution?

EDIT clarification: I know I could loop them both and do the math. Basically I would think there is a better way to do this and I'm looking for it purely out of personal curiousity.

+3  A: 
int result = 0;
for(int i = 0; i < digits1.length; i++)
{
    result += digits1[i] * digits2[i];
}
Alan Geleynse
Well yes, but isn't there a better algorithm for this?
borisCallens
I don't think so, you need to do at least n multiplications and n additions, and this does that. I believe it is the least amount of work possible. There may be ways to do it with less code, but the work the code does will be the same.
Alan Geleynse
Well looks like there is a better way to do it, its been too long since I have done C#. But I believe the runtime should be similar for Zip and the loop since the same work needs to be done.
Alan Geleynse
I agree with the above. The short, fancy ways of doing this work, and they're concise, but they may not be as clear to someone else reading the code later.
JoshD
Mathematically you are right, but I would think an average processor would have optimizations for working with arrays (matrices) that could be used by the .net framework. I know it is optimization circus, but I'm looking into this purely out of curiosity.
borisCallens
Can someone run a performance test? @BrunoLM ran one for the LINQ solutions, but I would be interested to compare it to the loop and I do not have a C# environment available right now.
Alan Geleynse
@Alan: I ran a test. [Link](http://stackoverflow.com/questions/3992363/sum-of-products-of-two-arrays-dotproduct/3992840#3992840)
Jeff M
Just because you *can* do it with LINQ doesn't mean you have to. I prefer loops because it is more readable in this case. In fact you can make a static method, or an extension method called `Dot` and that would make the code even _more_ readable.
jalexiou
+3  A: 

Very simply, do a loop.

int sum = 0;
for(int i = 0; i < digits1.length && i < digits2.length; i++)
{
    sum += digits1[i] * digits2[i];
}

Boom.

JoshD
+10  A: 

With LINQ:

int dotProduct = digits1.Zip(digits2, (d1, d2) => d1 * d2)
                        .Sum();

Zipwill produce a streaming sequence containing the products of corresponding elements from both arrays, which is then summed into an integer with Sum.

Note that this will not fail like it should when the arrays of unequal length, so you probably need to validate the input:

//null checks here

if(digits1.Length != digits2.Length)
   throw new ArgumentException("...");

EDIT: As Jeff M points out,Enumerable.Zipwas only added to the framework in .NET 4.0. In .NET 3.5, you can do this (the idea is only efficient for collections that expose fast indexers):

int dotProduct = Enumerable.Range(0, digits1.Length)
                           .Sum(i => digits1[i] * digits2[i]);

//from Jeff M's comment:
int dotProduct = digits1.Select((n, i) => n * digits2[i])
                        .Sum();
Ani
For .NET 3.5: `int result = digits1.Select((n, i) => n * digits2[i]).Sum();`
Jeff M
I didn't know about the Zip function. I will check it out. Thanks :)
borisCallens
@boris: You should note that the `Zip()` extension method was added in .NET 4.0.
Jeff M
No problem. Thanks for the heads up. Also, since when is there a 3 minute rule for accepting answers?
borisCallens
@Jeff M: Thanks, good point.
Ani
@Jeff M: Edited your comment in with attribution; hope that's alright.
Ani
@Ani: Don't mind at all. ;)
Jeff M
+5  A: 

Solutions with LINQ

int[] digits1 = new int[10]{0,1,2,3,4,5,6,7,8,9};
int[] digits2 = new int[10]{0,1,2,3,4,5,6,7,8,9};

int result1 = digits1.Zip(digits2, (x, y) => x * y).Sum();

int result2 = digits1.Select((x, y) => x * digits2.ElementAt(y)).Sum();

int result3 = digits1.Select((n, i) => n * digits2[i]).Sum();

// Ani answer
int result4 = Enumerable.Range(0, digits1.Length)
    .Sum(i => digits1[i] * digits2[i]);

Performance test 100000 iterations:

Queries
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614
Fn: Result 3       Ticks 130034
Fn: Result 4       Ticks 123374

-------------

Fastest
Fn: Result 4       Ticks 123374
Fn: Result 3       Ticks 130034
Fn: Result 1       Ticks 135306
Fn: Result 2       Ticks 2470614
BrunoLM
This is interesting. Thanks. I wonder where the diff from 2 comes from.
borisCallens
+2  A: 

I wrote a test bench to compare the these methods' times on my machine.

Specs:
Windows 7 Professional 64-bit
Intel Core 2 Quad Q9550 @ 2.83GHz
4x1GiB Corsair Dominator DDR2 1066 (PC2-8500)

using System;
using System.Linq;

namespace Testbench
{
    class Program
    {
        static void Main(string[] args)
        {
            var digits1 = Enumerable.Range(0, 500).ToArray();
            var digits2 = digits1.ToArray(); // create a copy

            Test("Regular Loop", () =>
            {
                int result = 0;
                for (int i = 0; i < digits1.Length; i++)
                {
                    result += digits1[i] * digits2[i];
                }
                return result;
            });

            // using LINQ
            Test("Enumerable \"Loop\"", () => Enumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Zip", () => digits1.Zip(digits2, (x, y) => x * y).Sum());
            Test("Using Indexed Select", () => digits1.Select((n, i) => n * digits2[i]).Sum());
            Test("Using Indexed Select with ElementAt", () => digits1.Select((n, i) => n * digits2.ElementAt(i)).Sum());

            // using PLINQ
            Test("Parallel Enumerable \"Loop\"", () => ParallelEnumerable.Range(0, digits1.Length).Sum(i => digits1[i] * digits2[i]));
            Test("Using Parallel Zip", () => digits1.AsParallel().Zip(digits2.AsParallel(), (x, y) => x * y).Sum());
            Test("Using Parallel Indexed Select", () => digits1.AsParallel().Select((n, i) => n * digits2[i]).Sum());
            Test("Using Parallel Indexed Select with ElementAt", () => digits1.AsParallel().Select((n, i) => n * digits2.ElementAt(i)).Sum());

            Console.Write("Press any key to continue . . . ");
            Console.ReadKey(true);
            Console.WriteLine();
        }

        static void Test<T>(string testName, Func<T> test, int iterations = 1000000)
        {
            Console.WriteLine(testName);
            Console.WriteLine("Iterations: {0}", iterations);
            var results = Enumerable.Repeat(0, iterations).Select(i => new System.Diagnostics.Stopwatch()).ToList();
            var timer = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < results.Count; i++)
            {
                results[i].Start();
                test();
                results[i].Stop();
            }
            timer.Stop();
            Console.WriteLine("Time(ms): {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedMilliseconds), results.Average(t => t.ElapsedMilliseconds), results.Max(t => t.ElapsedMilliseconds), timer.ElapsedMilliseconds);
            Console.WriteLine("Ticks:    {0,3}/{1,10}/{2,8} ({3,10})", results.Min(t => t.ElapsedTicks), results.Average(t => t.ElapsedTicks), results.Max(t => t.ElapsedTicks), timer.ElapsedTicks);
            Console.WriteLine();
        }
    }
}

32-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (      1172)
Ticks:      3/  3.101365/     526 (   3244251)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   4.3E-05/      25 (      9054)
Ticks:     24/  24.93989/   69441 (  25052172)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.4E-05/      16 (     16282)
Ticks:     41/ 44.941406/   45395 (  45052491)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   5.3E-05/      32 (     13473)
Ticks:     34/ 37.165088/   89602 (  37280177)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   1.5E-05/       6 (    160215)
Ticks:    405/443.154147/   17821 ( 443306156)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/  0.000103/      29 (     17194)
Ticks:     38/ 47.412312/   81906 (  47576133)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/   9.4E-05/      19 (     21703)
Ticks:     49/ 59.859005/   53200 (  60051081)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/  0.000114/      27 (     20579)
Ticks:     45/ 56.758491/   75455 (  56943627)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   8.1E-05/      19 (     61137)
Ticks:    144/ 168.97909/   53320 ( 169165086)

64-bit target:

Regular Loop
Iterations: 1000000
Time(ms):   0/         0/       0 (       506)
Ticks:      1/  1.254137/    1491 (   1401969)

Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   2.9E-05/      15 (     10118)
Ticks:     27/ 27.850086/   41954 (  27995994)

Using Zip
Iterations: 1000000
Time(ms):   0/   2.2E-05/      13 (     17089)
Ticks:     45/ 47.132834/   38506 (  47284608)

Using Indexed Select
Iterations: 1000000
Time(ms):   0/   3.1E-05/      12 (     14057)
Ticks:     37/ 38.740923/   33846 (  38897274)

Using Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   3.8E-05/      29 (    117412)
Ticks:    304/324.711279/   82726 ( 324872753)

Parallel Enumerable "Loop"
Iterations: 1000000
Time(ms):   0/   9.9E-05/      28 (     24234)
Ticks:     38/  66.79389/   77578 (  67054956)

Using Parallel Zip
Iterations: 1000000
Time(ms):   0/  0.000111/      24 (     30193)
Ticks:     46/ 83.264037/   69029 (  83542711)

Using Parallel Indexed Select
Iterations: 1000000
Time(ms):   0/   6.5E-05/      20 (     28417)
Ticks:     45/ 78.337831/   56354 (  78628396)

Using Parallel Indexed Select with ElementAt
Iterations: 1000000
Time(ms):   0/   9.2E-05/      16 (     65233)
Ticks:    112/180.154663/   44799 ( 180496754)
Jeff M
Wow, I figured the loop would be faster, but not by that much. The LINQ methods are definitely shorter and arguable simpler to write, but create a lot of overhead. Thanks for running the test.
Alan Geleynse
Wouldn't the linq methods be faster in a multi threading scenario?
borisCallens
@boris: No it wouldn't. While there are optimization opportunities, it will only use a single thread. If changed to use PLINQ, then yes definitely. I'll make modifications to include them and repost my results.
Jeff M
@boris: I said "yes definitely," it does have some caveats. If the arrays were sufficiently large, then "yes definitely." Otherwise, it's unnecessary overhead and might not benefit.
Jeff M
@boris: I'll be updating one more time. (I need to find a better way to measure the times) Restructuring the test function improved the times all around (due to an overhead I didn't anticipate). The numbers are more reasonable.
Jeff M
Wow. Thanks for all the effort. It seems the regular for loop still is more interesting for my scenario.
borisCallens