ansaurus

Question

Answer 1

+2 A:

Firstly structs don't necessarily appear on the heap - they can and often do appear on the stack.

Regarding your performance measurements, I think you must have tested it incorrectly. Using this benchmarking code I get almost the same performance results for both types:

TwoFloats[] a = new TwoFloats[10000];
float[] b = new float[20000];

void test1()
{
    int count = 0;
    for (int i = 0; i < 10000; i += 1)
    {
        if (a[i].x < 10) count++;
    }
}

void test2()
{
    int count = 0;
    for (int i = 0; i < 20000; i += 2)
    {
        if (b[i] < 10) count++;
    }
}

Results:

Method   Iterations per second
test1                 55200000
test2                 54800000

Mark Byers 2010-10-05 20:15:59

@Mark: According to the OP, he uses an array that uses twice as many elements in the case that he uses `float[]` instead of `MyStruct`. So, the struct array will not be twice as large, unless the struct uses more than 4 bytes per `float` due to offsetting issues or whatever (e.g. maybe on a 64-bit machine).

Brian 2010-10-05 20:22:24

As an aside, I tried my own version of your benchmark comparing `a[i].x` to `a[i].y` or `b[i]` to `b[i+1]`. Unsurprisingly, the results are similar to yours.

Brian 2010-10-05 20:28:48

Your code runs as expected... I figured, that if I use Lit<TwoFloats> vs simple array, List is significantly slower (110ms for struct, 46ms for 2 floats) on your code (with 20000000 items). And List<T> doesn´t do boxing

Perry 2010-10-06 06:57:41

Answer 2

A:

The most likely reason is that the C# runtime optimizer perform a better job when you work with floats that with full structs, probably because optimizer is mapping x and y to registers or likewise changes not done with full struct.

In your particular example there seems not to be any fundamental reason why it couldn't perform as good a job when you use structs (it's hard to be sure without seeing you actual benchmarking code), but it just doesn't. However it would be interesting to compare the performance of the resulting code when compiled with another C# implementations (I'm thinking of mono on Linux).

I tested Ron Warholic benchmark with mono, and results are consistant with Mark's, difference between the two types of access seems to be minimal (version with floats is 1% faster). However I still should do more testing as it is not unexpected that library calls like Math.Abs take a large amount of time and it could hide a real difference.

After removing calls to Math.Abs and just doing tests like rawFloats[i] < rawFloats[j] the structure version becomes marginally faster (about 5%) than the two arrays of floats.

kriss 2010-10-05 20:32:09

This exacerbates the results when I run Ron Warholic's benchmark, but does not fully account for them.

Brian 2010-10-05 20:38:06

@Brian: Ron Warholic benchmark is very interesting, looking at his code I would bet for cache effect. It is quite easy to test: just access the same amount of entries but in random order instead of stepping by 1 and check if it changes something.

kriss 2010-10-05 20:47:44

@kriss/Brian: Error in the benchmark actually. I've done similar things before with structs and never noticed a speed differential which is why I highly suspected an error in my benchmark. The updated code shown shows nearly identical behavior (rightly so).

Ron Warholic 2010-10-05 20:50:53

@Ron Warholic: now that's more normal behavior. I feel dumb I didn't spotted the problem in benchmark...

kriss 2010-10-05 21:16:13

Answer 3

+1 A:

I get results that seem to agree with you (and disagree with Mark). I'm curious if I've made a mistake constructing this (albeit crude) benchmark or if there is another factor at play.

Compiled on MS C# targeting .NET 3.5 framework with VS2008. Release mode, no debugger attached.

Here's my test code:

class Program {
    static void Main(string[] args) {
        for (int i = 0; i < 10; i++) {
            RunBench();
        }

        Console.ReadKey();
    }

    static void RunBench() {
        Stopwatch sw = new Stopwatch();

        const int numPoints = 10000;
        const int numFloats = numPoints * 2;
        int numEqs = 0;
        float[] rawFloats = new float[numFloats];
        Vec2[] vecs = new Vec2[numPoints];

        Random rnd = new Random();
        for (int i = 0; i < numPoints; i++) {
            rawFloats[i * 2] = (float) rnd.NextDouble();
            rawFloats[i * 2 + 1] = (float)rnd.NextDouble();
            vecs[i] = new Vec2() { X = rawFloats[i * 2], Y = rawFloats[i * 2 + 1] };
        }

        sw.Start();
        for (int i = 0; i < numFloats; i += 2) {
            for (int j = 0; j < numFloats; j += 2) {
                if (i != j &&
                    Math.Abs(rawFloats[i] - rawFloats[j]) < 0.0001 &&
                    Math.Abs(rawFloats[i + 1] - rawFloats[j + 1]) < 0.0001) {
                    numEqs++;
                }
            }
        }
        sw.Stop();

        Console.WriteLine(sw.ElapsedMilliseconds.ToString() + " : numEqs = " + numEqs);

        numEqs = 0;
        sw.Reset();
        sw.Start();
        for (int i = 0; i < numPoints; i++) {
            for (int j = 0; j < numPoints; j++) {
                if (i != j &&
                    Math.Abs(vecs[i].X - vecs[j].X) < 0.0001 &&
                    Math.Abs(vecs[i].Y - vecs[j].Y) < 0.0001) {
                    numEqs++;
                }
            }
        }
        sw.Stop();

        Console.WriteLine(sw.ElapsedMilliseconds.ToString() + " : numEqs = " + numEqs);
    }
}

struct Vec2 {
    public float X;
    public float Y;
}

Edit: Ah! I wasn't iterating the proper amounts. With the updated code my timings look like I expected:

269 : numEqs = 8
269 : numEqs = 8
270 : numEqs = 2
269 : numEqs = 2
268 : numEqs = 4
270 : numEqs = 4
269 : numEqs = 2
268 : numEqs = 2
270 : numEqs = 6
270 : numEqs = 6
269 : numEqs = 8
268 : numEqs = 8
268 : numEqs = 4
270 : numEqs = 4
269 : numEqs = 6
269 : numEqs = 6
268 : numEqs = 2
270 : numEqs = 2
268 : numEqs = 4
270 : numEqs = 4

Ron Warholic 2010-10-05 20:33:12

AH! I figured there was some problem I overlooked. I actually caught it just before seeing your comment ;)

Ron Warholic 2010-10-05 20:46:41

Answer 4

+1 A:

You are doing something seriously wrong if you get times like that. Float comparisons are dramatically fast, I clock them at 2 nanoseconds with the loop overhead. Crafting a test like this is tricky because the JIT compiler will optimize stuff away if you don't use the result of the comparison.

The structure is slightly faster, 1.96 nanoseconds vs 2.20 nanoseconds for the float[] on my laptop. That's the way it should be, accessing the Y member of the struct doesn't cost an extra array index.

Test code:

using System;
using System.Diagnostics;

class Program {
    static void Main(string[] args) {
        var test1 = new float[100000000];  // 100 million
        for (int ix = 0; ix < test1.Length; ++ix) test1[ix] = ix;
        var test2 = new Test[test1.Length / 2];
        for (int ix = 0; ix < test2.Length; ++ix) test2[ix].x = test2[ix].y = ix;
        for (int cnt = 0; cnt < 20; ++cnt) {
            var sw1 = Stopwatch.StartNew();
            bool dummy = false;
            for (int ix = 0; ix < test1.Length; ix += 2) {
                dummy ^= test1[ix] >= test1[ix + 1];
            }
            sw1.Stop();
            var sw2 = Stopwatch.StartNew();
            for (int ix = 0; ix < test2.Length; ++ix) {
                dummy ^= test2[ix].x >= test2[ix].y;
            }
            sw2.Stop();
            Console.Write("", dummy);
            Console.WriteLine("{0} {1}", sw1.ElapsedMilliseconds, sw2.ElapsedMilliseconds);
        }
        Console.ReadLine();
    }
    struct Test {
        public float x;
        public float y;
    }
}

Hans Passant 2010-10-05 20:38:20

Answer 5

A:

The code below is based on different ways of iteration. On my machine, Test1b takes almost twice as long as Test1a. I wonder if this relates to your issue.

class Program
{
    struct TwoFloats
    {
        public float x;
        public float y;
    }

    static TwoFloats[] a = new TwoFloats[10000];

    static int Test1a()
    {
        int count = 0;
        for (int i = 0; i < 10000; i += 1)
        {
            if (a[i].x < a[i].y) count++;
        }
        return count;
    }

    static int Test1b()
    {
        int count = 0;
        foreach (TwoFloats t in a)
        {
            if (t.x < t.y) count++;
        }
        return count;
    }

    static void Main(string[] args)
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        for (int j = 0; j < 5000; ++j)
        {
            Test1a();
        }
        sw.Stop();
        Trace.WriteLine(sw.ElapsedMilliseconds);
        sw.Reset();
        sw.Start();
        for (int j = 0; j < 5000; ++j)
        {
            Test1b();
        }
        sw.Stop();
        Trace.WriteLine(sw.ElapsedMilliseconds);
    }

}

Brian 2010-10-05 21:47:18

foreach and for-loop give me almost same results

Perry 2010-10-06 06:53:14

ansaurus

tags:

views:

answers:

Why is struct slower than float ?

related questions