views:

1014

answers:

5

I'm intersecting a set of 100,000 numbers and a set of 1,000 numbers using set_intersection in STL and its taking 21s, where it takes 11ms in C#.

C++ Code:

int runIntersectionTestAlgo()
{   

    set<int> set1;
    set<int> set2;
    set<int> intersection;


    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
     int value = 1000000000 + i;
     set1.insert(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
     int random = rand() % 200000 + 1;
     random *= 10;

     int value = 1000000000 + random;
     set2.insert(value);
    }

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));

    return intersection.size(); 
}

C# Code:

static int runIntersectionTest()
    {
        Random random = new Random(DateTime.Now.Millisecond);

        Dictionary<int,int> theMap = new Dictionary<int,int>();

        List<int> set1 = new List<int>();
        List<int> set2 = new List<int>();

            // Create 100,000 values for set1
            for ( int i = 0; i < 100000; i++ )
            {
             int value = 1000000000 + i;
             set1.Add(value);
            }

            // Create 1,000 values for set2
            for ( int i = 0; i < 1000; i++ )
            {
             int value = 1000000000 + (random.Next() % 200000 + 1);
             set2.Add(value);
            }

            // Now intersect the two sets by populating the map
        foreach( int value in set1 )
            {
             theMap[value] = 1;
            }

            int intersectionSize = 0;

        foreach ( int value in set2 )
        {
            int count;
            if ( theMap.TryGetValue(value, out count ) )
            {
                intersectionSize++;
                theMap[value] = 2;
            }
            }

            return intersectionSize;
    }
}
+7  A: 

A couple things would make your two examples more comparable.

First, your example in STL isn't quite right, for one thing both sets should be sorted in ascending order (in STL speak a "strict weak ordering").

Second, your using "sets" which are implemented as trees in STL, vs. "lists" which are linked lists. Random inserts into a set is more expensive than inserting onto the end of a list.

Try using a list of ints in the C++ example and also sort the list first (otherwise set inersection won't work properly) and I think you'll see more favorable results.

Chris Harris
+5  A: 

I ran your C++ code on my linux box

$ time ./test

real    0m0.073s
user    0m0.060s
sys     0m0.003s

21s means to me you compiled without optimizations. In case you use MSVC make sure you have listed _SECURE_SCL=0 (see msdn) in the compile definitions. Otherwise all STL iterator operations are dog slow.

Maik Beckmann
+1 for _SCL_SECURE. I hadn't heard of this flag before. Do you know if it's disabled it for release builds?
Richard Corden
This was topic of discussion at the boost mailing list. They said _SCL_SECURE is enabled (set to 1) by default, even in Release mode.
Maik Beckmann
It is enabled by default in debug and release builds. According to a presentation at this year's BoostCon, it will be disabled in release builds in VS2010. :)
jalf
Cool, thank you for this info!
Maik Beckmann
whoops, it looks like it is _SECURE_SCL, not _SCL_SECURE :)
jalf
Hmmm, MSDN agrees with you, jalf. Corrected.
Brian
Thanks I will try out _SECURE_SCL
Alex Black
+1  A: 

I updated your example to use some timer code that I use when unit testing. On my machine I get the following timings (based on -O3):

First loop 0.0040654
Second loop 4.8e-05
Intersection 0.000349
Intersection size: 50

Based on that, if I'm reading my decimals correctly, it takes '4ms' to insert the items into the first set, 50 microseconds to insert the items into the second set and a 1/3 of a ms to perform the intersection.

I am unable to run your C# example on my machine, so I cannot compare the timing, but it's definitely not 21s as you post.

Richard Corden
+2  A: 

On this ancient 3GHz Pentium 4, I get 2734 milliseconds for the entire runIntersectionTestAlgo function, in a debug build with optimizations disabled. I compiled with VS2008 SP1.

If I enable optimizations, I get 93 milliseconds.

Here's my code:

#include <set>
#include <algorithm>

using namespace std;

int runIntersectionTestAlgo()
{   

    set<int> set1;
    set<int> set2;
    set<int> intersection;


    // Create 100,000 values for set1
    for ( int i = 0; i < 100000; i++ )
    {
        int value = 1000000000 + i;
        set1.insert(value);
    }

    // Create 1,000 values for set2
    for ( int i = 0; i < 1000; i++ )
    {
        int random = rand() % 200000 + 1;
        random *= 10;

        int value = 1000000000 + random;
        set2.insert(value);
    }

    set_intersection(set1.begin(),set1.end(), set2.begin(), set2.end(), inserter(intersection, intersection.end()));

    return intersection.size(); 
}

#include <windows.h>
#include <iostream>

int main(){
    DWORD start = GetTickCount();

    runIntersectionTestAlgo();

    DWORD span = GetTickCount() - start;

    std::cout << span << " milliseconds\n";
}

Disabling _SECURE_SCL made no difference for the release build, which still hovered around the 100 ms.

GetTickCount isn't ideal, of course, but it should be good enough to distinguish 21 seconds from less than 100 milliseconds.

So I conclude that there's something wrong with your benchmarks.

jalf
Wow. I took your code, ran it, and it ran in 31ms. I then ran it with the debugger attached (as I'd been doing for my tests) and it ran in 23353 ms.
Alex Black
STL uses a lot of intermediate functions that need to be inlined for performance to be acceptable. GCC at least will let you enable debug and optimization simultaneously, and recent GDB versions can show and step through inlined function calls, so you can debug this stuff at bearable speed ;) You might try turning on inlining with no other optimizations and see what kind of speed you get.
Joseph Garvin
A: 

Your C# and C++ code work differently. The C# code uses magical hashing tricks for speed, your C++ code uses tree tricks for speed. One thing that will probably speed things up (ignoring the fact that your testing seems to be broken) would be to using hashing, as follows:

  1. Create a hash_map of one of the two collections.
  2. Iterate over each element in the 2nd collection. If the `hash_map1 contains that element, add it to your result.
Brian
Yes they are different. I previously tried the C# approach in C++ using each of std:map, stdext:hash_map and boost::unordered_set and got the same poor results.I'm sure if I changed the C# code to use HashSet<int> and .IntersectWith it'd be as fast (or maybe faster).
Alex Black