tags:

views:

159

answers:

4

For the following mass-insert, because the inputs are ordered, are there any (slight) optimizations?

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(i);
}
// then follows Sieve of Eratosthenes algorithm

New improvement, twice as fast:

set<int> primes;
for ( int i = 2; i <= 2000000; i++ ) {
    primes.insert(primes.end(), i);
}
// then follows Sieve of Eratosthenes algorithm
+5  A: 

From http://www.cplusplus.com/reference/stl/set/set/

vector<int> numbers;
for (int i=2; i<=2000000; ++i)
    numbers.push_back(i);

set<int> primes(numbers.begin(), numbers.end());

That should have O(n) complexity, where as yours has O(n*log(n)) complexity.

The referenced link says that when you use the iterator based constructor, if the iterator is sorted, then you get linear. However, I'm not seeing anything explicit on howto explicitly inform the constructor that it's a sorted iterator.


For something cleaner, I'd rather use boost's counting iterator though. That shortens this to:

set<int> primes(
    boost::counting_iterator<int>(0),
    boost::counting_iterator<int>(200000));
sharth
Better yet, only put actual prime numbers into your set, rather than every number.
luke
@luke: The next step of his code would do just that. Or will atleast filter the set to just the primes.
sharth
Also, you know exactly how big that vector will be by the end. You should allocate once (via the constructor) and not multiple times which will really hurt performance.
luke
@sharth, your `vector`-to-`set` code had a 100% run-time improvement. Although, I wasn't sure why, since logically nothing new was happening. Then, this jumped out: `primes.insert(primes.end(), i);`
Derek Illchuk
-1 @sharth: The code is incorrect. `vector<int> v(2000000)` will create a vector with 2M zeroes. The following `push_back` operations will add the first 2M numbers **after** the 2M zeroes. Even if that was correct (update `push_back` to `numbers[i]=i`, the last step `set<int> primes(numbers.begin(), numbers.end())` has the exact same cost ast just adding them hinting `insert` with the `end` iterator. The same goes with the `counting_iterator`, it's performance is not better than the original `insert` with the hint.
David Rodríguez - dribeas
@David: Yeah. I didn't realize there was an option for the hint. That makes for a much cleaner solution.
sharth
+3  A: 

There is a overloaded version of insert method available which takes an iterator as the first parameter. This iterator is used as the hint i.e. the comparison will start from this iterator. So if you pass the proper iterator as the hint, then you should have a very efficient insert operation on the set. See here for details. I believe in your case, numbers.end() -1 will be a good option.

Naveen
+2  A: 

Here's a few:

  1. implement a custom stl allocator that does the reservations for memory upfront.

  2. If you use the vector solution, call reserve.

  3. If you're just going to to implement the sieve use (or implement) a boost counting iterator and only store the results in a vector, which calls reserve.

Rick
+4  A: 

If you are starting with the full range of candidate ints, I would not use a set at all, I would use a vector<bool>, init them all to false, and mark the targets (primes) as true as they are located.

vector<bool> candidates(200000);

You can index this using the current candidate int - 1 (candidate range = 1..200000). Then iterate using find to locate the true values, converting to int by using the offset versus begin().

  • Total number of memory allocations - 1.
  • Total memory usage 25000 bytes (vector<bool> is a special case, see comment from @Greg Rogers) vs >= 800000 for set<int>, discounting BTree overhead.
  • All element access is via pointer arithmetic.
Steve Townsend
For the Sieve of Eratosthenes algorithm, `set` was my first thought and will probably "work", although I don't know yet if its overhead will do it in. So, on second thought, a `vector` of `bool`s is a very good option. I'll implement both and see. Thanks.
Derek Illchuk
@Derek - I believe this will scale better - as the size of your upper limit grows, the overhead in managing the set will get worse. `vector` **might** allow a higher absolute value on your upper limit N to be implemented on the same hardware although it does depend on contiguous memory of size N - you could switch to `deque<bool>` if that becomes an issue. For low values of your range top, either `set` or `vector` will work just fine. I would be interested in your results, if you can be bothered to report back.
Steve Townsend
For all input sizes I tested, the `vector` (vs. the `set`) solution runs ~4.5x faster. Thanks again.
Derek Illchuk
@Derek - many thanks for the feedback, all the best
Steve Townsend
This is the best suggestion for Sieve-of-Eratosthenes. However, if the range is fixed at compile time, then I would tend use `std::bitset< 200000 > candidates;`
ArunSaha
`vector<bool>` is a specialization that packs the bits tightly into each byte - so the memory allocated is actually 200000 / 8 = 40000.
Greg Rogers
@Greg - thanks, edited this in (check arithmetic?)
Steve Townsend
Ooops, yes 25000, not sure where 40000 came from...
Greg Rogers
@Greg - np, the tip was most useful
Steve Townsend