184

7
+4  Q:

## Improving a prime sieve algorithm

I'm trying to make a decent Java program that generates the primes from 1 to N (mainly for Project Euler problems).

At the moment, my algorithm is as follows:

Initialise an array of booleans (or a bitarray if N is sufficiently large) so they're all false, and an array of ints to store the primes found.

Set an integer, s equal to the lowest prime, (ie 2)

While s is <= sqrt(N)

Set all multiples of s (starting at s^2) to true in the array/bitarray.

Find the next smallest index in the array/bitarray which is false, use that as the new value of s.

Endwhile.

Go through the array/bitarray, and for every value that is false, put the corresponding index in the primes array.

Now, I've tried skipping over numbers not of the form 6k + 1 or 6k + 5, but that only gives me a ~2x speed up, whilst I've seen programs run orders of magnitudes faster than mine (albeit with very convoluted code), such as the one here

What can I do to improve?

Edit: Okay, here's my actual code (for N of 1E7):

``````int l = 10000000, n = 2, sqrt = (int) Math.sqrt(l);
boolean[] nums = new boolean[l + 1];
int[] primes = new int;

while(n <= sqrt){
for(int i = 2 * n; i <= l; nums[i] = true, i += n);
for(n++; nums[n]; n++);
}

for(int i = 2, k = 0; i < nums.length; i++) if(!nums[i]) primes[k++] = i;
``````

Runs in about 350ms on my 2.0GHz machine.

A:

I bet java's performance is terrible when dealing with bits... Algorithmically, the link you point out should be sufficient

The problem is I don't have a clue what's happening in the link.
what specifically don't you understand ?
The source code in the primes_sieve.c link about half way down seems to be almost unreadable <.<
the html page contains useful info though
Hmm... any idea how one would efficiently implement the stuff it says under the 'choose an appropriate sieve size' subheading. And does Java's VM put stuff into the CPU cache if it so decides, if the 'sequential sieving' uses a sieve small enough to fit in the cache size?
What is your reason for thinking that Java's performance would be terrible when dealing with bits?
I originally thought jvm was unaware of possible special cpu instructions like popcount, but now I think I was wrong : http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Integer.html
+1  A:

You could do the step of "putting the corresponding index in the primes array" while you are detecting them, taking out a run through the array, but that's about all I can think of right now.

I'm not sure this is possible within the code that I posted above. Can you elaborate a bit?
I posted this before your code. So I wouldn't have known. I was just going on your description.
Yeah, fair enough
A:

Have you tried googling, e.g. for "java prime numbers". I did and dug up this simple improvement:

http://www.anyexample.com/programming/java/java_prime_number_check_%28primality_test%29.xml

Surely, you can find more at google.

There are more efficient methods of generating primes than sequenetially applying a primality check.
Of course, but come on, the "how do I generate the first N primes" problem is not exactly new and google should be more than enough to find improvements of every conceivable kind.
+4  A:

While s is <= sqrt(N)
One mistake people often do in such algorithms is not precomputing square root.

``````while (s <= sqrt(N)) {
``````

is much, much slower than

``````int limit = sqrt(N);
while (s <= limit) {
``````

But generally speaking, Eiko is right in his comment. If you want people to offer low-level optimisations, you have to provide code.

You may notice that number of iterations in your code is just little bigger than 'l'. (you may put counter inside first 'for' loop, it will be just 2-3 times bigger) And, obviously, complexity of your solution can't be less then O(l) (you can't have less than 'l' iterations).

What can make real difference is accessing memory effectively. Note that guy who wrote that article tries to reduce storage size not just because he's memory-greedy. Making compact arrays allows you to employ cache better and thus increase speed.

I just replaced boolean[] with int[] and achieved immediate x2 speed gain. (and 8x memory) And I didn't even try to do it efficiently.

update2
That's easy. You just replace every assignment `a[i] = true` with `a[i/32] |= 1 << (i%32)` and each read operation `a[i]` with `(a[i/32] & (1 << (i%32))) != 0`. And `boolean[] a` with `int[] a`, obviously.

From the first replacement it should be clear how it works: if `f(i)` is true, then there's a bit `1` in an integer number `a[i/32]`, at position `i%32` (`int` in Java has exactly 32 bits, as you know).

You can go further and replace `i/32` with `i >> 5`, `i%32` with `i&31`. You can also precompute all `1 << j` for each j between 0 and 31 in array.

But sadly, I don't think in Java you could get close to C in this. Not to mention, that guy uses many other tricky optimizations and I agree that his could would've been worth a lot more if he made comments.

sqrt(N) probably uses natural logarithms to calculate the value which will be slower than doing while (s*s < N)Or pre compute s*s and use that.
Nope, I precomputed the square root of N
Yeah you did, but I think you'll be slightly faster doing while(n*n <= l) instead of while(n <= sqrt)
Tried it with the above code (for N = 1E7); there was no noticable change in speed
Nikita, could you post the code for this int[]ed version? :D
I got a 3x speedup (with the bitwise hacks). This is awesome :D
+2  A:

Using the BitSet will use less memory. The Sieve algorithm is rather trivial, so you can simply "set" the bit positions on the BitSet, and then iterate to determine the primes.

It is actually slower with BitSet. Initialization with new BitSet(l+1) takes 2x as long as new boolean[l+1] and the loops are a bit slower than with boolean[] too. I think the slowdown is caused by the overhead from the method calls set(i), get(i) and nextClearBit(i) instead of accessing values from the array.
Notice how I didn't mention speed :P
+1  A:

Did you also make the array smaller while skipping numbers not of the form 6k+1 and 6k+5? I only tested with ignoring numbers of the form 2k and that gave me ~4x speed up (440 ms -> 120 ms):

``````int l = 10000000, n = 1, sqrt = (int) Math.sqrt(l);
int m = l/2;
boolean[] nums = new boolean[m + 1];
int[] primes = new int;
int i, k;

while (n <= sqrt) {
int x = (n<<1)+1;
for (i = n+x; i <= m; nums[i] = true, i+=x);
for (n++; nums[n]; n++);
}

primes = 2;
for (i = 1, k = 1; i < nums.length; i++) {
if (!nums[i])
primes[k++] = (i<<1)+1;
}
``````
Interesting... for me it gives <2x speed up (350ms -> 190ms)But this is still good, especially considering it can be combined with other optimisations. Thanks! ^_^
+1 yep, pretty neat
A:

The following is from my Project Euler Library...Its a slight Variation of the Sieve of Eratosthenes...I'm not sure, but i think its called the Euler Sieve.

1) It uses a BitSet (so 1/8th the memory) 2) Only uses the bitset for Odd Numbers...(another 1/2th hence 1/16th)

Note: The Inner loop (for multiples) begins at "n*n" rather than "2*n" and also multiples of increment "2*n" are only crossed off....hence the speed up.

``````private void beginSieve(int mLimit)
{
primeList = new BitSet(mLimit>>1);
primeList.set(0,primeList.size(),true);

int sqroot = (int) Math.sqrt(mLimit);
primeList.clear(0);
for(int num = 3; num <= sqroot; num+=2)
{
if( primeList.get(num >> 1) )
{
int inc = num << 1;
for(int factor = num * num; factor < mLimit; factor += inc)
{
//if( ((factor) & 1) == 1)
//{
primeList.clear(factor >> 1);
//}
}
}
}
}
``````

and here's the function to check if a number is prime...

``````public boolean isPrime(int num)
{
if( num < maxLimit)
{
if( (num & 1) == 0)
return ( num == 2);
else
return primeList.get(num>>1);
}
return false;
}
``````