views:

5788

answers:

19

I want to print the first 10000 prime numbers. Can anyone give me the most efficient code for this? Clarifications:

  1. It does not matter if your code is inefficient for n >10000.
  2. The size of the code does not matter.
  3. You cannot just hard code the values in any manner.
+21  A: 

The Sieve of Atkin is probably what you're looking for, its upper bound running time is O(N/log log N).

Matt
Sieve of Eratosthenes could be faster for small N. See my answer.
J.F. Sebastian
Either way, "The Sieve of Atkin" and "Sieve of Eratosthenes" are pretty badass sounding names.
Simucal
Though this is a good answer both Sieves only generate primes in the range [2, N], rather than the *first N primes*.
Daniel
Daniel: the 10,000th prime is less than 105,000 so he just has to hardcode his sieve to go up to 105,000.
Gabe
A: 

Not efficient at all, but you can use a regular expression to test for prime numbers.

/^1?$|^(11+?)\1+$/

This tests if, for a string consisting of k1”s, k is not prime (i.e. whether the string consists of one “1” or any number of “1”s that can be expressed as an n-ary product).

engtech
+3  A: 

I have adapted code found on the CodeProject to create the following:

ArrayList primeNumbers = new ArrayList();

for(int i = 2; primeNumbers.Count < 10000; i++) {
    bool divisible = false;

    foreach(int number in primeNumbers) {
        if(i % number == 0) {
            divisible = true;
        }
    }

    if(divisible == false) {
        primeNumbers.Add(i);
        Console.Write(i + " ");
    }
}

Testing this on my ASP.NET Server took the rountine about 1 minute to run.

GateKiller
You can speed that up if you exit that foreach loop when you get to number>squareroot(i).
Clayton
1min for 10000 is pretty slow. In C# (not utilizing parallel fors/foreaches), on average I get 13seconds up to 10,000,000. Using one parallel for I get on average 10seconds for the same bound.
jlafay
A: 

@engtech: Your regular expression is wrong.

First of all, 1 isn't a prime number.

Second, all the second part of the expression does is check to see if there are a bunch of '1's as the only input.

(Also, that / on the end breaks it.)

Ryan Fox
+15  A: 

I recommend a sieve, either the Sieve of Eratosthenes or the Sieve of Atkin.

The sieve or Eratosthenes is probably the most intuitive method of finding a list of primes. Basically you:

  1. Write down a list of numbers from 2 to whatever limit you want, let's say 1000.
  2. Take the first number that isn't crossed off (for the first iteration this is 2) and cross off all multiples of that number from the list.
  3. Repeat step 2 until you reach the end of the list. All the numbers that aren't crossed off are prime.

Obviously there are quite a few optimizations that can be done to make this algorithm work faster, but this is the basic idea.

The sieve of Atkin uses a similar approach, but unfortunately I don't know enough about it to explain it to you. But I do know that the algorithm I linked takes 8 seconds to figure out all the primes up to 1000000000 on an ancient Pentium II-350

Sieve of Eratosthenes Source Code: http://primes.utm.edu/links/programs/sieves/Eratosthenes/C_source_code/

Sieve of Atkin Source Code: http://cr.yp.to/primegen.html

num1
`primegen` is worth a look.
J.F. Sebastian
+1  A: 

Sieve or Eratosthenes is the way to go, because of it's simplicity and speed. My implementation in C

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <math.h>

int main(void)
{
    unsigned int lim, i, j;

    printf("Find primes upto: ");
    scanf("%d", &lim);
    lim += 1;
    bool *primes = calloc(lim, sizeof(bool));

    unsigned int sqrtlim = sqrt(lim);
    for (i = 2; i <= sqrtlim; i++)
        if (!primes[i])
            for (j = i * i; j < lim; j += i)
                primes[j] = true;

    printf("\nListing prime numbers between 2 and %d:\n\n", lim - 1);
    for (i = 2; i < lim; i++)
        if (!primes[i])
            printf("%d\n", i);

    return 0;
}

CPU Time to find primes (on Pentium Dual Core E2140 1.6 GHz, using single core)

~ 4s for lim = 100,000,000

Imran
what is the time for lim=1_000_000 ? It can't be both `<1s' and '5s'.
J.F. Sebastian
Name `primes` is misleading, in your code its meaning `is_composite_number`. You may eliminate the first loop if you replace `malloc` by `calloc`. Expression `j+=i` can overflow for large `lim` (and you'll miss some primes in that case).
J.F. Sebastian
Fixed. < 1s for 100,000, ~5s for 1,000,000Thanks for suggesting `calloc` and the alternative array name. I also thought primes[] is quite confusing, but couldn't think of a better name.
Imran
Replacing the loop with `calloc` now gets lim = 100,000,000 done in ~4s
Imran
+4  A: 

GateKiller, how about adding a break to that if in the foreach loop? That would speed up things a lot because if like 6 is divisible by 2 you don't need to check with 3 and 5. (I'd vote your solution up anyway if I had enough reputation :-) ...)

ArrayList primeNumbers = new ArrayList();

for(int i = 2; primeNumbers.Count < 10000; i++) {
    bool divisible = false;

    foreach(int number in primeNumbers) {
        if(i % number == 0) {
            divisible = true;
            break;
        }
    }

    if(divisible == false) {
        primeNumbers.Add(i);
        Console.Write(i + " ");
    }
}
palotasb
You can still speed this up considerably by also breaking if number > sqrt(i).
Beska
+3  A: 

Using GMP, one could write the following:

#include <stdio.h>
#include <gmp.h>

int main() {
  mpz_t prime;
  mpz_init(prime);
  mpz_set_ui(prime, 1);
  int i;
  char* num = malloc(4000);
  for(i=0; i<10000; i++) {
    mpz_nextprime(prime, prime);
    printf("%s, ", mpz_get_str(NULL,10,prime));
  }
}

On my 2.33GHz Macbook Pro, it executes as follows:

time ./a.out > /dev/null

real    0m0.033s
user    0m0.029s
sys    0m0.003s

Calculating 1,000,000 primes on the same laptop:

time ./a.out > /dev/null

real    0m14.824s
user    0m14.606s
sys     0m0.086s

GMP is highly optimized for this sort of thing. Unless you really want to understand the algorithms by writing your own, you'd be advised to use libGMP under C.

hoyhoy
A: 

Can you explain the requirement not to hardcode the values?

More exactly, can you provide an "algorithm" that given a program, will determine if the values are hardcoded?

+10  A: 

This isn't strictly against the hardcoding restriction, but comes terribly close. Why not programatically download this list and print it out, instead?

http://primes.utm.edu/lists/small/10000.txt

John the Statistician
For most computers, calculating the values would be quicker than the latency involved in downloading them over the internet.
Corey
But not from having the list ready in memory. That's probably what he meant.
Sebastian Krog
"Sieve of Google"
Kevin L.
lol @krog. why would you bother to set up a network connection and all that jazz to DL a static file each time? of course you'd predownload it, heck, even hardcode it into an array.
Mark
+4  A: 

@Matt: log(log(10000)) is ~2

From the wikipedia article (which you cited) Sieve of Atkin:

This sieve computes primes up to N using O(N/log log N) operations with only N1/2+o(1) bits of memory. That is a little better than the sieve of Eratosthenes which uses O(N) operations and O(N1/2(log log N)/log N) bits of memory (A.O.L. Atkin, D.J. Bernstein, 2004). These asymptotic computational complexities include simple optimizations, such as wheel factorization, and splitting the computation to smaller blocks.

Given asymptotic computational complexities along O(N) (for Eratosthenes) and O(N/log(log(N))) (for Atkin) we can't say (for small N=10_000) which algorithm if implemented will be faster.

Achim Flammenkamp wrote in The Sieve of Eratosthenes:

cited by:

@num1

For intervals larger about 10^9, surely for those > 10^10, the Sieve of Eratosthenes is outperformed by the Sieve of Atkins and Bernstein which uses irreducible binary quadratic forms. See their paper for background informations as well as paragraph 5 of W. Galway's Ph.D. thesis.

Therefore for 10_000 Sieve of Eratosthenes can be faster then Sieve of Atkin.

To answer OP the code is prime_sieve.c (cited by num1)

J.F. Sebastian
+1  A: 

If you really just want the print out then a google search followed by a print is the fastest. :-)

Kevin Gale
Good search keywords: prime number table.Search turned up for example this site: http://www.walter-fendt.de/m14e/primes.htm
Juha Syrjälä
Calculating the prime numbers up to 10000 is much faster.
Georg
Not if you count writing the code. :-)
Kevin Gale
A: 

Adapting and following on from GateKiller, here's the final version that I've used.

    public IEnumerable<long> PrimeNumbers(long number)
    {
        List<long> primes = new List<long>();
        for (int i = 2; primes.Count < number; i++)
        {
            bool divisible = false;

            foreach (int num in primes)
            {
                if (i % num == 0)
                    divisible = true;

                if (num > Math.Sqrt(i))
                    break;
            }

            if (divisible == false)
                primes.Add(i);
        }
        return primes;
    }

It's basically the same, but I've added the "break on Sqrt" suggestion and changed some of the variables around to make it fit better for me. (I was working on Euler and needed the 10001th prime)

Pat
+1  A: 

Sieves are your best friend. My code is at http://im.jabagawee.com/pb/03434878.txt, but it doesn't get the first 10,000 primes. It should get about 10,500 of them though, so I think it'd work well for you.

Andrew Szeto
A: 

What's the expectation if you get asked this in an interview? That you've gone through the books and are familiar with the theory? Could people be honestly expected to come up with such an analysis and algorithm in a short interview if they were not already familiar with the theory?

Sumit Kishore
+1  A: 

The Sieve seems to be the wrong answer. The sieve gives you the primes up to a number N, not the first N primes. Run @Imran or @Andrew Szeto, and you get the primes up to N.

The sieve might still be usable if you keep trying sieves for increasingly larger numbers until you hit a certain size of your result set, and use some caching of numbers already obtained, but I believe it would still be no faster than a solution like @Pat's.

Sumit Kishore
+1  A: 

Here is a Sieve of Eratosthenes that I wrote in PowerShell a few days ago. It has a parameter for identifying the number of prime numbers that should be returned.

#
# generate a list of primes up to a specific target using a sieve of eratosthenes
#
function getPrimes { #sieve of eratosthenes, http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
    param ($target,$count = 0)
    $sieveBound = [math]::ceiling(( $target - 1 ) / 2) #not storing evens so count is lower than $target
    $sieve = @($false) * $sieveBound
    $crossLimit = [math]::ceiling(( [math]::sqrt($target) - 1 ) / 2)
    for ($i = 1; $i -le $crossLimit; $i ++) {
        if ($sieve[$i] -eq $false) {
            $prime = 2 * $i + 1
            write-debug "Found: $prime"
            for ($x = 2 * $i * ( $i + 1 ); $x -lt $sieveBound; $x += 2 * $i + 1) {
                $sieve[$x] = $true
            }
        }
    }
    $primes = @(2)
    for ($i = 1; $i -le $sieveBound; $i ++) {
        if($count -gt 0 -and $primes.length -ge $count) {
            break;
        }
        if($sieve[$i] -eq $false) {
            $prime = 2 * $i + 1
            write-debug "Output: $prime"
            $primes += $prime
        }
    }
    return $primes
}
spoon16
+1  A: 

alt text

Vadakkumpadath
The Sieve of Eratosthenes is a simple algorithm for finding all prime numbers up to a specified integer. It is the predecessor to the modern Sieve of Atkin, which is faster but more complex. The Sieve of Eratosthenes was created in the 3rd century BC by Eratosthenes, an ancient Greek mathematician.
Vadakkumpadath
A: 

In Python

import gmpy
p=1
for i in range(10000):
    p=gmpy.next_prime(p)
    print p 
gnibbler