views:

281

answers:

5

Hey, I'm trying to generate a list of primes below 1 billion. I'm trying this, but this kind of structure is pretty shitty. Any suggestions?

a <- 1:1000000000
d<- 0
b <- for (i in a) {for (j in 1:i) {if (i %% j !=0) {d <- c(d,i)}}}
+5  A: 

Best way that I know of to generate all primes (without getting into crazy math) is to use the Sieve of Eratosthenes.

It is pretty straightforward to implement and allows you calculate primes without using division or modulus. The only downside is that it is memory intensive, but various optimizations can be made to improve memory (ignoring all even numbers for instance).

Stargazer712
+2  A: 

You can also cheat and use the primes() function in the schoolmath package :D

nico
NO! There is a bug in the primes() function in schoolmath pointed at by Neil Gunther! Read here: http://perfdynamics.blogspot.com/2010/06/primes-in-r-part-iii-schoolmath-is.htmlsee also here: http://perfdynamics.blogspot.com/2010/06/playing-with-primes-in-r-part-ii.htmland here: http://xianblog.wordpress.com/2010/06/14/bug-in-schoolmath/
Henrik
oh, good link, didn't know.
nico
+1  A: 

I recommend primegen, Dan Bernstein's implementation of the Atkin-Bernstein sieve. It's very fast and will scale well to other problems. You'll need to pass data out to the program to use it, but I imagine there are ways to do that?

Charles
@Charles : Looks like a C-implementation. With the right tweaking you can incorporate primegen in R without problem (see ?.C)
Joris Meys
@Joris: I didn't know you could do that! Thanks for the heads-up. (I was imagining calling it as an external.)
Charles
+7  A: 

This is an implementation of the Sieve of Eratosthenes algorithm in R.

sieve <- function(n)
{
   n <- as.integer(n)
   if(n > 1e6) stop("n too large")
   primes <- rep(TRUE, n)
   primes[1] <- FALSE
   last.prime <- 2L
   for(i in last.prime:floor(sqrt(n)))
   {
      primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
      last.prime <- last.prime + min(which(primes[(last.prime+1):n]))
   }
   which(primes)
}

 sieve(1000000)
gd047
This is a good implementation of the algorithm but because we're using R it's SLOW... see my response below.
John
+3  A: 

That sieve posted by gd047 is a good starting point. Here's a much faster version with running times for 1e6 primes of 0.095s as opposed to 30s for the original version.

sieve <- function(n)
{
   n <- as.integer(n)
   if(n > 1e8) stop("n too large")
   primes <- rep(TRUE, n)
   primes[1] <- FALSE
   last.prime <- 2L
   fsqr <- floor(sqrt(n))
   while (last.prime <= fsqr)
   {
      primes[seq.int(2L*last.prime, n, last.prime)] <- FALSE
      sel <- which(primes[(last.prime+1):(fsqr+1)])
      if(any(sel)){
        last.prime <- last.prime + min(sel)
      }else last.prime <- fsqr+1
   }
   which(primes)
}

Here's some alternate algorithms below coded about as fast as possible in R. They are slower than the sieve but a heck of a lot faster than the questioners original post.

Here's a recursive function that uses mod but is vectorized. It returns for 1e5 almost instantaneously and 1e6 in under 2s.

primes <- function(n){
    primesR <- function(p, i = 1){
        f <- p %% p[i] == 0 & p != p[i]
        if (any(f)){
            p <- primesR(p[!f], i+1)
        }
        p
    }
    primesR(2:n)
}

The next one isn't recursive and faster again. The code below does primes up to 1e6 in about 1.5s on my machine.

primest <- function(n){
    p <- 2:n
    i <- 1
    while (p[i] <= sqrt(n)) {
        p <-  p[p %% p[i] != 0 | p==p[i]]
        i <- i+1
    }
    p
}

BTW, the spuRs package has a number of prime finding functions including a sieve of E. Haven't checked to see what the speed is like for them.

And while I'm writing a very long answer... here's how you'd check in R if one value is prime.

isPrime <- function(x){
    div <- 2:floor(sqrt(x))
    !any(x %% div == 0)
}
John