tags:

views:

1501

answers:

10

The question is to find the 1000th prime number. I wrote the following python code for this. The problem is, I get the right answer for the 10th , 20th prime but after that each increment of 10 leaves me one off the mark. I can't catch the bug here :(

count=1            #to keep count of prime numbers
primes=()          #tuple to hold primes
candidate=3        #variable to test for primes
while count<20:
    for x in range(2,candidate):
        if candidate%x==0:
            candidate=candidate+2
        else : pass
    primes=primes+(candidate,)            
    candidate=candidate+2
    count=count+1
print primes        
print "20th prime is ", primes[-1]

In case you're wondering, count is initialised as 1 because I am not testing for 2 as a prime number(I'm starting from 3) and candidate is being incremented by 2 because only odd numbers can be prime numbers. I know there are other ways of solving this problem, such as the prime number theorem but I wanna know what's wrong with this approach. Also if there are any optimisations you have in mind, please suggest.

Thank You

+3  A: 

I don't think this is testing what you think it's testing. It looks like you're trying to say "for each number between 2 and my candidate, check to see if the candidate is evenly divisible by that number". However, when you find a prime (candidate%x == 0), you're only incrementing the candidate-- you still need to start your "for x in ..." loop over again, since the candidate has changed.

That's what I can see from the code as written; there are of course lots of other ways and other optimizations to use here.

quixoto
+4  A: 

There is lots (!) to be improved with your Python code, but to answer your specific question:

When you have found a divisor (candidate % x == 0), you increment the candidate, but you don't do anything about x. This can lead to two problems:

  1. candidate might have a divisor, but it's smaller than any x that gets tested -- because testing in the next iteration of the loop starts with x being one higher than the value x had before; not at 2.
  2. candidate might have a divisor, but it's larger than x ever gets, because you take x from the values from 2 to the value candidate had when you started the loop.
balpha
A: 

As for optimizations, if you're definite that you want to follow this implementation, you could avoid looking at numbers that:

  1. end in 5, since they are divisible by 5.
  2. are comprised by the same digits, e.g. 22, 33, 44, 55, 66, etc., since they are divisible by 11.

Don't forget to add 5 and 11 as primes though!

Alex
it would be easy if the divisors could be a list of previously collected primes. I think I'll go with that approach. Just to get it down in code now!!
gsin
@Alex: Assuming the OP is using a binary computer, checking whether the last 10-base digit is 5 comes down to the same thing as dividing by 5 in the first place. A similar argument holds for your second point.
balpha
#2 is not really an optimization on a binary computer. #1 can be generalized to implement something called a 'wheel'. You take the first k primes, say for example 2, 3, 5, and then only try candidates that are not 0 mod any of these primes. So, mod 30 that means you only try candidates have the following remainders mod 2*3*5: [1, 7, 11, 13, 17, 19, 23, 29]. Thus, you can have something like this:primes=[2,3,5];for base in xrange(0, 10000000, 2*3*5): for rem_mod30 in [1, 7, 11, 13, 17, 19, 23, 29]: candidate = base + rem_mod30 # check if candidate is prime
GregS
A: 

Unless I'm very much mistaken, you are always adding the current candidate to the list of primes, regardless of whether a divisor has been found or not. Your code to append to the list of primes (putting aside the immutable tuple comment made earlier) is outside of the test for integer divisors, and therefore always run.

Tom Woolfrey
+2  A: 

It's good to know that every prime number bigger than 3 can be written as: 6k-1/+1.

When you are looking for the next candidate, you can always write something like this (code snippet is in C):

a=1;
...
candidate=6*k+(a=(a==-1)?1:-1);
if(a==1){
           k++;
}

And a function I've used not so long ago to determine the nth prime number, where LIM is the nth number you are looking for (C code):

int sol2(){
        int res,cnt,i,k,a;
        res=-1;
        i=1;
        cnt=3;
        k=1;
        a=1;
        while (1){
                if (util_isprime(cnt)){
                        i++;
                        if (i==LIM){
                                res=cnt;
                                break;
                        }
                }
                /* 6k+/-1 starting from 6*1-1 */
                cnt=6*k+(a=(a==-1)?1:-1);
                if(a==1){
                        k++;
                }
        }
        return res;
}
Andrei Ciobanu
+2  A: 

in the statement:

for x in range(2,candidate)

you can reduce the number of iterations by scanning up to sqrt(candidate)

If candidate is divisible by x, then we can write candidate=x*b for some b. If x is less than or equal to b, then x must be smaller than or equal to the square root of candidate

Pierre
A: 

If you want anything remotely efficient, do the Sieve of Eratosthenes - it's as simple as it's old.

MAX = 10000
candidates = [True] * MAX
candidates[0] = False
candidates[1] = False

primelist = []
for p,isprime in enumerate(candidates):
    if isprime:
        primelist.append(p)
        for n in range(2*p,MAX,p):
            candidates[n] = False

print primelist[1001]
THC4k
I have seen this, but what if I want the first 20,000 prime numbers? It is hard to pick the MAX then. Also, what if I want to go on indefinitely?
Hamish Grubijan
there are about `x/log x` primes below `x` so you can just compute the MAX. There are other algorithms to find very large primes.
THC4k
+5  A: 

There is a nice Sieve of Eratosthenes generator implementation in test_generators.py:

def intsfrom(i):
     while 1:
         yield i
         i += 1

def firstn(g, n):
     return [g.next() for i in range(n)]

def exclude_multiples(n, ints):
     for i in ints:
         if i % n:
             yield i    

def sieve(ints):
     prime = ints.next()
     yield prime
     not_divisible_by_prime = exclude_multiples(prime, ints)
     for p in sieve(not_divisible_by_prime):
         yield p

primes = sieve(intsfrom(2))

>>> print firstn(primes, 20)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71]

alt text

jbochi
Actually this code is not an implementation of the Sieve of Eratosthenes, as it computes the remainder of each candidate divided by every prime already found... The SoE performs *no divisions* at all, just addition (which is all that is necessary to cross out numbers at fixed increments). There is a nice paper by Melissa E. O'Neill -- "The Genuine Sieve of Eratosthenes" -- which discusses the difference between this type of naive implementation and a real SoE in the context of Haskell, though it is not at all Haskell-specific, or even FP-specific, and could easily be rewritten in Python.
Michał Marczyk
Continuing past the comment length limit... There's a *huge* difference in performance between a proper SoE and this kind of naive sieve, although that certainly doesn't prevent the naive sieve from being quite appropriate in the context of testing Python's generator implementation. :-)
Michał Marczyk
@Michal. Very well noticed! It's not the real Sieve indeed. Thanks.
jbochi
A: 

FYI... I solved it with the following code, though it can be optimised much much more, I just wanted to solve it in this manner first. Thanks everyone for your help.

from math import *
primes=[2,3]
count=2
testnumber=5
while count<1000:

    flag=0
    for x in range(2,testnumber):
        if x<=sqrt(testnumber):
            if testnumber%x==0:
                #print testnumber , "is not a prime"
                flag=1

            else : pass
    if flag!=1:
        #print testnumber , "is a prime"
        primes=primes+[testnumber]
        count=count+1
    testnumber=testnumber+2


#print primes
print "1000th prime is ", primes[-1]

I will now look at all the other algorithms mentioned by you all

gsin
A: 
Saikot