views:

525

answers:

6

It's an implementation of Sieve of Eratosthenes.

class PrimeGenerator
  def self.get_primes_between( x, y)
    sieve_array = Array.new(y) {|index|
      (index == 0 ? 0 : index+1)
    }

    position_when_we_can_stop_checking = Math.sqrt(y).to_i
    (2..position_when_we_can_stop_checking).each{|factor|
      sieve_array[(factor).. (y-1)].each{|number|
        sieve_array[number-1] = 0 if isMultipleOf(number, factor)
      }
    }

    sieve_array.select{|element| 
      ( (element != 0) && ( (x..y).include? element) )
    }
  end
  def self.isMultipleOf(x, y)
    return (x % y) == 0
  end
end

Now I did this for a 'submit solutions to problems since you have time to kill' site. I chose ruby as my impl language.. however i was declared timed out. I did some benchmarking

require 'benchmark'
Benchmark.bmbm do |x|
  x.report ("get primes") { PrimeGenerator.get_primes_between(10000, 100000)}
  end

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]

L:\Gishu\Ruby>ruby prime_generator.rb
Rehearsal ----------------------------------------------
get primes  33.953000   0.047000  34.000000 ( 34.343750)
------------------------------------ total: 34.000000sec

                 user     system      total        real
get primes  33.735000   0.000000  33.735000 ( 33.843750)

ruby 1.8.6 (2007-03-13 patchlevel 0) [i386-mswin32]

Rehearsal ----------------------------------------------
get primes  65.922000   0.000000  65.922000 ( 66.110000)
------------------------------------ total: 65.922000sec

                 user     system      total        real
get primes  67.359000   0.016000  67.375000 ( 67.656000)

So I redid the thing in C# 2.0 / VS 2008 --> 722 milliseconds

So now this prods me into thinking is it a problem with my implementation or is the perf diff between languages this wide? (I was amazed with the 1.9 Ruby VM... until I had to go compare it with C# :)

UPDATE: Turned out to be my "put-eratosthenes-to-shame-adaptation" after all :) Eliminating unnecessary loop iterations was the major optimization. In case anyone is interested in the details.. you can read it here; this question is too long anyways.

+2  A: 

The Sieve of Eratosthenes works fine as illustrative way to find primes, but I would implement it a little bit different. The essence is that you don't have to check numbers which are multiples of already known primes. Now, instead of using an array to store this information, you can also create a list of all sequential primes up to the square root of the number you are checking, and then it suffices to go through the list of primes to check for primes.

If you think of it, this does what you do on the image, but in a more "virtual" way.

Edit: Quickly hacked implementation of what I mean (not copied from the web ;) ):

  public class Sieve {
    private readonly List<int> primes = new List<int>();
    private int maxProcessed;

    public Sieve() {
      primes.Add(maxProcessed = 2); // one could add more to speed things up a little, but one is required
    }

    public bool IsPrime(int i) {
      // first check if we can compare against known primes
      if (i <= primes[primes.Count-1]) {
        return primes.BinarySearch(i) >= 0;
      }
      // if not, make sure that we got all primes up to the square of i
      int maxFactor = (int)Math.Sqrt(i);
      while (maxProcessed < maxFactor) {
        maxProcessed++;
        bool isPrime = true;
        for (int primeIndex = 0; primeIndex < primes.Count; primeIndex++) {
          int prime = primes[primeIndex];
          if (maxProcessed % prime == 0) {
            isPrime = false;
            break;
          }
        }
        if (isPrime) {
          primes.Add(maxProcessed);
        }
      }
      // now apply the sieve to the number to check
      foreach (int prime in primes) {
        if (i % prime == 0) {
          return false;
        }
        if (prime > maxFactor) {
          break;
        }
      }
      return true;
    }
  }

Uses about 67ms on my slow machine.... test app:

class Program {
 static void Main(string[] args) {
  Stopwatch sw = new Stopwatch();
  sw.Start();
  Sieve sieve = new Sieve();
  for (int i = 10000; i <= 100000; i++) {
   sieve.IsPrime(i);
  }
  sw.Stop();
  Debug.WriteLine(sw.ElapsedMilliseconds);
 }
}
Lucero
+4  A: 

I'd start by looking at your inner loop. sieve_array[(factor).. (y-1)] is going to create a new array each time it's executed. Instead, try replacing it with a normal indexing loop.

Dave Ray
This shaved off 8 secs off my time.
Gishu
A: 

I would also note that Ruby, in my experience, is a lot slower on Windows systems than on *nix. I'm not sure what speed processor you have, of course, but running this code on my Ubuntu box in Ruby 1.9 took around 10 seconds, and 1.8.6 took 30.

Greg Campbell
A: 

Benchmark it with ruby-prof. it can spit out things tools like kcachegrind can look at to see where your code is slow.

Then once you make the ruby fast, use RubyInline to optimize the method for you.

Daniel Huckstep
+4  A: 

Obviously each computer is going to benchmark this differently, but I was able to make this run approximately 50x faster on my machine (Ruby 1.8.6) by removing the looping on the array with an each block, and by causing the inner loop to check less numbers.

factor=2
while factor < position_when_we_can_stop_checking 
    number = factor
    while number < y-1
      sieve_array[number-1] = 0 if isMultipleOf(number, factor)
      number = number + factor; # Was incrementing by 1, causing too many checks
    end
  factor = factor +1
end
mmc
Yes. jumping in steps of factor was another time-saver.
Gishu
+2  A: 

I don't know how it compares for speed, but this is a fairly small and simple SoE implementation that works fine for me:

def sieve_to(n)
  s = (0..n).to_a
  s[0]=s[1]=nil
  s.each do |p|
    next unless p
    break if p * p > n
    (p*p).step(n, p) { |m| s[m] = nil }
  end
  s.compact
end

There are a few further little speedups possible, but I think it's pretty good.

They're not exactly equivalent, so your 10_000 to 1_000_000 would equate to

sieve_to(1_000_000) - sieve_to(9_999)

or something closely approximate.

Anyhow, on WinXP, with Ruby 1.8.6 (and fairly hefty Xeon CPUs) I get this:

require 'benchmark'
Benchmark.bm(30) do |r|
  r.report("Mike") { a = sieve_to(10_000) - sieve_to(1_000) } 
  r.report("Gishu") { a = PrimeGenerator.get_primes_between( 1_000, 10_000) }
end

which gives

                                    user     system      total        real
Mike                            0.016000   0.000000   0.016000 (  0.016000)
Gishu                           1.641000   0.000000   1.641000 (  1.672000)

(I stopped running the one million case because I got bored waiting).

So I'd say it was your algorithm. ;-)

The C# solution is pretty much guaranteed to be orders of magnitude faster though.

Mike Woodhouse
Thank you. Understanding your snippet was fun.. felt the joy of being a programmer again. Also its brevity is inline with the Ruby way
Gishu
Also..Any reason why you picked sieve_to(y)-sieve_to(x-1) to get the results instead of sieve_to(y).select{|prime| (prime >= x)} ?
Gishu
Thx. I've been playing with Project Euler: several problems need large lists of primes, so I've tuned about as far as I'm able.Um. The sieve_to() - sieve_to() thing was literally the first thing that came into my head when I noticed between(x,y). Yours might be better for large x. Alternatively, change the sieve_to to compact the array from x upwards. :)
Mike Woodhouse