views:

25260

answers:

27

I've heard that Python is meant to be faster than PHP in terms of Runtime, I simply took this as a given and sat down today to make a blog post about it. After being told by Vinko Vrsalovic how to time scripts I took converted some code for getting prime numbers into Python and PHP then ran each 3 times and recorded the numbers. All times are in seconds.

Python => 144.829, 144.771, 144.862 (Average 144.8206)
PHP    => 102.783, 100.707, 100.663 (Average 101.3843)

I tried 3 different methods of storing the output in Python but they made a difference of approximately 2 seconds and when both scripts were set to output the data as soon as they got it rather than all at once the results were also only a few seconds off the above numbers.

Was all the stuff I heard about Python being faster wrong or have I done something appalling with my Python code?

Here is the Python code

#!/usr/bin/env python
primeNumbers = []
output = []

for i in xrange(2, 100000):
 divisible = False

 for number in primeNumbers:
  if i % number == 0:
   divisible = True


 if divisible == False:
  primeNumbers.append(i)
  output.append(str(i))

print ''.join(output)

And here is the PHP code

#!/usr/bin/env php
<?php
$primeNumbers = array();
$output = '';

for ($i = 2; $i < 100000; $i++)
{
 $divisible = false;

 foreach ($primeNumbers as $number)
 {
  if ($i % $number == 0)
  {
   $divisible = true;
  }
 }

 if ($divisible == false)
 {
  $primeNumbers[] = $i;
  $output .= $i;
 }
}

echo $output;
?>

All tests were run with the following command and under near identical conditions

$ time ./script.ext
+2  A: 
for i in range(2, 100000):

try replacing this with

for i in xrange(2, 100000):

range creates a list, xrange returns a generator. this will use less memory and take less time.

In python 3, range will act like xrange.

1729
Python's average time drops to 139.779, while nice it's not a killer part :)
Teifion
+40  A: 

First of all, this is a very artificial test. The only building blocks you're measuring here are loops, integer division, array addition, and output.

The two scripts you pasted are not doing the same thing; the PHP one is outputting all numbers at the end (without any separators in between, even), while the python one is outputting one line for each number as it finds it.

Thus if these are the actual scripts, you're not comparing the same code flow.

I changed your scripts to be more similar. I also added a break after the divisor test, showing that a simple algorithmic change makes both scripts an order of magnitude faster (a factor of 10 on my machine).

The new PHP script:

#!/usr/bin/env php
<?php
$primeNumbers = array();
$output = '';
$start = microtime(TRUE);

for ($i = 2; $i < 100000; $i++)
{
        $divisible = false;

        foreach ($primeNumbers as $number)
        {
                if ($i % $number == 0)
                {
                        $divisible = true;
                        break;
                }
        }

        if ($divisible == false)
        {
                $primeNumbers[] = $i;
                $output .= $i;
        }
}

echo count($primeNumbers), "\n";
echo "time: ", microtime(TRUE) - $start, "\n";
?>

The new python script:

#!/usr/bin/env python
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4

import time

start = time.time()
primeNumbers = []
output = ""

for i in xrange(2, 100000):
    divisible = False

    for number in primeNumbers:
        if i % number == 0:
            divisible = True
            break

    if divisible == False:
        primeNumbers.append(i)
        output += str(i)

print len(primeNumbers)
print 'time: %f' % (time.time() - start)

Here are the runtimes:

  • Python, 100000 items, little output, timing inside script: 15.324324, 15.923104, 15.096976
  • PHP, 100000 items, little output, timing inside script: 9.3562700748444, 9.6537330150604, 11.526440143585
  • Python, same but using xrange: 14.315210, 17.380582, 14.081702 (interestingly enough, seems faster but a wider variation)

Your basic statement, for this contrived case, still seems true - the PHP version is faster. But unless you are writing a prime number generating script, this test will not say much about speed of PHP versus Python.

As an aside - this question shows, more than anything, that it's not the runtime that matters - it's the algorithm. There are much faster algorithms to find all prime numbers up to a given number :)

Here's a faster one in Python, implementing a Sieve of Eratosthenes

#!/usr/bin/env python
# -*- Mode: Python -*-
# vi:si:et:sw=4:sts=4:ts=4

import time

start = time.time()
TOP = 100000
primes = {}
output = ""

for i in range(2, TOP):
    primes[i] = True

for i in range(2, TOP):
    if primes[i]:
        for j in range(i * 2, TOP, i):
            primes[j] = False

for i in range(2, TOP):
    if primes[i]:
        output += str(i)

print 'time: %f' % (time.time() - start)

Runtime for this script: 0.264818

I'd say that's a 60 times improvement on the 10 times improvement I got after adding the break over the original version, but with so small a runtime the margin for error is a bit too big (though it's likely it's actually in favour of the slower tests).

Calculating with an upper limit of 100 times more, this script runs in 30 seconds.

One of the reasons why this script is faster is that it runs in O(n) time, while the original version (with a nested loop) runs in O(n²) time.

The sieve version consumes more memory. For the PHP version to work, I actually had to create a custom .ini to make php use more than 32 MB. Not pasting my PHP version since I'm not really that good a PHP coder and I wouldn't want to skew results with a slow PHP version written by me.

Thomas Vander Stichele
I have now edited the Python script to the state it was when I ran the first tests so should be more akin to the PHP script. I eagerly await your results :)
Teifion
I think the Python version would be even faster if you did not do the string concatenation in the loop..
kigurai
A more concise version of Sieve of Eratosthenes in Python http://stackoverflow.com/questions/188425/project-euler-problem#193605
J.F. Sebastian
Actually no - it doesn't run in `O(n)`. It's not `n²/2` anymore, but it's still `O(n(logn)(loglogn))` just because of `for i in range(2, TOP): ... for j in range(i * 2, TOP, i):` Apparently there's a segmented version of sieve that uses `O(n)`.
viraptor
Use the append method of a list rather than string concatenation. Like this: "output.append(str(i))", and finally "''.join(output)". Strings of Python are immutable, doing that in loop, malloc and free will invoked repeatly, how can it be fast?
Victor Lin
+1  A: 

Put a break statement after the "divisible = true" bits, you'll be amazed at the time saved.

Nouveau
I will be back in 10 minutes having tried this, well done for spotting that :)
Teifion
+3  A: 

I actually did a benchmark with a few simple items, mostly to do with split strings and array manipulation. I did it with Python, PHP, Ruby and Perl.

What I wanted to accomplish was to see how much faster PHP was than Ruby, and if PHP was faster than Perl, why I did Python was just because I could :)

What I found out is that Python actually was the fastest by a large factor, and Ruby was more than twice as fast as PHP, and PHP totally failed on 1.000.000 lines while everything else ran just fine.

Here is the result

$ time php readArray.php
^C

real    5m1.741s
user    5m1.086s
sys 0m0.640s

time ruby readArray.rb
$ time ruby readArray.rb
Stærðin á fylkinu er : 1000000

real    0m0.541s
user    0m0.487s
sys 0m0.052s

time perl readArray.pl
$ time perl readArray.pl
Size of array is 1000000

real    0m0.726s
user    0m0.565s
sys 0m0.161s

$ time python readArray.py
my arr is :
1000001

real    0m0.179s
user    0m0.126s
sys 0m0.053s

Here is the code for PHP:

<?php
$file = file_get_contents ("testfile");
$fylki = split("\n", $file);

print "Count is :".count($fylki)."\n";

?>

And ruby:

a = File.read("testfile")
myarray = a.split("\n")
puts "Arraysize is : " << myarray.size.to_s

And Python:

a = open("testfile").read() # or just open("testfile").readlines() # it keeps '\n'    
myarr = a.split("\n")
print "my arr is :", len(myarr)

So my conclusion was that PHP was not as fast and powerful as the rumor is, and Ruby is not slow at all, which was exactly what I as a PHP guy did not want to see.

But being a Ruby fan this convinced me to look better at Ruby. I have no idea why I am not turning to Python, as it was the fastest by a large margin and being quite simple, I have no idea.

P.S.: Just added the Python time.

Trausti Thor Johannsson
You don't have the results for the Python script there ;)
Teifion
I've repeated your test http://stackoverflow.com/questions/62333/python-vs-php-python-runs-slower#346881
J.F. Sebastian
have you tried how much a difference there would be, if you would use the PHP-native function to do what you just did: file() ?
Henrik Paul
What about posting the Perl version.
Brad Gilbert
PHP is apparently fast enough for some of the worlds biggest websites... Flickr and Facebook for example.All these languages are in the same order of magnitude for speed. Just use whatever works.
TM
What a contrived, ignorant benchmark. The reason PHP is so slow in this benchmark is that the function split parses the pattern as a POSIX regular expression, which is slow as hell. Try using the native function file() or explode()
orlandu63
Split is the default function for splitting strings, look in any manual.This is not ignorant test at all, but simple liners to test string functions, and all scripts looking very similar using what I consider to be the default functions. My test was not to prove how slow php was, but how much faster php was against ruby, and ruby proved to be significantly faster which was just great but opposite of what I was aiming for.
Trausti Thor Johannsson
He's a troll. He is a Rubinian :) Nice try Trausti.
orokusaki
this apples-to-grapefruit "benchmark" should really either be comparing python re.split to PHP split, or python ''.split to php explode. Worst answer of the bunch, sorry.
Jimmy
Yeah this benchmark is crap, shouldn't be marked as "answer" to this question.
Infinity
+4  A: 

In summary, do the following changes to your Python code (do not output on every iteration, and generally, ignore output time), then the relevant changes to your PHP code, and then try again. Also, as suggested already, try real-life (as defined by your project needs) benchmarks.

for i in xrange(2, 100000):
    for number in primeNumbers:
        if i % number == 0:
            break
    else:
        primeNumbers.append(i)

Note: xrange, changes to the for loop, removal of output

ΤΖΩΤΖΙΟΥ
+1  A: 

Change range to xrange. More importantly do not do 'print i' in loop.

Why not print out prime number list (primenumbers) at last. Also I think you should exclude output/printing part from timing.

You can also improve both scripts by breaking out of inner for loop when you find a divisor. E.g.:

    for number in primeNumbers:
        if i % number == 0:
            divisible = True
            break
Anurag Uniyal
+4  A: 

Here is an excellent introductory article to optimization in python.

Toni Ruža
+3  A: 

The Python version was creating a list, adding every prime to it, then having to transform that list into a string.

You can make this a bit cleaner by doing..

primes = []

for i in xrange(2, 100000):
    divisible = False

    for number in primes:
        if i % number == 0:
            divisible = True
            break

    if not divisible:
        primes.append(i)

print "".join([str(x) for x in primes])

I added a break statement to both scripts, which stops it needlessly testing every possible number - as soon as we know it divides by something, there's no point testing it any further.

You can probably improve either scripts performance - really this test doesn't really prove anything. At basic stuff like looping, appending to strings etc, most languages perform pretty much the same.

Regardless of benchmark-speed, I would argue that Python is always faster, because I can write code in it faster, not because of the interpreter speed.. but that's a different matter entirely!

As an aside, if you are doing stuff like this in Python (where speed matters), I recommend a combination of three things:

  • Look into systems like Psyco, or Numpy.
  • Learn to write modules for Python in C. That way you get the benefits of rapid development - write everything in Python first, then when you see performance issues, you can get benefits of compiled language, without reinventing the entire wheel.
  • Learn/find/research and implement better algorithms, which will help far more "a faster language". In this case, you can find primes far faster than brute-forcing through several loops.
dbr
Making output a string instead of a list actually makes the script slower in Python. Python str objects are immutable, so are copied every time you append to them.
Will Harris
Also, I the python script is not appending to the list of known primes, which probably accounts for the bulk of the speedup.
Will Harris
..oops, I knew I screwed something up in the Python script.
dbr
I've removed the entire part about Python being faster. It actually makes my point clearer..
dbr
`''.join(str(x) for x in primes)` -- there is no need to create a list.
J.F. Sebastian
+1  A: 

In general PHP tends to be a little faster than Python, mainly due to most of phps functions being implemented in C, whereas much of Python is implemented in Python. People tend to talk about Python being faster in terms of development time rather than performance. If you are having performance problems, and you know where your code is slow (profile if you dont), try finding a C implementation of the library you are using (for example python-cjson instead of simplejson for json encoding/decoding), or experiment with python optimization tools like psyco.

Sean O Donnell
+1  A: 

I would also note that Python byte-compiles the input.

If your code did a lot more interesting things than loops and simple maths and employed several modules etc, you would notice a massive increase in speed over the PHP instance. Your first run of the script would be somewhat slow initially, but each subsequent execution would use the byte compiled modes. (the generated *.pyc files).

You can further optimise the byte compiled output from Python by using the -OO switch.

pobk
There will be no massive increase in speed for cpu-intensive scripts in Python. Time required to byte-compile such scripts are negligible compared to total execution time. "*.pyc" files make sense mainly for library modules (such as modules from the standard library).
J.F. Sebastian
`-OO switch` will not bring significant speed improvements. It just turns off `assert` statements and nothing more (as far as I know).
J.F. Sebastian
+1  A: 

A prime number is only divisible by 1 and itself. So

if ( (2 % $number == 0) || ($i % $number == 0) )

for any

$i <= $number

then the number is not a prime number. Therefore, as mentioned in other answers, issuing a break statement when

$i % $number == 0

is true, you save a lot of time.

jsumners
+18  A: 

There are lies, damn lies, and benchmarks. That said, the Computer Language Benchmarks Game offers a much broader response to your question.

Alec Thomas
Interesting... Python beats PHP in all but 3 tests. Those 3 tests all involve lots of arithmetic which agrees with the OP's test too. So Python appears generally faster than PHP except when doing arithmetic.
Nick Craig-Wood
There is a huge difference between PHP arithmetic and Python arithmetic. PHP arithmetic is low-level 32 or 64 bit arithmetic and Python's numbers are limited only by available memory.
Mike Korobov
Plus the fact that python numbers are objects... Python comparisons are expensive, too.
Wayne Werner
A fuller quotation is perhaps more to the point - "After all, facts are facts, and although we may quote one to another with a chuckle the words of the Wise Statesman, 'Lies--damned lies--and statistics,' still there are some easy figures the simplest must understand, and the astutest cannot wriggle out of." Leonard Henry Courtney, 1895
igouy
+2  A: 

The Python interpreter has quite a significant startup cost (more so than PHP) which you are including in your benchmark. Try timing things from within your script instead:

import time
start = time.time()
# compute numbers here
print "Took", (time.time() - start)

<?php
$start = microtime(true);
/* Compute numbers here */
print "Took " . microtime(true) - $start;
?>
Simon Willison
+1  A: 
  1. You are including interpreter startup times with your scripts. That is not fair in most cases. PHP for example is always started up in the webpage serving context (where it is most used)
  2. Also you have a different IO profile for both the scripts -- the big bottleneck in your python code is multiple IO as opposed to a single IO in the PHP Code. You need to fix that.

Please retry the benchmark without the interpreter / IO times on your own code and look at the results.

+2  A: 

PHP: 83.43 sec Python: 6.43 sec

<?php

set_time_limit(1000);

$tt=mfl();

$primeNumbers = array();
$output = '';

for ($i = 2; $i < 100000; $i++)
{
        $divisible = false;

        foreach ($primeNumbers as $number)
        {
                if ($i % $number == 0)
                {
                        $divisible = true;
                }
        }

        if ($divisible == false)
        {
                $primeNumbers[] = $i;
#                $output .= $i;
        }
}

#echo $output;

print mfl()-$tt;

function mfl() {
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}


import time
tt=time.time()


def a():
    primeNumbers = []
    output = []
    for i in xrange(2, 100000):
      divisible = False

      for number in primeNumbers:
        if i % number == 0:
          divisible = True


      if divisible == False:
        primeNumbers.append(i)
        #output.append(str(i))

    return primeNumbers

import psyco
psyco.full()

txt=''.join([str(i) for i in a()])
#print txt

print time.time()-tt
yoihj
If you break after the `divisible = True` line, the execution time seems to drop below one second..
dbr
A: 

Those 8-space indents really slow python down.

Ali A
Are you sure about that?
Teifion
Exactly, so use tabs!
Arafangion
(And yes, I've downvoted this answer)
Arafangion
And I have downvoted your sense of humour.
Ali A
@Arafangion No way man. Python's style guide says use spaces rather than tabs. @ Ali A Python's interpreter isn't slower because of spaces. The fact that Python doesn't use { } and things like that for function bodies itself causes it to be slower than if it did use them, but that's the whole point of Python: Not a huge mess like other languages.
orokusaki
Ahh, yesss... Being whitespace sensitive really does slow it down as well.
Arafangion
+2  A: 
  1. Your unmodified scripts take 200 seconds on my machine (both php and python).

    $ php --version
    PHP 5.2.6 (cli) (built: May  2 2008 18:02:07) 
    
    
    $ python --version
    Python 2.5.2
    
  2. Modified (more similar in functionality and faster (to avoid waiting)) versions take 20 seconds (php being slightly faster). Python script interpreted by jython takes 50 seconds (Jython 2.2.1 on java1.6.0_10-beta). Both versions php and python append integers and strings. There is no difference between range and xrange in this case.

    #!/usr/bin/env python
    primeNumbers = [2]
    output = '2'
    for number in range(3, 100000, 2):
        for divisor in primeNumbers:
            if number % divisor == 0:
                break
        else:
            primeNumbers.append(number)
            output += str(number)
    #
    print output
    
    
    #!/usr/bin/env php
    <?php
    $primeNumbers = array(2);
    $output = '2';
    for ($i = 3; $i < 100000; $i += 2) {
      $prime = true;
      foreach ($primeNumbers as $number) {
        if ($i % $number == 0) {
          $prime = false;
          break;
        }
      }
      if ($prime) {
        $primeNumbers[] = $i;
        $output .= $i;
      }
    }
    echo $output;
    ?>
    
  3. The third version is a quick-and-dirty (memory-hungry, unoptimized) python script which I would actually use to print concatenated primes less then 100000. It takes 0.7 seconds (0.13 seconds without printing).

    #!/usr/bin/env python
    def iprimesupto(limit):
        isprime = [False]*2 + [True]*(limit - 2)
        for n in range(limit):
            if isprime[n]:
                yield n
                for i in range(n*n, limit, n):
                    isprime[i] = False
    #
    print ''.join(str(p) for p in iprimesupto(100000))
    
J.F. Sebastian
+5  A: 

One of my friends showed this page to me and I was rather intrigued. I went to do some research and found out that it was the difference in the loops that caused the substantial difference in time. I was able to modify the two scripts and they now both run in somewhat the same amount of time. I already spent my time writing it up at my blog http://www.akngo.com/webdev/php-faster-than-python/.

In short, the difference was in the for and the foreach loop.

Anh-Kiet Ngo
PHP is 100 times slower than Python ;) http://stackoverflow.com/questions/62333/python-vs-php-python-runs-slower#346881
J.F. Sebastian
It's a good blog post.
Teifion
Very interesting, the foreach loop is one of the most valuable features to web applications in my opinion
Fire Crow
+4  A: 

I've repeated @Trausti Thor Johannsson's test.

| Language |       Time |
|          | in seconds |
|----------+------------|
| perl     |        1.9 |
| python   |        4.2 |
| ruby     |        5.0 |
| php      |      > 600 |

Input file was generated by:

$ perl -E"say for(1..1000_000)" >1M.input

All scripts resemble a php version i.e., the code is not idiomatic.

$ cat read_array.*

#!/usr/bin/env php
<?php
$text  = file_get_contents ("1M.input"); #TODO: argv support
$lines = split("\n", $text);
print "wc -l: ".count($lines)."\n";
?>

#!/usr/bin/perl -w
$filename = @ARGV == 1 ? $ARGV[0] : '1M.input';
{
    open $fh, "<", $filename or die "can't open '$filename' $!";
    undef $/;
    $text  = <$fh>; # php-like version
    @lines = split "\n", $text; 
    print "wc -l: ". @lines ."\n";
}

#!/usr/bin/env python
import sys
filename = sys.argv[1] if len(sys.argv) == 2 else '1M.input'
text = open(filename).read() # or just readlines() (it keeps '\n')
lines = text.split('\n')
print("wc -l: ", len(lines))

#!/usr/bin/env ruby
filename = ARGV.size == 1 ? ARGV[0] : '1M.input'
text  = File.read filename 
lines = text.split "\n"
puts "wc -l: " << lines.size.to_s

This test confirms yet another time: all benchmarks are evil (PHP is at least 100 times slower than python in this test, but I wouldn't go and write a blog post about it due to there are other tests that will show different picture)

J.F. Sebastian
+4  A: 

http://shootout.alioth.debian.org/u32/benchmark.php?test=all&amp;lang=all

have some good benchmarks!

Kknd
+2  A: 

Rant Part

Let us assume PHP is faster than Python.

Performance is a cost. That means, say for web applications, you will need to throw in more servers (therefore mone money).

Development is a cost as well. It takes time. And the F(c)->t function, where c is complexity and t is time needed, is different for PHP and Python. (Time requirement doesn't increase linearly with complexity)

Developers are resources, so they cost as well. Would you rather code PHP instead of Python? Would your expectations (of performance) be the same for two types of programmers? (I mean, programmers specialized in these languages)

Answer Part

To conclude Python is slower, we would need better benchmarks. Brute forcing prime numbers is not a good benchmark IMHO. Benchmarking with any other unit of language is not good as well.

These languages have a big difference in expressiveness. IMHO a performance test for some (sanely) complex task would be more suitable. It would also show that expressiveness is inversely reverse proportional to code length.

NOTE: Something tells me this will be downvoted to oblivion. I wouldn't want that, but I would agree this answer is not helpful in proportion to the question.

EDIT: Fixed formatting. 4th level heading don't show up when submitted. Shouldn't the preview show us how it will look when posted?

muhuk
That's what I thought, too... but when I was trying to format with whitespace, what it showed me in the preview was way different than the end result >.<
Wayne Werner
+1  A: 

This doesn't significantly affect the speed, but your code could be a lot simpler:

primes = []
for i in xrange(2, 100000):
    if all(i % number != 0 for number in primes):
        primes.append(i)
print ''.join(map(str, primes))

This is the same algorithm as you'd have with the proper break after divisible = True.

Darius Bacon
+1  A: 

Python is more readable in my view than PHP. :) Talking about speed, I can write much faster and way better in python then in PHP.

+2  A: 

Obligatory link to Python vs PHP at Computer Language Benchmark Game aka the Shootout.

Python vs PHP (2009-12-29)

Python programs take more or less the same time to run (usually a little bit faster, but not always), but almost always they require less memory and are much shorter.

One problem is not enough to judge.

jetxee
A: 

I only have 2 words to say: "Unladen Swallow"

orokusaki
A: 

python numbers are limited only by available memory and php numbers are limited by maxint (+-2^31 on 32bit or +-2^63 on 64bit).

So if you want fare benchmark php's 'bcmath' module should be used.

Mike Korobov
integer numbers in python are stored in normal C ints. You need to specifically declare a number as a long or put decimal points in it for it to use arbitrary precision numbers.
Phillip Whelan
You are correct only for python versions < 2.2. Python 2.2 was released about 9 years ago. Please check pep-0237.
Mike Korobov
A: 

The Python version from in the question does not represent accurately the PHP version. The xrange function makes the two scripts completely different. Python's "for" statement cannot be used to represent PHP's "for" statement, mainly because python iterates over an array while PHP is just increasing a counter. In order represent the PHP script properly in Python you have to use a while statement.

The right Python code to represent the PHP code would be:

primeNumbers = []
output = ''

i = 2
while i < 100000:
        divisible = False

        for number in primeNumbers:
                if (i % number == 0):
                        divisible = True

        if divisible == False:
                primeNumbers.append(i)
                output += str(i)

        i += 1
print output

BTW: This code runs in 90.664941 seconds in my machine. (I know this info is useless...)

Jose Cano
wrong - xrange() is not an array
Nas Banov