ansaurus

Question

Why is my Python version slower than my Perl version?

Answer 1

+4 A:

Python is not particularly fast at numeric computations and I'm sure it's slower than perl when it comes to text processing.

Since you're an experienced Perl hand, I don't know if this applies to you, but Python programs tend to be more maintainable in the long run and are quicker to develop. The speed is 'enough' for most situations and you have the flexibility to drop down into C when you really need a performance boost.

Update

Okay. I just created a large file (1GB) with random data in it (mostly ASCII) and broke it into lines of equal length. This was supposed to simulate a log file.

I then ran simple Perl and Python programs that search the file line by line for an existing pattern.

With Python 2.6.2, the results were

real    0m18.364s
user    0m9.209s
sys     0m0.956s

and with Perl 5.10.0

real    0m17.639s
user    0m5.692s
sys     0m0.844s

The programs are as follows (please let me know if I'm doing something stupid)

import re
regexp = re.compile("p06c")

def search():
    with open("/home/arif/f") as f:
        for i in f:
            if regexp.search(i):
                print "Found : %s"%i

search()

and

sub search() {
  open FOO,"/home/arif/f" or die $!;
  while (<FOO>) {
    print "Found : $_\n" if /p06c/o;
  }
}

search();

The results are pretty close, and tweaking it one way or another doesn't seem to alter the results much. I don't know if this is a true benchmark, but I think it's the way I'd search log files in the two languages, so I stand corrected about the relative performance.
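For anyone who wants to repeat this comparison on a smaller scale, here is a rough sketch in Python 3 syntax (not the Python 2.6 used above). The in-memory list of random lines is a hypothetical stand-in for the 1GB file, and the function names are made up for illustration. It times the compiled regexp against a plain substring test, so the second variant only applies to a fixed string like p06c, not to a real pattern.

```python
import random
import re
import string
import timeit

# Hypothetical stand-in for the 1GB file: a small list of random 80-char lines.
random.seed(0)
alphabet = string.ascii_lowercase + string.digits
lines = ["".join(random.choice(alphabet) for _ in range(80)) for _ in range(2000)]
# Plant the search string in one line so there is at least one hit.
lines[1000] = lines[1000][:40] + "p06c" + lines[1000][44:]

regexp = re.compile("p06c")

def with_regex():
    # Same test as the program above: compiled regexp, one search per line.
    return [line for line in lines if regexp.search(line)]

def with_in():
    # Plain substring test; only equivalent because the pattern is a fixed string.
    return [line for line in lines if "p06c" in line]

print("regex:", timeit.timeit(with_regex, number=20))
print("in:   ", timeit.timeit(with_in, number=20))
```

The absolute numbers will depend on the machine and the Python version, so treat them as a relative comparison only.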

Thanks Chris.

Noufal Ibrahim 2009-12-31 10:41:57

I disagree with the "it's slower than perl when it comes to text processing" statement. Before you say that, you should provide your own benchmark results as justification.

ghostdog74 2009-12-31 10:59:38

"Python programs in the long run tend to be more maintainable and are quicker to develop." [citation needed] (Python doesn't even allow you to predeclare variables. One typo during an assignment, and you have a very hard-to-find bug.)

jrockway 2009-12-31 13:00:34

I don't have benchmarks. Perl was originally written for text processing and a lot of the design decisions and optimisations were made specifically for that. I expect it to be faster and I think that's the case. As for maintainability, the "one way and preferably only one way" methodology tends to have a uniforming effect on code which helps readability much more than TMTOWTDI.

Noufal Ibrahim 2009-12-31 14:17:55

"I expect it to be faster and I think that's the case." Well, never mind providing evidence, then! Expectations and hunches are almost never proven wrong.

Chris B. 2009-12-31 15:37:15

You're right. I'll do some quick benchmarks with arbitrary text and post the results. Let's see if my hunch is valid. My comment on the readability still stands though.

Noufal Ibrahim 2009-12-31 15:54:08

If it helps any - the language shootout usually shows python taking twice the average time of perl on regex tests. http://shootout.alioth.debian.org/u32q/python.php

mozillalives 2009-12-31 16:30:15

regex is not needed with your Python example. Use the "in" operator.

ghostdog74 2009-12-31 16:52:26

Python is only more readable if you're Perl-illiterate. ;)

fennec 2009-12-31 17:56:50

@ghostdog74. I wanted to use regexps in both places. Assume a pattern rather than a fixed string.

Noufal Ibrahim 2009-12-31 18:25:21

@mozillalives Did you notice that the Perl program used quad-core and the Python program didn't? Look at regex here - http://shootout.alioth.debian.org/u32/python.php#faster-programs-measurements

igouy 2009-12-31 19:18:56

Good grief, what a clamor in the comments. You Perl fans, okay, you love Perl, we get it. I love Python, and I do not like Perl, but I am not going to snipe at favorable comments on Perl. If Perl is what you like, use that, and especially use Perl for things that it is really good at. IMHO, writing large complex systems that need to be maintained is *not* one of the things Perl is good at, but grepping a large file *is*.

steveha 2009-12-31 22:04:38

On many platforms, older perls would have done better; starting in 5.10, usefaststdio (formerly the default on many platforms, and which causes perl to interact in encapsulation-breaking ways with libc's FILE structs in the interest of blinding speed) defaults to off, and the slower perlio abstraction layer replaces use of stdio.

ysth 2010-01-01 01:10:16

@steveha: I'm not seeing any comments that I'd label as sniping, except at oversimplified "benchmarks", regardless of language.

ysth 2010-01-01 01:11:49

Answer 2

+6 A:

Python runs very fast if you use the correct syntax of the Python language. This style is roughly described as "pythonic".

If you restructure your code like this, it will run at least twice as fast (well, it does on my machine):

j = 0
for i in range(10000000):
    j = j + 1
print j

Whenever you use a while in python, you should check if you could also use a "for X in range()".
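A rough way to check this on your own machine is a timeit sketch like the following. It is written for Python 3, where range is lazy like Python 2's xrange. The while version is only a guess at what a counter loop of that style looks like, not the OP's actual code, and N is kept small so the run stays short.

```python
import timeit

N = 100000  # far fewer iterations than in the question, to keep the run short

def count_with_while():
    # Assumed shape of a hand-written counter loop.
    i = 0
    j = 0
    while i < N:
        j += 1
        i += 1
    return j

def count_with_for():
    # Same count, with the iteration handled by range().
    j = 0
    for _ in range(N):
        j += 1
    return j

print("while:", timeit.timeit(count_with_while, number=10))
print("for:  ", timeit.timeit(count_with_for, number=10))
```

Both functions return the same count; only the way the loop is driven differs.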

FlorianH 2009-12-31 10:42:57

The speed of that will heavily depend on whether you are on Python 2 or Python 3.

Mark Byers 2009-12-31 10:44:21

Wow, that runs over 10 times faster than the OP's code on my machine!

quamrana 2009-12-31 10:49:27

You are right: If you are on Python 2, you should use xrange, I guess. A quick run on my laptop took 27.7 seconds with "range" and 21.4 seconds with "xrange". That is no proper benchmark though.

FlorianH 2009-12-31 10:50:35

Replacing `range` with `xrange` should help a little more and I think `j+=1` will boost it a little more.

Noufal Ibrahim 2009-12-31 10:51:22

And do you consider this "using the 'right' syntax" a normal behavior for a 'programming' language?

Andrei Ciobanu 2009-12-31 12:32:17

@Nomemory: Are you saying that in programming languages you should be able to use wrong syntax and still get valid or optimal results? Sounds like magic. And in any language (that I know) using a while loop in a situation where a for loop is called for is considered wrong. Or at least awkward.

Tofystedeth 2009-12-31 14:52:51

Typically, using a screwdriver for screws and a hammer for nails is a normal behavior for a tool user.

ΤΖΩΤΖΙΟΥ 2010-01-19 23:54:13

Answer 3

+2 A:

Python is slower than Perl. It may be faster to develop in, but it doesn't execute faster. Here is one benchmark: http://xodian.net/serendipity/index.php?/archives/27-Benchmark-PHP-vs.-Python-vs.-Perl-vs.-Ruby.html

Edit: a terrible benchmark, but it is at least a real benchmark with numbers and not some guess. Too bad there's no source or test other than a loop.

acidzombie24 2009-12-31 10:47:54

don't believe such benchmarks.

ghostdog74 2009-12-31 11:42:15

You can't very well demand benchmarks in a different comment, then dismiss them out of hand in this one. Perl is faster than Python for many tasks. Python's faster than Perl in many tasks as well. What's important is that they're both generally within the ballpark of one another, and close enough that larger decisions about algorithms and architecture will determine which program runs faster.

Chris B. 2009-12-31 15:41:18

The time taken to print "hello world" really isn't interesting!

igouy 2009-12-31 19:15:52

I know it isn't interesting, but at least the loop was real and an example. I just said one benchmark, not a good benchmark to look at. It shows at least one time and memory comparison. I never expected anyone to think Python is faster than Perl.

acidzombie24 2009-12-31 21:47:09

"at least the loop was real" - it isn't any more "real" than the code snippet Foobarbaz posted in the original question.

igouy 2010-01-01 17:53:33

Answer 4

+6 A:

To OP, in Python this piece of code:

j = 0
for i in range(10000000):
    j = j + 1
print j

is the same as

print range(10000001)[-1]

which, on my machine,

$ time python test.py
10000000

real    0m1.138s
user    0m0.761s
sys     0m0.357s

runs for approximately 1s. range() (or xrange) is built into Python and, "internally", it can already generate a sequence of numbers for you. Therefore, you don't have to create your own iteration with your own loop. Now go and find a Perl equivalent that can run in 1s and produce the same result.
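As a side note for readers on Python 3 (the timing above is from Python 2, where range() builds a full list): there, range is a lazy sequence, so indexing the last element does not generate ten million numbers first. A minimal sketch, assuming Python 3:

```python
import sys

# Python 3: range() returns a lazy sequence object, not a list.
r = range(10000001)

print(r[-1])             # last element, computed without building a list
print(len(r))            # length is also known without iterating
print(sys.getsizeof(r))  # size of the range object itself, in bytes
```

On Python 2 the closest equivalent is xrange, although as far as I know xrange objects there do not support all the sequence operations that Python 3's range does.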

ghostdog74 2009-12-31 11:27:45

Mine was 100000000, with 8 zeroes, not 7.

Foobarbaz 2009-12-31 12:56:57

You are reducing the already toy example ad absurdum. Down this road, the perl equivalent would be "print 100000000;".

Beni Cherniavsky-Paskin 2009-12-31 13:03:14

@Foobarbaz, what I am illustrating is not how many zeroes make it slow or fast, but how to think about solving the problem to get the same result.

ghostdog74 2009-12-31 13:21:29

@Beni, I am sure you know using print 100000000 would be meaningless for this toy example. My example shows how the "looping" can be done in Python for the same result "without using loop", at least not explicitly

ghostdog74 2009-12-31 13:26:13

@Beni Cherniavsky-Paskin: Since `print 10000000` is the net effect of the perl example, that's a pretty solid indictment of this benchmark. Perhaps something from Project Euler would have been a better choice.

S.Lott 2009-12-31 16:29:21

Find a perl equivalent? Something like `foreach $i (1..10000000) { ... }`? Or `print [1..10000000]->[-1]`? Yeah, Perl's got number ranges too.

fennec 2009-12-31 17:59:10

Answer 5

+41 A:

The nit-picking answer is that you should compare it to idiomatic Python:

The original code takes 34 seconds on my machine.
A for loop (FlorianH's answer) with += and xrange() takes 21.
Putting the whole thing in a function reduces it to 9 seconds!
That's much faster than Perl (15 seconds on my machine)!
Explanation: Python local vars are much faster than globals.
(For fairness, I also tried a function in Perl - no change)
Getting rid of the j variable reduced it to 8 seconds:

print sum(1 for i in xrange(100000000))

Python has the strange property that higher-level shorter code tends to be fastest :-)
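If you want to see the global-versus-local effect for yourself, a sketch along these lines should do. It is Python 3 syntax, the function names are made up for illustration, and a global statement inside a function is used as a stand-in for code running at module level, which is an assumption on my part rather than an exact reproduction of the original script. N is much smaller than in the question.

```python
import timeit

N = 100000  # much smaller than the question's loop, to keep the run short

j = 0  # module-level (global) counter

def count_global():
    # Every read and write of j goes through the module's global namespace.
    global j
    j = 0
    for _ in range(N):
        j += 1
    return j

def count_local():
    # k is a local variable of the function.
    k = 0
    for _ in range(N):
        k += 1
    return k

def count_sum():
    # No explicit counter variable at all.
    return sum(1 for _ in range(N))

for func in (count_global, count_local, count_sum):
    print(func.__name__, timeit.timeit(func, number=10))
```

All three return the same count, so any difference in the timings comes from how the counter is stored and updated.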

But the real answer is that your "micro-benchmark" is meaningless. The real question of language speed is: what's the performance of an average real application? To know that, you should take into account:

Typical mix of operations in complex code. Your code doesn't contain any data structures, function calls, or OOP operations.
Optimization opportunities: after you write your code, IF it's not fast enough, how much faster can you easily make it?

E.g. how hard is it to offload the heavy lifting to efficient C libraries?

If you want to talk number crunching, Python IS surprisingly popular with scientists. They love it for the simple pseudo-math syntax and short learning curve, but also for the excellent numpy library for array crunching and the ease of wrapping other existing C code.

And then there is the Psyco JIT which would probably run your toy example well under 1 second, but I can't check it now because it only works on 32-bit x86.

Beni Cherniavsky-Paskin 2009-12-31 11:38:44

+1: that your "micro-benchmark" is meaningless -- and reveals more about the author of the benchmark than about the languages being benchmarked.

S.Lott 2009-12-31 12:51:14

+1: Absolutely agree on the micro-benchmark. Comparing execution times must be done on equivalent programs, which is much more difficult than it sounds.

Khelben 2009-12-31 13:35:14

+1 for mentioning Psyco - this is something the OP really should try for those benchmarks, here is the missing link: http://psyco.sourceforge.net/

Doc Brown 2009-12-31 13:40:02

"Python local vars are much faster than globals." Interesting. Do you have a good reference to explain this? How about class instance variables—are they faster than globals too?

Craig McQueen 2010-01-01 00:18:03

@Craig: I've put some more reference about that in my answer.

Roberto Liffredo 2010-01-01 18:36:42

I count it as a shortcoming in python that its bizarre way of bringing variables into existence doesn't allow you a "local variable" without a function :)

hobbs 2010-01-02 21:41:16

@hobbs: Aside from performance, why do you want local vars at top level? Namespace => del statement; Object lifetime (RAII) => not coupled to scope in Python, to assure timely release use explicit .close() etc.

Beni Cherniavsky-Paskin 2010-01-06 15:05:48

Answer 6

+6 A:

All this micro benchmarking can get a bit silly!

For example, just switching to for in both Python & Perl provides a hefty speed bump. The original Perl example would be twice as quick if for was used:

my $j = 0;

for my $i (1..100000000) {
    ++$j;
}

print $j;

And I can shave off a bit more with this:

++$j for 1..100000000;
print $j;

And getting even sillier we can get it down to 1 second here ;-)

print {STDOUT} (1..10000000)[-1];

/I3az/

ref: Perl 5.10.1 used.

draegtun 2009-12-31 11:59:55

I am interested in the last Perl version. What I got was around 5 or 6 secs.

ghostdog74 2009-12-31 12:07:22

@ghostdog74: My first two examples take 5-6 secs here so if you're seeing that with the last example then Perl and/or your machine is a lot slower than mine!

draegtun 2009-12-31 19:40:01

@ghostdog74: BTW I tried your two examples here. The last one is the quickest of all benchmarks (if xrange is used). However, the first example filled up memory and came in a whopping 16 times slower than the Perl equivalent! Using xrange was much better: it didn't fill up memory and came in just under 3 times slower than the Perl version. Thing to learn... never trust pre-installed languages! (The Python that comes with Mac OSX Tiger has always blown chunks!!)

draegtun 2009-12-31 19:56:11

The goal isn't to make the fastest program. It's to compare similar programs doing similar things. Of course the programs are faster if you do less work. That doesn't help you compare the parts you cut out though. :)

brian d foy 2010-01-01 11:25:42

Maybe, but it's kind of meaningless to speak of "similar programs doing similar things" on a microbenchmark. (Hey look my abacus is 1000 times faster than Perl at adding one to a number: I just flick a bead with my finger, and you have to type "++$j;", save it to a file, run the file, etc.)

Ken 2010-01-02 01:07:31

Answer 7

A:

is Python really that much slower than Perl?

Look at the Computer Language Benchmarks Game - "Compare the performance of ≈30 programming languages using ≈12 flawed benchmarks and ≈1100 programs".

They are only tiny benchmark programs but they still do a lot more than the code snippet you have timed -

http://shootout.alioth.debian.org/u32/python.php

igouy 2009-12-31 19:26:28

Answer 8

+1 A:

Python maintains global variables in a dictionary. Therefore, each time there is an assignment, the interpreter performs a lookup in the module dictionary, which is somewhat expensive, and this is why you found your example so much slower.

In order to improve performance, you should use local variables, for example by putting the code in a function. The Python interpreter stores local variables in an array, with much faster access.
However, it should be noted that this is an implementation detail of CPython; I suspect IronPython, for instance, would lead to a completely different result.
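One way to see this on CPython is to look at the bytecode with the standard dis module. The sketch below is Python 3, the two function names and the opnames helper are hypothetical, and the exact opcode names vary between CPython versions, so it only filters for the store instructions instead of assuming a full listing.

```python
import dis

def count_global():
    # j is declared global, so assignments go to the module dictionary.
    global j
    j = 0
    for _ in range(3):
        j = j + 1

def count_local():
    # k is an ordinary local variable.
    k = 0
    for _ in range(3):
        k = k + 1

def opnames(func):
    # Set of opcode names used in the function's bytecode.
    return {ins.opname for ins in dis.get_instructions(func)}

# Show only the store instructions each version uses.
print(sorted(n for n in opnames(count_global) if "STORE" in n))
print(sorted(n for n in opnames(count_local) if "STORE" in n))
```

The global version stores through a dictionary-backed opcode, while the local version uses the "fast" array-indexed opcodes, which matches the explanation above.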

Finally, for more information on this topic, I suggest an interesting essay by GvR on optimization in Python: Python Patterns - An Optimization Anecdote.

Roberto Liffredo 2009-12-31 22:10:22

I'd be curious to see the same benchmark run with Perl using package variables instead.

brian d foy 2010-01-01 10:26:12

Answer 9

+1 A:

I'm not up-to-date on everything with Python, but my first idea about this benchmark was the difference between Perl and Python numbers. In Perl, we have numbers. They aren't objects, and their precision is limited to the sizes imposed by the architecture. In Python, we have objects with arbitrary precision. For small numbers (those that fit in 32-bit), I'd expect Perl to be faster. If we go over the integer size of the architecture, the Perl script won't even work without some modification.
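To make the Python side of that concrete, here is a minimal sketch in Python 3 syntax. It only shows that integers keep working past the machine word size; it says nothing about speed.

```python
import sys

# sys.maxsize is the largest value of the platform's native index type.
big = sys.maxsize + 1

# Going past it does not overflow; the result is still an ordinary int.
print(type(big).__name__, big)

# Larger values work the same way.
print(2 ** 100)
```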

I see similar results for the original benchmark on my MacBook Air (32-bit) using a Perl 5.10.1 that I compiled myself and the Python 2.5.1 that came with Leopard:

However, I added arbitrary precision to the Perl program with the bignum pragma:

 use bignum;

Now I wonder if the Perl version is ever going to finish. :) I'll post some results when it finishes, but it looks like it's going to be an order of magnitude difference.

Some of you may have seen my question about What are five things you hate about your favorite language?. Perl's default numbers are one of the things that I hate. I should never have to think about it and it shouldn't be slow. In Perl, I lose on both. Note, however, that if I needed numeric processing in Perl, I could use PDL.

brian d foy 2010-01-01 11:14:15
