views:

331

answers:

6

The following code in python takes very long to run. (I couldn't wait until the program ended, though my friend told me for him it took 20 minutes.)

But the equivalent code in Java runs in approximately 8 seconds and in C it takes 45 seconds.

I expected Python to be slow but not this much, and in case of C which I expected to be faster than Java was actually slower. Is the JVM using some loop unrolling technique to achieve this speed? Is there any reason for Python being so slow?

import time
st=time.time()
for i in xrange(0,100000):
    for j in xrange(0,100000):
        continue;
print "Time taken : ",time.time()-st
+4  A: 

Is there any reason for Python being so slow?

Yes.

But what does it matter? You've created 100,000 xrange objects. Why? What does that matter? What is your real question on performance? What algorithm do you actually have that's actually too slow?


for i in xrange(0,100000): # Creates one xrange object
    for j in xrange(0,100000): # Creates a fresh xrange object each time through the loop
S.Lott
@Emil: What does "u" mean? What does "i" mean? Please use English. Not all of us read whatever shorthand you're using.
S.Lott
+1  A: 

A good compiler would optimise away the loop.

Assuming the loop isn't optimised away, I'd expect Python to be something like 100 times slower than the C version

gnibbler
+9  A: 

Your test is not measuring anything meaningful.

A language's performance in the real world has little to do with how quickly it executes a tight loop.

Frankly, I'm intrigued that C and Java took as long as they did; I would have expected both of their compilers to realize that there was nothing happening inside the inner loop, and have optimized both of them away into nonexistence (and 0 seconds).

Python, on the other hand, is still interpreted (I could be wrong about this). In any case, it looks like the outer loop is needing to construct 100,000 xrange objects on which to run the empty inner loop, and that's unlikely to be optimized away.

So all you're really measuring is various compilers' ability to see through the fact that no real computing work is being done.

Carl Smotricz
+1: Python is generally interpreted. There are some Python-to-C tools that would lead to more C-like performance.
S.Lott
@Carl: C by default has optimizations disabled. Java has them by default enabled. E.g. GCC makes such loops a nop already on -O1.
Dummy00001
@CarlActually i was running a program in python to solve a puzzle when i saw how slow loops in python is.The prog took 100sec in python to show result while in C it was 1-2 sec.Thats the reason why i tried the above code in different languages.
Emil
@Emil: If you're solving Project Euler problems with nested loops like the one shown above, you're doing it wrong. Project Euler problems require careful design, not compiler optimizations.
S.Lott
+2  A: 
for i in xrange(0, 10000):
    for j in xrange(0, 10000):
        pass

or

for i in xrange(0, 100000000):
    pass

Python 2.6.5 - Time taken : 8.50866484642

PyPy 1.3 - Time taken : 1.55999398232

reason of slow work not in creation of xrange objects

nazca
so what do you think could be the problem?
Decio Lira
I think there is no problem, Python is "high-level programming language" like Ruby, php, or Perl. Python faster then php ... and if you want to use it in web - it's a right way, but if want use it to write OS - it's bad idea
nazca
for example, Ruby in same test 30 times slower...
nazca
@nazca: I found your test interesting but it would be more useful if you could quote times not just for two Python environments but for both tests! There should be 4 different times to see here. I find it hard to believe there would be NO difference between the single and nested loops.
Carl Smotricz
@Carl Smotricz: This time is the same for two tests... because the creation time of 10k xrange objects ~0,04s (on my machine) ...in this test difference between the single and nested loop is 0,5 percent
nazca
Interesting! Thanks.
Carl Smotricz
+6  A: 

The lesson is: Performance is never what you expect. Therefore, always measure, never believe.

Some reasons why you might see these numbers (and from the first sentence, some of these might be completely wrong):

C is compiled for an "i586" processor (also called Pentium). That CPU was sold from 1993 to about 2000. Have you seen one lately? Guess not. So the C code isn't really optimized for what your CPU can do (or to put it another way around: Today's CPUs try very hard to be a fast Pentium CPU). Java, OTOH, is compiled for your CPU as the code is loaded. It can pull some tricks that C simply can't. The price is that the C program starts in 15ms while the Java program needs 4 Seconds.

Python has no JIT (just in time compiler), yet. All code is converted into bytecode which is then interpreted. This means the loop above is turned into a dozen bytecode instructions which are then interpreted by a C program. That just takes time. Python is not meant for huge loops, it's meant for smart algorithms which you simply can't express in any other language (at least not with the same amount of code and readability).

So just as it doesn't make sense to go shopping with a 18t truck (you can transport anything but you won't find a space to park it), chose your programming language according to the problem you need to solve. It has to be small&fast? C. Just fast? Java. Flexible? Python. Flexible&Fast: Python with a helper library in C (like NumPy).

Aaron Digulla
That reminds me of the oft-quoted saying, "There are lies, damned lies, and statistics." In this case you could say, "There are lies, damned lies, and benchmarks." -- That's one thing I learned a few years ago - most benchmarks, unless they're measuring on a useful task, are almost useless.
Wayne Werner
Actually my C compiler (gcc 4.2 + llvm) compiles for a processor called x86_64 by default. The processor architecture thing is a red herring. The real difference between the C figure and the Java figure is - as others have said - different optimisation levels.
JeremyP
+2  A: 

gcc 4.2 with the -O1 flag or higher optimize away the loop and the program takes 1 milli second to execute. This benchmark is not very representative as it is very far from any real world use. You're doing a nested loop for a reason, and you never leave it empty. Python doesn't optimize away the loop, although I see no technical reason why it couldn't.

Python is slower than C because it's further from the machine language. xrange is a nice abstraction but it adds a heavy layer of machine code compared to a simple C loop.

C source:

int main( void ){
        int i, j;
        for (i=0;i<100000;i++){
                for (j=0;j<100000;j++){
                        continue;
                }
        }
        return 0;
}
Joss82
The technical reason that it doesn't is simply the fact that python programs are executed/compiled on a line by line basis. Of course when unladen swallow (http://code.google.com/p/unladen-swallow/) is migrated in, it's *probable* that the JIT compilation process will actually optimize that away.
Wayne Werner