views:

1426

answers:

8

Why does Python seem slower, on average, than C/C++? I learned Python as my first programming language, but I've only just started with C and already I feel I can see a clear difference.

A: 

C and C++ compile to native code; that is, the compiled program runs directly on the CPU. Python is an interpreted language, which means that the Python code you write must go through many, many stages of abstraction before it can become executable machine code.

DeadMG
+8  A: 

The difference between python and C is the usual difference between an interpreted (bytecode) and compiled (to native) language. Personally, I don't really see python as slow, it manages just fine. If you try to use it outside of its realm, of course, it will be slower. But for that, you can write C extensions for python, which puts time-critical algorithms in native code, making it way faster.

Femaref
s/it's/its. Interpreted vs compiled means **nothing** in terms of optimizability. JVM and C can be either interpreted or compiled. Different optimizations can be applied in either case (adaptive optimization vs compile time + LTO)
Longpoke
Python is compiled.
Paul Hankin
python compiles to bytecode, which then is interpreted. it can also be compiled to machine code, so in essence, neither of us is right.
Femaref
In addition to being not-exactly-true, this answer doesn't talk about the real problem, which @Longpoke explains reasonably well in his answer.
SamB
+28  A: 

Python is a higher level language than C, which means it abstracts the details of the computer from you - memory management, pointers, etc, and allows you to write programs in a way which is closer to how humans think.

It is true that C code usually runs 10 to 100 times faster than Python code if you measure only the execution time. However if you also include the development time Python often beats C. For many projects the development time is far more critical than the run time performance. Longer development time converts directly into extra costs, fewer features and slower time to market.

Internally the reason that Python code executes more slowly is because code is interpreted at runtime instead of being compiled to native code at compile time.

Other interpreted languages such as Java bytecode and .NET bytecode run faster than Python because the standard distributions include a JIT compiler that compiles bytecode to native code at runtime. The reason why CPython doesn't have a JIT compiler already is because the dynamic nature of Python makes it difficult to write one. There is work in progress to write a faster Python runtime so you should expect the performance gap to be reduced in the future, but it will probably be a while before the standard Python distribution includes a powerful JIT compiler.

Mark Byers
Python *is* compiled.
Paul Hankin
To be pedantic: Python is not typically compiled to *native* code at compile time. Python bytecode still must be interpreted.
Mark Byers
You haven't really explained why Python implementations tend to be so CPU-hungry. You can abstract all of the above without incurring all that much cost at runtime; it's the extremely dynamic nature of Python that eats all the CPU: all of those attribute lookups/method dispatches add up, and give even JITs a fairly hard time -- and Python is usually used without a JIT at the moment.
SamB
@SamB: I've now added a comparison to other interpreted languages to address your point. The part I wrote about abstractions was not to explain why Python is slower to run, but to explain why it can be faster to program.
Mark Byers
A: 

Other than the answers already posted, one thing is Python's ability to change things at runtime that you can't change in, for example, C. You can add member functions to classes as you go. Also, Python's dynamic nature makes it impossible to say what type of parameters will be passed to a function, which in turn makes optimizing a whole lot harder.
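
A minimal sketch of both points, assuming CPython 3. The names `Point`, `norm1` and `double` are made up for illustration and do not come from the answer.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def norm1(self):
    # Hypothetical helper: Manhattan length of the point.
    return abs(self.x) + abs(self.y)


p = Point(3, -4)

# Attach a new method to the class after an instance already exists.
Point.norm1 = norm1
print(p.norm1())       # 7


def double(value):
    # No declared parameter type: what `*` means is decided at call time.
    return value * 2


print(double(21))      # 42
print(double("ab"))    # abab
print(double([1, 2]))  # [1, 2, 1, 2]
```

Because `double` can legally receive any of these types, an optimizer cannot commit to a single machine-level multiply ahead of time.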

RPython seems to be a way of getting around the optimization problem.

Still, it probably won't be near the performance of C for number crunching and the like.

Mattias Nilsson
Why shouldn't RPython perform reasonably? Doesn't it translate fairly directly to C?
SamB
Apparently I'm not keeping myself up-to-date. There's even a benchmark out there where RPython beats gcc. The future is here already :)
Mattias Nilsson
C doesn't have classes. Did you mean c++?
Roman A. Taycher
+25  A: 

CPython is particularly slow because it has no Just in Time optimizer (since it's the reference implementation and chooses simplicity over performance in certain cases). Unladen Swallow is a project to add an LLVM-backed JIT into CPython, and achieves massive speedups. It's possible that Jython and IronPython are much faster than CPython as well, as they are backed by heavily optimized virtual machines (JVM and .NET CLR).

One thing that will arguably leave Python slower, however, is that it's dynamically typed, and there are tons of lookups for each attribute access.

For instance calling f on an object A will cause possible lookups in __dict__, calls to __getattr__, etc, then finally call __call__ on the callable object f.

With respect to dynamic typing, there are many optimizations that can be done if you know what type of data you are dealing with. For example in Java or C, if you have a straight array of integers you want to sum, the final assembly code can be as simple as fetching the value at the index i, adding it to the accumulator, and then incrementing i.

In Python, it is very hard to make code this optimal. Say you have a list subclass object of ints. Before even adding any, Python must call list.__getitem__(i), then add that to the "accumulator" by calling the accumulator's __add__(other), then repeat. Tons of alternative lookups can happen here because another thread may have altered, for example, the __getitem__ method, the dict of the list instance, or the dict of the class, between calls to add or getitem. Not to mention that incrementing i itself leads to another lookup of __add__, although this is somewhat mitigated for built-in types.
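
The following sketch makes those hidden calls visible. It assumes CPython 3; `CountingList` and `index_sum` are invented names, not part of any library.

```python
class CountingList(list):
    """Hypothetical list subclass that counts element accesses."""
    calls = 0

    def __getitem__(self, index):
        CountingList.calls += 1
        return super().__getitem__(index)


def index_sum(seq):
    total = 0
    i = 0
    while i < len(seq):
        total = total + seq[i]  # seq.__getitem__(i), then an int addition
        i = i + 1               # another int addition
    return total


data = CountingList([1, 2, 3, 4])
print(index_sum(data))     # 10
print(CountingList.calls)  # 4

# The method can be replaced at any point, so the interpreter cannot
# assume that seq[i] always means the same thing.
CountingList.__getitem__ = lambda self, index: 100
print(index_sum(data))     # 400
```

The second result shows why the loop cannot simply be compiled down to "load, add, increment": the meaning of `seq[i]` is only known at the moment it runs.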

It's also worth noting, that the primitive types such as bigint (int in Python 3, long in Python 2.x), list, set, dict, etc, etc, are what people use a lot in Python. There are tons of built in operations on these objects that are already optimized enough. For example, for the example above, you'd just call sum(list) instead of using an accumulator and index. Sticking to these, and a bit of number crunching with int/float/complex, you will generally not have speed issues, and if you do, there is probably a small time critical unit (a SHA2 digest function, for example) that you can simply move out to C (or Java code, in Jython). The fact is, that when you code C or C++, you are going to waste lots of time doing things that you can do in a few seconds/lines of Python code. I'd say the tradeoff is always worth it except for cases where you are doing something like embedded or real time programming and can't afford it.
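
One way to check the built-in advantage on your own machine is a small `timeit` comparison. This is only a sketch: the list size and repeat count are arbitrary, `manual_sum` is a made-up name, and the timings depend on the interpreter and hardware, so none are shown.

```python
import timeit

numbers = list(range(100_000))


def manual_sum(values):
    # Index-and-accumulator loop, executed bytecode by bytecode.
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
    return total


# Both approaches must agree before their speed is compared.
assert manual_sum(numbers) == sum(numbers)

loop_time = timeit.timeit(lambda: manual_sum(numbers), number=20)
builtin_time = timeit.timeit(lambda: sum(numbers), number=20)

print(f"manual loop:  {loop_time:.4f} s")
print(f"built-in sum: {builtin_time:.4f} s")
```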

Longpoke
Doesn't Unladen Swallow currently use slightly more memory? 2009 Q2 [http://code.google.com/p/unladen-swallow/wiki/Release2009Q2] results say memory increased by 10x, and 2009 Q3 [http://code.google.com/p/unladen-swallow/wiki/Release2009Q3] says they got it down by 930% (not sure how to interpret that number). It sounds like lower memory is a goal, but not achieved yet.
Brendan Long
doh, that sentence I wrote didn't even make sense anyways.
Longpoke
+3  A: 

Comparing C/C++ to Python is not a fair comparison. It is like comparing an F1 race car with a utility truck.

What is surprising is how fast Python is in comparison to its peers among other dynamic languages. While the methodology is often considered flawed, look at The Computer Language Benchmarks Game to see relative language speed on similar algorithms.

Comparisons to Perl, Ruby, and C# are more 'fair'.

drewk
I prefer to use the metaphor of a Lamborghini speeding to work 5 blocks away (non memory-safe languages) vs a street car obeying speed limits (memory-safe languages). :)
Longpoke
C# is statically typed btw, although it has optional dynamic types.
Longpoke
C seems more like a rocket car to me -- great if you want to go in a straight line and there isn't anything to crash into nearby, otherwise not so much!
SamB
@drewk - you don't seem to have even read the title correctly for that website.
igouy
@drewk - "often considered flawed" show some evidence for that innuendo.
igouy
@igouy The name "The Great Computer Language Shootout" was the name prior to Debian's sponsorship. You can still find it with Google with that name and others with a similar name like http://dada.perl.it/shootout/index.html
drewk
@igouy `often considered flawed` The site itself: http://shootout.alioth.debian.org/flawed-benchmarks.php. Each benchmark is only as good as the person who wrote it, because they are not all the same algorithm. A successful result is solely based on the output, and the language specific optimizations are only as good as the author in that language. You might have highly optimized C being compared to slapped together Perl. Not only is the language slower, the algorithm may be slower. That is interesting comparison, but flawed.
drewk
@drewk - "was the name prior to Debian's sponsorship" - Debian don't sponsor it. If you really want us to look at a website that hasn't been updated since 2001 then link to the wayback machine, otherwise please have the courtesy to use the name chosen for the website you do link to.
igouy
@drewk - The remarks you list don't come from "The site itself". The generalities you list apply to EVERY comparison between languages based on comparing programs. (And we can see from the "interesting alternatives" that it's untrue to say "a successful result is solely based on the output").
igouy
@igouy: What do you mean "hasn't been updated since 2001?" Some of the languages in the benchmark did not even exist in 2001 (C#, F#); the Intel quad core reference machine did not exist in 2001. This site is regularly updated. Help me out here: What is your issue? I am not making any sweeping condemnation of the methodology. It is interesting, useful, but flawed in some ways. What is your issue with that? Look at the Mandelbrot benchmark in Pascal, C, Perl and Python. The C program is highly optimized for 4 cores; Perl and Pascal have no optimizations at all. Big surprise: C is faster.
drewk
igouy
@drewk >> What is your issue? << My issue is you keep writing stuff that isn't correct.
igouy
@igouy: You are right: The Perl code there was updated May 2010 to multithreaded. But wait! You claimed `the website that hasn't been updated since 2001` but you point me to a page dated May, 2010. Some Perl authors must have successfully produced the required output faster than the last Perl version. O but whoops! You claim it is not based on output! What is it based on then other than the diff to the reference output that they have? Please, if you have >> constructive << comments on the post I did or the OP to increase the accuracy, please let me know.
drewk
@drewk The website that is active and frequently updated is "The Computer Language Benchmarks Game".
igouy
igouy
@drewk >> The Perl code there was updated May 2010 to multithreaded << No, it was last timed in May 2010, that multithreaded Perl mandelbrot was first timed in March 2009.
igouy
@igouy I did say "a successful result is solely based on the output" and stand behind that. You also correctly state >> this program produces the required output __but isn't included__ << in the benchmark result. If the interesting alternatives with a different time were included in the result, it would throw off the benchmark totals, correct? So what is a "successful result?" Being listed on the website or being listed and included in the language benchmark result? I use the later as "success" not the former.
drewk
@igouy After all that drama, I think your __constructive comment__ is that I put the wrong web site title --- happy to make that edit.
drewk
@drewk - after 9 days of quarrelling since I mentioned it, thank you for putting the correct web site title in your post ;-)
igouy
@drewk >> I use the later as "success" [being listed and included in the language benchmark result] << And because you sensibly think "success" means being included in the benchmark result, you are demonstrably wrong to say "a successful result is solely based on the output". If it was "solely based on the output" there would be no reason not to include that program in the benchmark result. [Do you realise I'm the guy who decides which programs to include and exclude?]
igouy
@igouy We are in screaming agreement: "success" is being listed __and__ being included in the benchmark. You brought up the "interesting alternatives" as potentially a "success" which I do not agree with.
drewk
@igouy There can only be one listing included in the benchmark result from each language class, correct? Only the fastest submission from each is included, correct? The listing is included only if it correctly produces the output, correct? When I say "a successful result is based solely on output" I cannot see why that is at odds with a) producing the required output, b) being included in the benchmark if it is the fastest submitted. I cannot see how the "interesting alternative" would be anything other than something that does produce the required output but not the fastest.
drewk
@igouy `after 9 days of quarrelling since I mentioned it, thank you for putting the correct web site title in your post` Well, the tone of your original comments certainly masked the content and could have been written in a more constructive way. If you are involved in that site, I do respect it. Also, I certainly did not mean to offend you in any way and apologize if I did. Other than the site title, I still think that my post is correct: The submission in each language is only as good as its author and the algorithm used. This is interesting, but not an objectively true comparison of raw speed.
drewk
@drewk >> I cannot see how the "interesting alternative" would be anything other than something that does produce the required output but not the fastest << I pointed you to an example "thread-ring Java #6". That program produces the correct output. That program is 60% faster than the fastest included program. That program uses 1 thread BUT the benchmark says "create 503 linked threads".
igouy
@drewk >> The submission in each language is only as good as its author and the algorithm used. << Please point to some comparison between languages based on comparing programs where that isn't true.
igouy
@igouy >>Please point to some comparison between languages based on comparing programs where that isn't true<< To paraphrase what Churchill said about democracy: it is the worst possible testing method except for all the alternatives. ;-) I used to be involved in video card benchmarking, and at first we had very specific benchmarks that were fair comparisons -- until manufacturers literally designed hardware and drivers to execute the benchmark well at the expense of overall performance. Humans like simple answers: this is faster than that. Some things escape attempts for objective simple answers.
drewk
@drewk >> Some things escape attempts for objective simple answers << "Some things" might - but the question is C and Python performance. Rather than constructive, I'd characterise your comment as vague and dismissive.
igouy
@igouy Sorry you took my comments as >> vague and dismissive << I certainly did not mean them that way at all. Certainly not dismissive and it is hard to be precise in 600 characters. Once again -- really -- I am not trying to offend you, be dismissive, etc. All I was pointing out is that each language's performance in the benchmark is only as good as the author. I cannot see the controversy in that.
drewk
@drewk >> only as good as the author << If you commented on specific inefficiencies in specific programs written by the same person, then that might be an appropriate summing-up. As it is, your comment is empty - it doesn't help us understand if the authors were not so good or whether the authors were uniformly outstanding and provided superb programs. Your comment leaves the vague impression the programs are not as good as they could be, without in any way showing that to be true - vague and dismissive. I'm not offended, I'm disappointed.
igouy
+6  A: 

Compilation vs interpretation isn't important here: Python is compiled, and it's a tiny part of the runtime cost for any non-trivial program.
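
The compilation step can be observed directly with the standard `dis` module and the built-in `compile()`. This is a small sketch assuming CPython; the function `add` is a made-up example, and the exact opcodes printed by `dis.dis` vary between versions, so they are not reproduced here.

```python
import dis


def add(a, b):
    return a + b


# By the time the function exists, its body is already a compiled code object.
code = add.__code__
print(type(code).__name__)      # code
print(len(code.co_code) > 0)    # True

# Human-readable listing of the bytecode that the interpreter loop executes.
dis.dis(add)

# compile() exposes the same step for source text held in a string.
expr = compile("6 * 7", "<example>", "eval")
print(eval(expr))               # 42
```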

The primary costs are: the lack of an integer type which corresponds to native integers (making all integer operations vastly more expensive), the lack of static typing (which makes resolution of methods more difficult, and means that the types of values must be checked at runtime), and the lack of unboxed values (which reduce memory usage, and can avoid a level of indirection).
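
A hedged sketch of these three costs, assuming CPython 3 on a typical 64-bit platform. Exact object sizes are implementation details, so only relative comparisons are printed.

```python
import sys
from array import array

small = 1
big = 2 ** 100  # no overflow: ints are arbitrary-precision objects

# Both values are full heap objects, and the larger one needs more storage.
print(type(big).__name__)                         # int
print(sys.getsizeof(big) > sys.getsizeof(small))  # True

# Operand types are checked when the operation runs, not before.
try:
    small + "a"
except TypeError as exc:
    print("checked at runtime:", exc)

# A list holds references to boxed int objects; array('q') holds raw
# 64-bit integers, which is closer to what a C array of long long stores.
boxed = list(range(1000))
unboxed = array("q", range(1000))
print(unboxed.itemsize)                           # 8
print(sum(boxed) == sum(unboxed))                 # True
```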

Not that any of these things are impossible or can't be made more efficient in Python, but the choice has been made to favor programmer convenience, flexibility, and language cleanness over runtime speed. Some of these costs may be overcome by clever JIT compilation, but the benefits Python provides will always come at some cost.

Paul Hankin
+1  A: 

Python is typically implemented as a scripting language. That means it goes through an interpreter which means it translates code on the fly to the machine language rather than having the executable all in machine language from the beginning. As a result, it has to pay the cost of translating code in addition to executing it. This is true even of CPython even though it compiles to bytecode which is closer to the machine language and therefore can be translated faster. With Python also comes some very useful runtime features like dynamic typing, but such things typically cannot be implemented even on the most efficient implementations without heavy runtime costs.

If you are doing very processor-intensive work like writing shaders, it's not uncommon for Python to be somewhere around 200 times slower than C++. If you use CPython, that time can be cut in half but it's still nowhere near as fast. With all those runtime goodies comes a price. There are plenty of benchmarks to show this and here's a particularly good one. As admitted on the front page, the benchmarks are flawed. They are all submitted by users trying their best to write efficient code in the language of their choice, but it gives you a good general idea.

I recommend you try mixing the two together if you are concerned about efficiency: then you can get the best of both worlds. I'm primarily a C++ programmer but I think a lot of people tend to code too much of the mundane, high-level code in C++ when it's just a nuisance to do so (compile times as just one example). Mixing a scripting language with an efficient language like C/C++ which is closer to the metal is really the way to go to balance programmer efficiency (productivity) with processing efficiency.

By your definition, Java is a scripting language too? Both languages have some sort of byte-code that is executed in a virtual machine. Only difference is, python compiles on-the-fly when needed, which Java normally doesn't do.
Mattias Nilsson
@Mattias: No; while you are correct that both use bytecode, Java compiles bytecode into native machine language (either in advance or with a JIT compiler) prior to execution. In some cases, Java bytecode is the native machine language of certain microprocessors. CPython, on the other hand, is a strict bytecode *interpreter*. It does all that translation work on the fly, which is why it is often about twice as fast as other Python implementations but still not nearly as fast as Java.
Also CPython does not compile-on-the-fly so much as interpret bytecode on the fly. Typical implementations are going to re-translate the same bytecode over and over, like if you have a loop. That's why it's still considered a bytecode interpreter rather than a compiler. CPython does *compile* .py files to .pyc on the fly (Python to bytecode), but that's a totally different thing. pyc is just code that's easier for the interpreter to read and translate, but it's still interpreted. The Java approach is more of a hybrid and it's not just because of the bytecode, but what it does with it.
Okay, we're getting into questions of how you define words. According to this, java is interpreted (or was): http://java.sun.com/docs/overviews/java/java-overview-1.html But yeah, you have a point about python code not being translated to native machine code like Java. Unless of course, you use psyco with CPython, which generates machine code. Or unless you run Java byte code interpreted, which is also possible. That of course means you can't say something about a specific language without also specifying how the program is executed.
Mattias Nilsson
That's true, but there's also nothing to stop people from executing C code through an interpreter in which case someone could then say C is an interpreted language. For practical purposes, we describe languages as being interpreted or not based on typical implementations. From a strict language standpoint, languages are neither interpreted nor compiled.
igouy
@stinky472 >> As admitted on the front page, the benchmarks are flawed. << Have you read the "Flawed Benchmarks or Broken Thinking?" page?
igouy
@igouy Yes but if we get so pedantic, there's nothing that makes Python slower than C. If someone were to put enough effort into it, they might be able to come up with something comparable in performance, but that just typically doesn't happen. When you have a language that's dynamically typed with mechanisms that can only be implemented at runtime like introspection, there is going to be a runtime cost for it and even the best implementers have not been able to make this cost completely negligible. Maybe some day they'll find a revolutionary new way.
@igouy "As admitted on the front page, the benchmarks are flawed." No, you don't even have to look that far. Just read what I wrote in the answer where I pointed that out before posting the link. I'm also unsure as to where you are going with this. My point is that, for all practical purposes, if you were to, say, write a raytracer in existing Python implementations, no matter how good you are, you're not going to get comparable speeds to Pixar's Renderman.
@stinky472 >> Yes but if we get so pedantic << I was just providing a concrete example that substantiates your "there's also nothing to stop people from executing C code through an interpreter".
igouy
@stinky472 >> I'm also unsure as to where you are going with this << Perhaps what you mean by "flawed benchmarks" is different from the discussion on that webpage.
igouy
@igouy ah yes. Sorry, I thought maybe you were pointing out a problem with my answer but I wasn't sure exactly what it was.