From the Google Open Source Blog:

PyPy is a reimplementation of Python in Python, using advanced techniques to try to attain better performance than CPython. Many years of hard work have finally paid off. Our speed results often beat CPython, ranging from being slightly slower, to speedups of up to 2x on real application code, to speedups of up to 10x on small benchmarks.

How is this possible? Which Python implementation was used to implement PyPy? CPython? And what are the chances of a PyPyPy or PyPyPyPy beating their score?

(On a related note... why would anyone try something like this?)

+7  A: 

PyPy is implemented in Python, but it implements a JIT compiler to generate native code on the fly.

The reason to implement PyPy on top of Python is probably that it is simply a very productive language, especially since the JIT compiler makes the host language's performance somewhat irrelevant.

Marcelo Cantos
Does the JIT generate Python code running at the same level as PyPy, or does it generate real native code running at the level of whichever Python implementation PyPy is running on?
Edmund
Real native code (see [here](http://pypy.org/download.html#with-a-jit-compiler)); 32-bit x86 code to be precise.
Marcelo Cantos
+18  A: 

Q1. How is this possible?

Manual memory management (which is what CPython does with its reference counting) can be slower than automatic management in some cases.
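As a rough illustration of the bookkeeping that reference counting implies, here is a minimal sketch. It assumes CPython, since `sys.getrefcount` is CPython-specific and may not exist on other implementations. The helper name `refcount_delta` is made up for this example.

```python
import sys


def refcount_delta():
    """Return how much an object's reference count grows when one
    extra reference to it is stored (CPython only)."""
    obj = object()
    holders = []
    before = sys.getrefcount(obj)
    holders.append(obj)  # storing a new reference updates the count
    after = sys.getrefcount(obj)
    return after - before
```

Every stored or dropped reference triggers an update like this one, which is the per-operation cost a tracing collector avoids.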

Limitations in the implementation of the CPython interpreter preclude certain optimisations that PyPy can do (e.g. fine-grained locks).

As Marcelo mentioned, there is the JIT. Being able to confirm the type of an object on the fly can save multiple pointer dereferences on the way to the method you want to call.
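The idea of checking a type once and then skipping the lookup can be sketched in plain Python. This is only a toy model of a type guard with a cached lookup, not how PyPy's JIT is actually implemented. The class `CallSite` and its attributes are hypothetical names invented for the illustration.

```python
class CallSite:
    """Toy call site that caches a method lookup per receiver type."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_type = None
        self.cached_func = None
        self.slow_lookups = 0  # how often the full lookup was needed

    def call(self, receiver, *args):
        tp = type(receiver)
        if tp is not self.cached_type:
            # Guard failed: do the full attribute lookup and remember it.
            self.cached_func = getattr(tp, self.method_name)
            self.cached_type = tp
            self.slow_lookups += 1
        # Guard passed: reuse the cached function without a new lookup.
        return self.cached_func(receiver, *args)
```

If a `CallSite("upper")` is called with several strings in a row, only the first call pays for the lookup. The later calls pass the cheap type check and go straight to the cached function.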

Q2. Which Python implementation was used to implement PyPy?

The PyPy interpreter is implemented in RPython, which is a statically typed subset of Python (the language, not the CPython interpreter). Refer to http://codespeak.net/pypy/trunk/pypy/doc/architecture.html for details.
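To give a feel for what "statically typed subset" means, here is a small sketch. Both functions are ordinary Python and run on any interpreter. The function names are invented for this example, and the comments describe my understanding of the RPython restriction that a variable must keep one consistent type at each point in the control flow.

```python
def rpython_friendly(n):
    # 'total' is an integer on every path, so its type can be inferred.
    total = 0
    for i in range(n):
        total += i
    return total


def not_rpython_friendly(flag):
    # 'result' is an int on one branch and a str on the other. Plain
    # Python accepts this, but a static type inferencer cannot assign
    # a single type to 'result' where the two branches merge.
    if flag:
        result = 1
    else:
        result = "one"
    return result
```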

Q3. And what are the chances of a PyPyPy or PyPyPyPy beating their score?

That would depend on the implementation of these hypothetical interpreters. If one of them, for example, took the source, did some kind of analysis on it, and converted it directly into tight, target-specific assembly code after running for a while, I imagine it would be quite a bit faster than CPython.

Q4. Why would anyone try something like this?

From the official site (http://codespeak.net/pypy/trunk/pypy/doc/architecture.html#id7):

We aim to provide:

  • a common translation and support framework for producing implementations of dynamic languages, emphasising a clean separation between language specification and implementation aspects.
  • a compliant, flexible and fast implementation of the Python Language using the above framework to enable new advanced features without having to encode low level details into it.

By separating concerns in this way, we intend for our implementation of Python - and other dynamic languages - to become robust against almost all implementation decisions, including target platform, memory and threading models, optimizations applied, up to the point of being able to automatically generate Just-in-Time compilers for dynamic languages.

Conversely, our implementation techniques, including the JIT compiler generator, should become robust against changes in the languages implemented.

The C compiler gcc is implemented in C, and the Haskell compiler GHC is written in Haskell. Do you have any reason for the Python interpreter/compiler not to be written in Python?

Noufal Ibrahim