views:

577

answers:

6

Why python compile the source to bytecode before interpreting?

Why not interpret from the source directly?

+6  A: 

Because interpretting from bytecode directly is faster. It avoids the need to do lexing, for one thing.

Brian
+6  A: 

Because you can compile to a .pyc once and interpret from it many times.

So if you're running a script many times you only have the overhead of parsing the source code once.

Dave Webb
+23  A: 

Nearly no interpreter really interprets code directly, line by line – it's simply too inefficient. Almost all interpreters use some intermediate representation which can be executed easily. Also, small optimizations can be performed on this intermediate code.

Python furthermore stores this code which has a huge advantage for the next time this code gets executed: Python doesn't have to parse the code anymore; parsing is the slowest part in the compile process. Thus, a bytecode representation reduces execution overhead quite substantially.

Konrad Rudolph
Even the old MS BASIC on my TRS-80 used a very simple encoding scheme: as soon as I typed or edited a line, the BASIC keywords were collapsed into single bytes.
David Thornley
+4  A: 

Re-lexing and parsing the source code over and over, rather than doing it just once (most often on the first import), would obviously be a silly and pointless waste of effort.

Alex Martelli
A: 

I doubt very much that the reason is performance, albeit be it a nice side effect. I would say that it's only natural to think a VM built around some high-level assembly language would be more practical than to find and replace text in some source code string.

Edit:

Okay, clearly, who ever put a -1 vote on my post without leaving a reasonable comment to explain knows very little about virtual machines (run-time environments).

http://channel9.msdn.com/shows/Going+Deep/Expert-to-Expert-Erik-Meijer-and-Lars-Bak-Inside-V8-A-Javascript-Virtual-Machine/

John Leidegren
I didn't -1, but I will be honest that I didn't understand your point, especially this part of it: "would be more practical than to find and replace text in some source code string"
Joe Holloway
Watch the video on channel 9, or write your own VM, either two would probably explain the whole process in detail for you. What I meant by that quote is that it's sometimes easier to perform optimizations on a higher-level of abstraction than assembly. When you do this normally you work with an AST (abstract syntax tree), if you don't have one you can still perform the same optimizations but they have to move source code around, i.e. find and replace and that's really impractical, execution environments tend to go with other intermediate representation for this reason (see three-address code).
John Leidegren
The byte-code is just a more efficent compact and practical representation of the AST.
John Leidegren
+1  A: 

Although there is a small efficiency aspect to it (you can store the bytecode on disk or in memory), its mostly engineering: it allows you separate parsing from interpreting. Parsers can often be nasty creatures, full of edge-cases and having to conform to esoteric rules like using just the right amount of lookahead and resolving shift-reduce problems. By contrast, interpreting is really simple: its just a big switch statement using the bytecode's opcode.

Paul Biggar