Why python compile the source to bytecode before interpreting?
Why not interpret from the source directly?
Why python compile the source to bytecode before interpreting?
Why not interpret from the source directly?
Because interpretting from bytecode directly is faster. It avoids the need to do lexing, for one thing.
Because you can compile to a .pyc
once and interpret from it many times.
So if you're running a script many times you only have the overhead of parsing the source code once.
Nearly no interpreter really interprets code directly, line by line – it's simply too inefficient. Almost all interpreters use some intermediate representation which can be executed easily. Also, small optimizations can be performed on this intermediate code.
Python furthermore stores this code which has a huge advantage for the next time this code gets executed: Python doesn't have to parse the code anymore; parsing is the slowest part in the compile process. Thus, a bytecode representation reduces execution overhead quite substantially.
Re-lexing and parsing the source code over and over, rather than doing it just once (most often on the first import
), would obviously be a silly and pointless waste of effort.
I doubt very much that the reason is performance, albeit be it a nice side effect. I would say that it's only natural to think a VM built around some high-level assembly language would be more practical than to find and replace text in some source code string.
Edit:
Okay, clearly, who ever put a -1 vote on my post without leaving a reasonable comment to explain knows very little about virtual machines (run-time environments).
Although there is a small efficiency aspect to it (you can store the bytecode on disk or in memory), its mostly engineering: it allows you separate parsing from interpreting. Parsers can often be nasty creatures, full of edge-cases and having to conform to esoteric rules like using just the right amount of lookahead and resolving shift-reduce problems. By contrast, interpreting is really simple: its just a big switch statement using the bytecode's opcode.