Interpreted vs. Compiled vs. Late-Binding

+9 Q:

Python is compiled into an intermediate bytecode (.pyc) and then executed. So, there is a compilation step followed by interpretation. However, long-time Python users say that Python is a "late-binding" language and that it shouldn't be referred to as an interpreted language.

How would Python be different from another interpreted language?
Could you tell me what "late-binding" means, in the Python context?

Java is another language whose source code is first compiled into bytecode, which is then interpreted.

Is Java an interpreted/compiled language?
How is it different from Python in terms of compilation/execution?
Java is said not to have "late-binding". Does this have anything to do with Java programs being slightly faster than Python?

It'd be great if you could also give me links to places where people have already discussed this; I'd love to read more on this. Thank you.

+6 A:

How would Python be different from another interpreted language?

That involves hair-splitting. Interpreted languages and "managed code" languages like C# and virtual machine languages (like Java) form a weird continuum. There are folks who will say that all languages are "interpreted" -- even machine language. After all, the electronic circuits of the CPU "interpret" machine language.

The best you can do is say that "interpreted" means there's a visible layer of software interpreting your application byte-codes. "not-interpreted" means that your software is (more-or-less) directly executed by the underlying hardware. "Managed code" people are free to continue to split this hair.

Could you tell me what "late-binding" means, in the Python context?

Variables are not declared to have a type. The variable is bound to a type as late as possible -- with the assignment of an actual object.
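A minimal sketch of what that looks like in practice. The names `Duck`, `Robot`, and `make_it_speak` are made up for illustration and are not from the thread; the point is only that the name carries no type and that the method is looked up on whatever object shows up at call time.

```python
# Names carry no declared type; the type belongs to the object currently bound.
value = 42
first_type = type(value).__name__

value = "forty-two"  # rebinding the same name to a str is perfectly legal
second_type = type(value).__name__


class Duck:  # hypothetical class for illustration
    def speak(self):
        return "Quack"


class Robot:  # hypothetical class, unrelated to Duck
    def speak(self):
        return "Beep"


def make_it_speak(thing):
    # `speak` is looked up on whatever object arrives, at call time.
    return thing.speak()


sounds = [make_it_speak(Duck()), make_it_speak(Robot())]
print(first_type, second_type, sounds)
```

Nothing ties `make_it_speak` to a particular class; any object with a `speak` method will do.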

Is Java an interpreted/compiled language?

Yes. It's compiled to byte codes. The byte codes are interpreted. I prefer to call it interpreted.

However, people will (for really obscure reasons) disagree. The presence of any kind of "compile" step -- however minimal -- always confuses people. The translation to byte code has almost no relevance to the actual behavior of the program at run time. Some folks like to say that only languages that are totally free from any taint of pre-processing "compilation" can be interpreted. There aren't a lot of examples of this any more, since many languages are translated from human-friendly text to interpreter-friendly byte codes. Even Applesoft Basic (back in the 80's) had this kind of translation pass done as you typed code in.

Some JVMs do JIT. Some don't. Some are a mixture. To say that the JVM only does JIT byte-code translation is incorrect. Some JVMs do. Some don't.

How is it different from Python in terms of compilation/execution?

Not at all. The Java VM can execute Python. [For the easily-confused, the word "python" in this context cannot possibly mean "python source". It must mean python bytecode.]

Java is said not to have "late-binding". Does this have anything to do with Java programs being slightly faster than Python?

Perhaps. Java programs are often faster because of JIT compilers that translate Java byte code to machine code at run-time.

Static ("early") binding doesn't have the same kind of benefit for Java that it has with a truly compiled language like C or C++ where there are almost no run-time checks of any kind. Java still does things like array bounds checking, which C omits in the interest of raw speed.

There is actually little penalty for "late" binding. Python attributes and methods are resolved using simple dictionary lookups. The dictionary is a hash; performance is quite good. The hashes for names can be put into an "interned" string literal pool amortizing the cost of computing the hash.
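As a rough illustration of the dictionary lookups described above, here is a small sketch that assumes CPython and an ordinary class without `__slots__`. The `Point` class is hypothetical; `sys.intern` is the standard-library call for placing a string in the interned pool.

```python
import sys


class Point:  # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Instance attributes live in an ordinary dict (copied here as a snapshot).
instance_attrs = dict(p.__dict__)

# Attributes can be added at run time; each one is just another dict entry.
p.z = 3
has_z = "z" in p.__dict__

# Methods are found in the class's own namespace mapping.
method_in_class = "__init__" in Point.__dict__

# Interning maps equal strings to one shared object, so two names built
# separately at run time end up as the very same string object.
a = sys.intern("".join(["some_", "attribute"]))
b = sys.intern("".join(["some_", "attribute"]))
same_object = a is b

print(instance_attrs, has_z, method_in_class, same_object)
```

In CPython a string object also caches its hash after the first computation, which is part of why repeated lookups by the same name stay cheap.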

For real fun, look at PyPy and RPython. This is a Python interpreter that can do JIT compilation. You wind up with a 2-tier interpreter. Your code is interpreted by PyPy. PyPy is interpreted by RPython. http://alexgaynor.net/2010/may/15/pypy-future-python/

S.Lott 2010-05-21 11:22:25

Your Java comments are off the mark; there is no interpreter, which is surprising, since you correctly point out later that Java byte-code is compiled to machine code on the fly.

Marcelo Cantos 2010-05-21 11:26:02

The JVM is an interpreter. Please explain how the JVM is *not* an interpreter.

S.Lott 2010-05-21 11:27:23

The JVM cannot execute Python. It can only JIT-compile Java byte-code. The Jython project provides a compiler that translates a dialect of Python into Java byte-code.

Marcelo Cantos 2010-05-21 11:28:52

Java programs are also faster because the provable constraints set by static analysis at compile time allow Java to omit lots of runtime checks from the bytecode. This of course comes at a corresponding cost of expressiveness.

Ants Aasma 2010-05-21 11:30:07

"Please explain how the JVM is not an interpreter." The JVM doesn't execute bytecode at all. It runs it through a JIT-compilation stage, which produces machine code that runs directly on the hardware.

Marcelo Cantos 2010-05-21 11:31:17

@S.Lott: It's not that simple, the JVM is not just an interpreter. The JVM contains a JIT (Just-In-Time) compiler, that compiles bytecode to native machine code at runtime. Once some bytecode is compiled to native code, that's used each time the method is called and no interpretation needs to be done.

Jesper 2010-05-21 11:31:29

@Marcelo: HotSpot VM includes both an interpreter and a JIT compiler. And while we're nitpicking, Jython compiles to JVM bytecode, not Java byte-code.

Ants Aasma 2010-05-21 11:31:41

Fair point Ants. I didn't realise some enterprise JVMs still used interpretation. But it _is_ called [Java bytecode](http://en.wikipedia.org/wiki/Java_bytecode) (my only mistake there was the hyphen).

Marcelo Cantos 2010-05-21 11:39:00

@Jesper: It's not that simple, modern x86 chips have an on-chip interpreter that runs the x86 code on a considerably different CPU.

David Thornley 2010-05-21 16:27:23

@David Thornley: It's not that simple, since some CPUs have more than one conceptual level of execution buried in the digital logic. Can we not split this hair and simplify things by saying "hardware" is not "interpreted" just to make it possible to reason about "software" separately from "hardware"?

S.Lott 2010-05-21 16:55:54

@S. Lott: I agree with you, but unfortunately pedantic hair-splitting and the like bring out the hair-splitting pedant in me. Or perhaps we could consider the bottleneck principle: on my old TRS-80, the motherboard contained the BASIC interpreter, just as my modern motherboards contain the x86 interpreter. The difference is I could program the TRS-80 in Z80, while I can't program my current computers in anything lower than x86. Does it make sense to argue compiler vs. interpreter when you only have one choice of language?

David Thornley 2010-05-21 17:17:06

@David Thornley: That's where I draw the line, also. Python is "obvious" software doing interpretation. The x86 CPU hand-off, microcode and ASICs are far from obvious to a software developer. So I draw the line as you do at "what can I -- as a mere mortal and not an engineer back at the manufacturing plant -- program?" The Python interpreter -- installed after the OS -- is clearly an interpreter I control. Anything in the OS (or lower) is clearly out of my control.

S.Lott 2010-05-21 18:00:19

+4 A:

Late binding is a very different concept to interpretation.

Strictly speaking, an interpreted language is executed directly from source. It doesn't go through a byte-code compilation stage. The confusion arises because the python program is an interpreter, but it interprets the byte-code, so it is Python's byte-code language that you would describe as "interpreted". The Python language itself is a compiled language.
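The two stages can be observed from inside Python itself. The following is only a sketch using the built-in `compile` and `exec` plus the standard `dis` module; the source string and the `<example>` filename label are arbitrary choices for the demonstration.

```python
import dis
import types

source = "result = 6 * 7"

# Stage 1: compile() turns source text into a code object holding bytecode.
code = compile(source, "<example>", "exec")
is_code_object = isinstance(code, types.CodeType)
bytecode_is_bytes = isinstance(code.co_code, bytes)

# Stage 2: the interpreter loop executes that bytecode.
namespace = {}
exec(code, namespace)
result = namespace["result"]

# dis prints a readable listing; the exact opcodes vary by Python version.
dis.dis(code)
```

The `.pyc` files mentioned in the question are essentially such code objects cached on disk so that the compile stage can be skipped on the next import.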

Java bytecode, in contrast, is both interpreted and compiled, these days. It is compiled into native code by a JIT-compiler and then run directly on the hardware.

Late binding is a property of the type system and is present in most languages to some degree, regardless of whether they are interpreted or compiled.

Marcelo Cantos 2010-05-21 11:24:29

Java is very much interpreted these days. The Hotspot JVM runs all code via an interpreter initially and JIT-compiles only the most frequently executed parts.

Michael Borgwardt 2010-05-21 11:31:23

Android doesn't strictly count as being a Java byte-code interpreter. It has its own byte-code target that it interprets.

Ants Aasma 2010-05-21 11:35:30

Thanks Michael. Ants pointed this out to me elsewhere. I've amended the answer.

Marcelo Cantos 2010-05-21 11:41:12

+3 A:

There's a connection between what we call the binding time and the concept of interpretation/compilation.

The binding time is the time when a symbolic expression is bound to its concrete value. That's more related to the definition of programming language, e.g. dynamic vs. static scoping of variables. Or static method vs. virtual methods or dynamic typing vs. static typing.

Then comes the implementation of the language. The more information is statically known up front, the easier it is to write a compiler. Conversely, the more late-bound the language is, the harder it is. Hence the need to rely on interpretive techniques sometimes.

The distinction between the two isn't strict, however. Not only can we consider that everything is ultimately interpreted (see S.Lott's answer), but part of the code can be compiled, decompiled, or recompiled dynamically (e.g. JIT), making the distinction very fuzzy.

For instance, dynamic class loading in Java goes in the category "late binding": the set of classes is not fixed once and for all, and classes can be loaded dynamically. Some optimizations can be done when we know the set of classes, but they will need to be invalidated once a new class is loaded. The same happens with the ability to update a method with the debugging infrastructure: the JVM will need to de-optimize all call sites where the method had been inlined.

I don't know much about Python, but Python practitioners perhaps prefer the term "late bound" to avoid such confusion.

ewernli 2010-05-21 11:48:51

Great answer, explains not only the distinction between interpretation/compilation and binding model but more importantly the connection that causes the confusion in the first place.

Ants Aasma 2010-05-21 13:47:32

Binding time is when names get resolved to things. More dynamic languages tend towards late binding. This can be separate from interpretation/compilation -- for example, Objective-C methods are resolved late and dynamically compared to C++. Java does much of its binding at class load time: later than C but earlier than Python.

My favorite quote from Stan Kelly-Bootle's Computer Contradictionary:
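To make "names get resolved to things" concrete on the Python end of that spectrum, here is a small sketch. The names `greet`, `greeting`, and `Greeter` are invented for the example; it assumes nothing beyond standard Python semantics.

```python
def greet():
    # `greeting` is not resolved when greet is defined, only when it runs.
    return greeting + ", world"


greeting = "Hello"
first = greet()

greeting = "Goodbye"  # rebinding the global changes what later calls see
second = greet()


class Greeter:  # hypothetical class for illustration
    def hello(self):
        return "hi"


g = Greeter()
before = g.hello()

# Replacing the method on the class affects the existing instance too,
# because the method is looked up again at every call.
Greeter.hello = lambda self: "hey"
after = g.hello()

print(first, second, before, after)
```

A statically bound language would typically have fixed the meaning of `greeting` and `hello` before the program started running.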

binding time n. The moment when the hash table becomes corrupted.

==> Advances in computing can be mapped against the "lateness of binding," which has me thinking about my own so-called CS so-called career: golden past, gray present, and rosy future. This is my version of Synge's optimism: the grass is greener except at t=0. On EDSAC I, my functions (5ch paper-tape subroutines) were punched, spliced, and bound about two weeks before input. This is known as premature binding and calls for deftness with elastic bands. FORTRAN came next with a new kind of binding: soggy decks of cards that refused to be shuffled. Then with Algol and C, I enjoyed static (compile-time) binding, until C++ brought the numbing joys of dynamic (run-time) binding. My current research aims at delaying the binding until well after execution. I call this end-time binding, as prophesied in St. Matthew's Gospel: "...and whatsoever thou shalt bind on earth shall be bound in heaven..." (Matthew 16:19 KJV).

Steven D. Majewski 2010-05-21 15:34:10

+2 A:

I think the common misconception that Python is interpreted while Java is compiled arises because Java has an explicit compilation step - you have to run javac to convert your .java source file into a .class bytecode file that can be run.

As you rightly point out, Python similarly compiles source files into bytecode, but it does so transparently: compiling and running are generally done in a single step, so it is less obvious to the user.

The important difference is between early & late binding and dynamic & static typing. The compiled/interpreted distinction is meaningless and irrelevant.

Dave Kirby 2010-05-21 15:34:26
