views:

406

answers:

7

I'm trying to get a better understanding of the difference. I've found a lot of explanations online, but they tend towards the abstract differences rather than the practical implications.

Most of my programming experiences has been with CPython (dynamic, interpreted), and Java (static, compiled). However, I understand that there are other kinds of interpreted and compiled languages. Aside from the fact that executable files can be distributed from programs written in compiled languages, are there any advantages/disadvantages to each type? Oftentimes, I hear people arguing that interpreted languages can be used interactively, but I believe that compiled languages can have interactive implementations as well, correct?

+1  A: 

First, a clarification, Java is not fully static-compiled and linked in the way C++. It is compiled into bytecode, which is then interpreted by a JVM. The JVM can go and do just-in-time compilation to the native machine language, but doesn't have to do it.

More to the point: I think interactivity is the main practical difference. Since everything is interpreted, you can take a small excerpt of code, parse and run it against the current state of the environment. Thus, if you had already executed code that initialized a variable, you would have access to that variable, etc. It really lends itself way to things like the functional style.

Interpretation, however, costs a lot, especially when you have a large system with a lot of references and context. By definition, it is wasteful because identical code may have to be interpreted and optimized twice (although most runtimes have some caching and optimizations for that). Still, you pay a runtime cost and often need a runtime environment. You are also less likely to see complex interprocedural optimizations because at present their performance is not sufficiently interactive.

Therefore, for large systems that are not going to change much, and for certain languages, it makes more sense to precompile and prelink everything, do all the optimizations that you can do. This ends up with a very lean runtime that is already optimized for the target machine.

As for generating executbles, that has little to do with it, IMHO. You can often create an executable from a compiled language. But you can also create an executable from an interpreted language, except that the interpreter and runtime is already packaged in the exectuable and hidden from you. This means that you generally still pay the runtime costs (although I am sure that for some language there are ways to translate everything to a tree executable).

I disagree that all languages could be made interactive. Certain languages, like C, are so tied to the machine and the entire link structure that I'm not sure you can build a meaningful fully-fledged interactive version

Uri
C is not really tied to a "machine". The syntax and semantics of C are rather simple. It shouldn't be particularly difficult to implement a C-interpreter, only very time-consuming (because the standard library has to be implemented as well). And btw, Java can be compiled into native machine code (using gcj).
lunaryorn
@lunaryorn: I disagree about GCJ. GCJ merely gives you an executable based environment. "Compiled applications are linked with the GCJ runtime, libgcj, which provides the core class libraries, a garbage collector, and a bytecode interpreter"
Uri
@Uri: GCJ *does* produce native machine code, and not just an executable environment with embedded interpreter and bytecode. libgcj provides a bytecode interpreter to support calls from native code into Java bytecode, not to interpret the compiled program. If libgcj did not provide a bytecode interpreter, GCJ would not comply to the Java spec.
lunaryorn
@lunaryorn: Ah. Ok, I appreciate the clarification and stand corrected. We primarily use Java in a windows environment so I haven't tried gcj in years.
Uri
[You](http://www.softintegration.com/) [can](http://root.cern.ch/drupal/content/cint) [interpret](http://www.artificialworlds.net/wiki/IGCC/IGCC) [C](http://www.programmersheaven.com/download/16526/download.aspx).
Roger Pate
A: 

It's rather difficult to give a practical answer because the difference is about the language definition itself. It's possible to build an interpreter for every compiled language, but it's not possible to build an compiler for every interpreted language. It's very much about the formal definition of a language. So that theoretical informatics stuff noboby likes at university.

Steven Mohr
Surely you can build a compiler for an interpreted language, but the compiled machine-code is itself a mirror of the runtime.
Aiden Bell
+2  A: 

The extreme and simple cases:

  • A compiler will produce a binary executable in the target machine's native executable format. This binary file contains all required resources except for system libraries; it's ready to run with no further preparation and processing and it runs like lightning because the code is the native code for the CPU on the target machine.

  • An interpreter will present the user with a prompt in a loop where he can enter statements or code, and upon hitting RUN or the equivalent the interpreter will examine, scan, parse and interpretatively execute each line until the program runs to a stopping point or an error. Because each line is treated on its own and the interpreter doesn't "learn" anything from having seen the line before, the effort of converting human-readable language to machine instructions is incurred every time for every line, so it's dog slow. On the bright side, the user can inspect and otherwise interact with his program in all kinds of ways: Changing variables, changing code, running in trace or debug modes... whatever.

With those out of the way, let me explain that life ain't so simple any more. For instance,

  • Many interpreters will pre-compile the code they're given so the translation step doesn't have to be repeated again and again.
  • Some compilers compile not to CPU-specific machine instructions but to bytecode, a kind of artificial machine code for a ficticious machine. This makes the compiled program a bit more portable, but requires a bytecode interpreter on every target system.
  • The bytecode interpreters (I'm looking at Java here) recently tend to re-compile the bytecode they get for the CPU of the target section just before execution (called JIT). To save time, this is often only done for code that runs often (hotspots).
  • Some systems that look and act like interpreters (Clojure, for instance) compile any code they get, immediately, but allow interactive access to the program's environment. That's basically the convenience of interpreters with the speed of binary compilation.
  • Some compilers don't really compile, they just pre-digest and compress code. I heard a while back that's how Perl works. So sometimes the compiler is just doing a bit of the work and most of it is still interpretation.

In the end, these days, interpreting vs. compiling is a trade-off, with time spent (once) compiling often being rewarded by better runtime performance, but an interpretative environment giving more opportunities for interaction. Compiling vs. interpreting is mostly a matter of how the work of "understanding" the program is divided up between different processes, and the line is a bit blurry these days as languages and products try to offer the best of both worlds.

Carl Smotricz
+10  A: 

A compiled language is one where the program, once compiled, is expressed in the instructions of the target machine. For example, an addition "+" operation in your source code could be translated directly to the "ADD" instruction in machine code.

An interpreted language is one where the instructions are not directly executed by the the target machine, but instead read and executed by some other other program (which normally is written in the language of the native machine). For example, the same "+" operation would be recognised by the interpreter at run time, which would then call its own "add(a,b)" function with the appropriate arguments, which would then execute the machine code "ADD" instruction.

You can do anything that you can do in an interpreted language in a compiled language and vice-versa - they are both Turing complete. Both however have advantages and disadvantages for implementation and use.

I'm going to completely generalise (purists forgive me!) but roughly here are the advantages of compiled languages:

  • Faster performance by directly using the native code of the target machine
  • Opportunity to apply quite powerful optimisations during the compile stage

And here are the advantages of interpreted languages:

  • Easier to implement (writing good compilers is very hard!!)
  • No need to run a compilation stage: can execute code directly "on the fly"
  • Can be more convenient for dynamic languages

Note that modern techniques such as bytecode compilation add some extra complexity - what happens here is that the compiler targets a "virtual machine" which is not the same as the underlying hardware. These virtual machine instructions can then be compiled again at a later stage to get native code (e.g. as done by the Java JVM JIT compiler).

mikera
+1 Good answer :)
Aiden Bell
Not all compiled languages need a slow compilation stage. Serious Common Lisp implementations are compilers, and they often don't bother with an interpreter, preferring to just compile real fast on the fly. On the other hand, Java does need a compilation step, and it usually is visible.
David Thornley
@David completely agree! As I said I was generalising somewhat :-) I use Clojure quite a bit, which compiles on the fly (to the JVM) in exactly the same way.
mikera
+1  A: 

A compiler and an interpreter do the same job: translating a programming language to another pgoramming language, usually closer to the hardware, often direct executable machine code.

Traditionally, "compiled" means that this translation happens all in one go, is done by a developer, and the resulting executable is distributed to users. Pure example: C++. Compilation usually takes pretty long and tries to do lots of expensive optmization so that the resulting executable runs faster. End users don't have the tools and knowledge to compile stuff themselves, and the executable often has to run on a variety of hardware, so you can't do many hardware-specific optimizations. During development, the separate compilation step means a longer feedback cycle.

Traditionally, "interpreted" means that the translation happens "on the fly", when the user wants to run the program. Pure example: vanilla PHP. A naive interpreter has to parse and translate every piece of code every time it runs, which makes it very slow. It can't do complex, costly optimizations because they'd take longer than the time saved in execution. But it can fully use the capabilities of the hardware it runs on. The lack of a separrate compilation step reduces feedback time during development.

But nowadays "compiled vs. interpreted" is not a black-or-white issue, there are shades in between. Naive, simple interpreters are pretty much extinct. Many languages use a two-step process where the high-level code is translated to a platform-independant bytecode (which is much faster to interpret). Then you have "just in time compilers" which compile code at most once per program run, sometimes cache results, and even intelligently decide to interpret code that's run rarely, and do powerful optimizations for code that runs a lot. During development, debuggers are capable of switching code inside a running program even for traditionally compiled languages.

Michael Borgwardt
However, C++'s compilation model is inherited from C and was designed without consideration of features such as templates. This awkwardness contributes to C++'s long compile times much more than any other factor – and makes it a poor example.
Roger Pate
+4  A: 

A language itself is neither compiled nor interpreted, only a specific implementation of a language is. Java is a perfect example. There is a bytecode-based platform (the JVM), a native compiler (gcj) and an interpeter for a superset of Java (bsh). So what is Java now? Bytecode-compiled, native-compiled or interpreted?

Other languages, which are compiled as well as interpreted, are Scala, Haskell or Ocaml. Each of these languages has an interactive interpreter, as well as a compiler to byte-code or native machine code.

So generally categorizing languages by "compiled" and "interpreted" doesn't make much sense.

lunaryorn
I agree. Or let's say: There are native compilers (creating machine code for the CPU to eat), and not-so-native-compilers (creating tokenized stuff, i.e. intermediate code, that some just-in-time compiler compiles to machine code before (or during) runtime ONCE), and there are "real" non-compilers that never produce machine code and never let the CPU run the code. The latter are interpreters.Today, native compilers which directly produce machine (CPU) code at compile-time are becoming more and more rare. Delphi/Codegear is one of the best survivors.
TheBlastOne
A: 

Start thinking in terms of a: blast from the past

Once upon a time, long long ago, there lived in the land of computing interpreters and compilers. All kinds of fuss ensued over the merits of one over the other. The general opinion at that time was something along the lines of:

  • Interpreter: Fast to develop (edit and run). Slow to execute because each statement had to be interpreted into machine code every time it was executed (think of what this meant for a loop executed thousands of times).
  • Compiler: Slow to develop (edit, compile, link and run. The compile/link steps could take serious time). Fast to execute. The whole program was already in native machine code.

A one or two order of magnitude difference in the runtime performance existed between an interpreted program and a compiled program. Other distinguishing points, run-time mutability of the code for example, were also of some interest but the major distinction revolved around the run-time performance issues.

Today the landscape has evolved to such an extent that the compiled/interpreted distinction is pretty much irrelevant. Many compiled languages call upon run-time services that are not completely machine code based. Also, most interpreted languages are "compiled" into byte-code before execution. Byte-code interpreters can be very efficient and rival some compiler generated code from an execution speed point of view.

The classic difference is that compilers generated native machine code, interpreters read source code and generated machine code on the fly using some sort of run-time system. Today there are very few classic interpreters left - almost all of them compile into byte-code (or some other semi-compiled state) which then runs on a virtual "machine".

NealB