views:

230

answers:

9

Which pointer-free languages (i.e. Python, Java, Perl, PHP, Ruby, JavaScript, etc.) have manual memory management? I don't recall ever hearing of one.

Isn't the major concern about interpreted languages the non-deterministic delays (or space complexity when there isn't enough delay) of garbage collection? So why not just write something exactly like Java, but that forces you to free memory manually?

EDIT

What I mean by manual memory management is that the language would have references to objects, and you can delete the object using a reference.

Example:

Object a = new Object(); // a is a reference to the object
Object b = a; // b is a reference to the same object
a.method(); // fine
delete b; // delete the object referenced by b
a.method(); // null dereference exception

So what caveats (Other than memory leaks) could there be in a language like this example?
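To make the intended semantics concrete, here's a sketch in Python of how they could be emulated on top of a garbage-collected runtime (the `Handle` and `DeletedError` names are made up purely for illustration):

```python
class DeletedError(Exception):
    """Raised when a deleted object is accessed through any reference."""

class Handle:
    """A reference that can be explicitly invalidated for all aliases."""
    def __init__(self, obj):
        # All aliases share this one-element box, so deleting through
        # any one handle is visible through every other alias.
        self._box = [obj]

    def get(self):
        if self._box[0] is None:
            raise DeletedError("object was deleted")
        return self._box[0]

    def delete(self):
        self._box[0] = None

a = Handle(object())
b = a             # b is an alias for the same handle
b.delete()        # delete the object through b
try:
    a.get()       # accessing through a now raises, like the example above
except DeletedError:
    print("deleted")
```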

+1  A: 

The reasons are circular references, null dereferences, and aliasing (multiple references to the same object). A simple example:

var a = new Object();
var b = a;
a = null;//or delete a or free a or whatever;
print(b);//what now? is b null? or is b still new Object()?

If, in the example above, b is now null, you end up with some major problems when rebinding variables. For example, instead of setting a to null, what if you set it to c? Would b also become c?
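In Python, for instance, rebinding one name never affects the other alias; only the name's binding changes, not the object it used to refer to:

```python
a = [1, 2, 3]
b = a            # b and a name the same list object
a = None         # rebinds the name a; the list itself is untouched
print(b)         # b still refers to the list: prints [1, 2, 3]
```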

You can read about other problems, like circular references, on wikipedia.

Marius
The circular reference problem can be solved with region-based memory management. Early manual approaches were arenas and stack-based memory management, such as Forth's FORGET.
Doug Currie
Yes null references would have been fine, however I didn't think about reference aliasing, which complicates things.
Longpoke
A: 

Interpreted does not necessarily imply garbage collected. Perl, Tcl, Python, etc. etc. I believe all use simple reference counting, so the memory reclamation is deterministic, though not at all transparent (ever tried strace on a Perl program?).
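In CPython the reference counting is even observable from within the language; a small sketch (note that `sys.getrefcount` reports one extra reference for its own temporary argument):

```python
import sys

x = object()
base = sys.getrefcount(x)          # baseline count for x

y = x                              # create a second reference
assert sys.getrefcount(x) == base + 1

del y                              # dropping it decrements the count
assert sys.getrefcount(x) == base  # reclaimed deterministically
```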

Nikolai N Fetissov
Reference counting is a type of garbage collection, often used despite its failings because it's simple to implement and doesn't depend on the platform.
David Thornley
Yes, but this is just terminology. I meant inline ref-counting vs. GC-in-the-background as in Java/C#.
Nikolai N Fetissov
Yes, I know Python does reference counting. I am talking about explicit deletion of objects, so that making two references to an object and deleting it through one reference would yield an exception when trying to access the object through either reference.
Longpoke
And isn't the time to discover and eliminate circular references nondeterministic?
Longpoke
+1  A: 

In some high-performance interpreted languages like Lua, you can manually control garbage collection. See lua_gc.

Kornel Kisielewicz
+3  A: 

Forth has stacked regions of memory that can be released with FORGET.

Doug Currie
+2  A: 

There are some C/C++ interpreters available, for example, this one.

I haven't tried it myself, but since it claims to be compatible with compiled C/C++, I think it must have "manual" memory management.

Doc Brown
A: 

Python's API officially allows one to turn delayed (cyclic) garbage collection on or off - check the documentation on the "gc" module of the standard library:

http://docs.python.org/library/gc.html

But that is not what makes it slow when compared with static languages - the dynamic nature of the data itself is mainly responsible for the speed differences.
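A short sketch of that API (the reference cycle below is something plain reference counting cannot reclaim, which is exactly what the cyclic collector exists for):

```python
import gc

gc.disable()                 # pause the cyclic collector; refcounting still runs
assert not gc.isenabled()

# Build a reference cycle that reference counting alone cannot free.
a = []
a.append(a)
del a

unreachable = gc.collect()   # trigger a collection manually
assert unreachable >= 1      # the cycle was found and reclaimed

gc.enable()
```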

jsbueno
+1  A: 

So answering this part of the question:

Isn't the major concern about interpreted languages the non-deterministic delays (or space complexity when there isn't enough delay) of garbage collection? So why not just write something exactly like Java, but that forces you to free memory manually?

This might be a concern for some systems. Not so much a problem for other systems. Software running with garbage collection can allocate memory faster than systems that just call malloc. Of course you end up paying the time later at GC time.

Take a web-based system for example. You can allocate all the memory during the handling of a request, and the GC can collect afterwards. It might not end up working out quite like that, but you get the idea.
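The idea can be sketched in Python; the request handler below is a stand-in for illustration, not a real framework API:

```python
import gc

def handle_request(payload):
    # Allocate freely while servicing the request; the cyclic
    # collector is paused so it cannot interrupt mid-request.
    gc.disable()
    try:
        result = [payload * i for i in range(1000)]  # stand-in workload
        return len(result)
    finally:
        gc.enable()
        gc.collect()   # pay the collection cost between requests

print(handle_request(2))   # → 1000
```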

There are a lot of different strategies for garbage collection. Which strategy is best will depend on the system's requirements. But even if you require absolute determinism, you can use something like Real-Time Java.

Steve g
+1  A: 

Which interpreted languages have manual memory management? I don't recall ever hearing of one.

There is no such thing as an interpreted language. A language is neither compiled nor interpreted. A language just is. A language is a bunch of abstract mathematical rules. Interpretation or Compilation are traits of a language implementation, they have nothing to do with the language. Every language can be implemented by either a compiler or an interpreter; most modern high-performance language implementations actually use both and switch between them depending on which one is faster in a particular context.

Is C a compiled language? There are C interpreters out there. Is Python an interpreted language? All 8 current Python implementations use a compiler.

So, since every language can have an interpreted implementation, C and C++ are examples of interpreted languages with manual memory management. (And this is not just a theoretical hair-splitting contest, there are actually C and C++ interpreters out there. The VxWorks real-time operating system even contains one right in the kernel, and NASA once used this interpreter to fix a buggy kernel module on a spacecraft.)

Another example would be the very first version of Lisp from 1958: it had manual memory management (based on reference counting), but it was replaced only a couple of months later with a version with automatic memory management, which it has used ever since. Although again, any language can be implemented with either a compiler or an interpreter, so I don't know whether that version had an interpreted implementation or a compiled one. (In fact, I'm not sure whether it was implemented at all.)

If you relax your criteria a little bit and realize that memory management is only a special case of general resource management, then you will find that pretty much all languages, whether you want to call them compiled or interpreted or something else entirely, have some form of manual resource management for at least some kind of resource (file handles, database connections, network connections, caches, ...).
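Python's `with` statement is a familiar example: the programmer marks the resource's lifetime explicitly, even though the memory backing the object is garbage-collected:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# The file handle is released deterministically at the end of the
# with-block, regardless of when the garbage collector runs.
with open(path, "w") as f:
    f.write("hello")
assert f.closed              # handle already released here

with open(path) as f:
    assert f.read() == "hello"
```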

Jörg W Mittag
I think by interpreted he just meant scripting, like Python, Perl, etc... you're right, but this is the wrong place to bring up this discussion =P.
Claudiu
Indeed you are right, but this has nothing really to do with what I meant in my question. I'll try to be more specific of what I meant by interpreted language: "memory safe (not necessarily "script-ish") languages that are not close-to-metal like C/C++ that don't let you trash your address space unless you _really_ want to; languages that are most commonly JIT compiled and/or interpreted (and/or even compiled to byte code before that process (and or ahead-of-time compiled))."
Longpoke
+3  A: 

The premises behind the question are a bit dodgy:

  • The memory model is a property of the language, not its implementation.

  • Being interpreted is a property of an implementation, not a language.

Examples:

  • The programming language Scheme has automatic memory management, and it has many dozens of interpreted implementations, but also some fine native-code compilers including Larceny, Gambit, and PLT Scheme (which includes both an interpreter and a JIT compiler making seamless transitions).

  • The programming language Haskell has automatic memory management; the two most famous implementations are the interpreter Hugs and the compiler GHC. There are several other honorable implementations split about evenly between compiling to native code (yhc) and interpretation (Helium).

  • The programming language C has manual memory management, and while the world is full of C compilers, those of us old enough to remember the glorious 1980s may remember Saber-C or C-terp, two very useful C interpreters for MS-DOS.

Nevertheless there is a truthful observation behind your question: languages with manual memory management are typically compiled. Why?

  • Manual memory management is a legacy feature, often used to be compatible with legacy code. Legacy languages are typically mature enough to have native-code compilers.

  • Many new languages are defined by an implementation. It is easier to build an interpreter than to build a compiler. It is easier to implement simple automatic memory management in an interpreter than to implement high-performance automatic memory-management in a native-code compiler. So if the language gets its definition from its first implementation, automatic memory management correlates with interpretation because in the interpreted setting, the implementation is easier.

  • Manual memory management is also (and sometimes even justifiably) used to improve performance. Ben Zorn's excellent experimental studies from the 1990s show that automatic memory management is as fast or faster than manual memory management, but requires about twice as much memory. So manual memory management is often used on very small devices, where memory is scarce, and in very large data centers, where doubling memory is expensive. (It's also sometimes used by people who don't know much about memory management, but who have heard that garbage collection is slow. They were right in 1980.) And when there's a concern for performance you usually find a native-code compiler rather than an interpreter.

    Some of the really interesting exceptions also come from this principle. For example, both FORTH and the very first PostScript implementations were designed to run on small embedded devices (telescopes and printers) where memory resources were scarce but compute time was not a factor. Both languages were first implemented using bytecodes that were more compact than native code, and both featured manual memory management. So: interpreters with manual memory management. (Later versions of PostScript added an option for garbage collection.)

In summary:

  • Automatic versus manual memory management is a property of the language.

  • Compiled versus interpreted is a property of the implementation.

  • In principle the two choices can be and are made orthogonally, but for pragmatic engineering reasons automatic memory management frequently correlates with interpretation.

Isn't the major concern about interpreted languages the non-deterministic delays (or space complexity when there isn't enough delay) of garbage collection?

I wasn't aware that there was a major concern about interpreted implementations of programming languages. In alphabetical order, Lua, Perl, PostScript, Python, and Ruby are all wildly successful, and Icon, Scheme, and Squeak Smalltalk are moderately successful. The only area in which unpredictable delays cause concern is in hard real-time computing, like the ABS system that controls the brakes of your car (if you drive a sufficiently fancy car).


Note added after question was edited: You changed "interpreted" to "pointer-free". But you say in a comment that you mean to ask about languages with new and delete. Any language with new and delete has pointers: by definition, whatever new returns is a pointer. (In some languages, there may be other sources of pointers as well.) So I think what you mean to say is "languages without pointer arithmetic and without an address-of operator".

Norman Ramsey
Manual memory management as in `new` and `delete` operators. Trying to dereference a `delete`d object will cause an exception or some kind of other error rather than undefined behavior. Basically the same as Java but with a delete operator, per my example. You do have a good point about manual memory management being a legacy thing though.
Longpoke