views:

341

answers:

4

I'm designing a high level language, and I want it to have the speed of C++ (it will use LLVM), but be safe and high level like C#. Garbage collection is slow, and new/delete is unsafe. I decided to try to use "region based memory management" (there are a few papers about it on the web, mostly for functional languages). The only "useful" language using it is Cyclone, but that also has GC. Basically, objects are allocated on a lexical stack, and are freed when the block closes. Objects can only refer to other objects in the same region or higher, to prevent dangling references. To make this more flexible, I added parallel regions that can be moved up and down the stack, and retained through loops. The type system would be able to verify assignments in most cases, but low overhead runtime checks would be necessary in some places.

Ex:

region(A) {
    Foo@A x=new Foo(); //x is deleted when this region closes.
    region(B,C) while(x.Y) {
        Bar@B n=new Bar();
        n.D=x; //OK, n is in lower region than x.
        //x.D=n; would cause error: x is in higher region than n.
        n.DoSomething();
        Bar@C m=new Bar();
        //m.D=n; would cause error: m and n are parallel.
        if(m.Y)
            retain(C); //On the next iteration, m is retained.
    }
}

Does this seem practical? Would I need to add non-lexically scoped, reference counted regions? Would I need to add weak variables that can refer to any object, but with a check on region deletion? Can you think of any algorithms that would be hard to use with this system or that would leak?

A: 

How would it return a dynamically created object? Who would "own" it and be responsible for freeing the memory?

Refcounting or GC are so common because they are almost always the best choices. Generational garbage collectors can be very efficient.

sysrqb
To return an allocated object, the caller would have to pass a region for it to be allocated in as an argument. The caller would then "own" it, and it would be freed when that region closes.
Zifre
+1  A: 

If I were implementing a language with region based memory management, I would probably read A language-independent framework for region inference. That said, it's been a while since I looked into this stuff, and I'm sure the state of the art has moved on, if I ever even knew what the state of the art was.

Logan Capaldo
+1  A: 

Well you should go study Apples memory management. It has release pools and zones, which sure sound a lot like what you're doing here.

I won't comment on the "GC is slow" remark,

Will Hartung
+1  A: 

I would discourage you from trying regions. The problem is that in order to make regions guaranteed to be safe, you need a very sophisticated type system---I'm sure you've looked at the papers by Tofte and Talpin and you have an idea of the complexities involved. Even if you do get regions working successfully, the chances are very hight that your program will require a whose lifetime is the lifetime of the program---and that region at least has to be garbage collected. (This is why Cyclone has regions and GC.)

Since you're just getting started, I'd encourage you to go with garbage collection. Modern garbage collectors can be made pretty fast without a lot of effort. The main issue is to allocate from contiguous free space so that allocation is fast. It helps to be targeting AMD64 or other machine with spare registers so you can use a hardware register as the allocation pointer.

There are lots of good ideas to adapt; one of the easiest to implement is a page-based collector like Joel Bartlett's mostly-copying collector, where the idea is you allocate only from completely empty pages.

If you want to study existing garbage collectors, Lua has a fairly sophisticated incremental garbage collector (so there are no visible pause times) and the implementation is only 700 lines. It is fast enough to be used in a lot of games, where performance matters.

Norman Ramsey