Dear overflowers,

I've been looking at LLVM for quite some time as a new back-end for the language I'm currently implementing. It seems to have good performance, rather high-level generation APIs, and enough low-level support to implement exotic optimizations. In addition, although I haven't checked it myself, Apple seems to have successfully demonstrated the use of LLVM for garbage-collected multi-core programs.

So far, so good. As I'm interested in both garbage collection and multi-core, the next step would be to choose a garbage collector that works with LLVM and supports multi-core programs. Which brings me to the question: what is available? I'm aware of Jon Harrop's HLVM work, but that's about it.

Note that I need cross-platform support, so Apple's GC is probably not what I'm looking for (unless there's a cross-platform version). Also note that I have nothing against stop-the-world garbage collectors.

Thanks in advance, Yoric

+3  A: 

LLVM docs say that it does not support multi-threaded collectors yet.

As the matrix indicates, LLVM's garbage collection infrastructure is already suitable for a wide variety of collectors, but does not currently extend to multithreaded programs. This will be added in the future as there is interest.

The docs also say that multi-threaded garbage collection requires stopping the world, and that doing so is not portable:

Threaded: Denotes a multithreaded mutator; the collector must still stop the mutator ("stop the world") before beginning reachability analysis. Stopping a multithreaded mutator is a complicated problem. It generally requires highly platform specific code in the runtime, and the production of carefully designed machine code at safe points.

However, shared state between threads is a nasty scaling issue. If your language communicates solely through message passing between 'tasks', so that there is no shared state between worker threads, then could you use a per-thread collector for each per-thread heap?
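To make the per-thread heap idea concrete, here is a minimal sketch, not anything LLVM or HLVM provides. It assumes messages are plain values that get deep-copied on send, so no pointer ever crosses from one heap into another. The names `TaskHeap`, `send` and `receive` are hypothetical, and Python is used only to keep the sketch short; a real runtime would do this in C or generated code.

```python
import copy
import queue


class TaskHeap:
    """A private heap owned by exactly one task (hypothetical name)."""

    def __init__(self):
        self.objects = {}   # object id -> {"value": ..., "refs": [ids]}
        self.roots = set()  # ids the owning task currently holds
        self.next_id = 0

    def alloc(self, value, refs=()):
        oid = self.next_id
        self.next_id += 1
        self.objects[oid] = {"value": value, "refs": list(refs)}
        return oid

    def collect(self):
        """Mark-and-sweep over this heap only; no other task is paused."""
        marked = set()
        worklist = list(self.roots)
        while worklist:
            oid = worklist.pop()
            if oid in marked:
                continue
            marked.add(oid)
            worklist.extend(self.objects[oid]["refs"])
        freed = len(self.objects) - len(marked)
        self.objects = {o: obj for o, obj in self.objects.items() if o in marked}
        return freed


def send(mailbox, heap, oid):
    """Deep-copy the value out of the sender's heap so no pointer crosses heaps."""
    mailbox.put(copy.deepcopy(heap.objects[oid]["value"]))


def receive(mailbox, heap):
    """Allocate the received copy in the receiver's own heap and root it."""
    oid = heap.alloc(mailbox.get())
    heap.roots.add(oid)
    return oid
```

Because every object lives in exactly one `TaskHeap`, each task can call `collect()` whenever it likes without stopping any other task. The price is the copy on every message, which is the usual trade-off of this design.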

Will
Well, that's what the LLVM docs say. However, Apple seems to have a multi-core-compatible gc, and so does the HLVM project.
Yoric
And, as it turns out, in my language, concurrency is purely message-based indeed. Still, I wonder how much support LLVM provides for a per-thread collector/per-thread heap.
Yoric
The docs leave a loophole for Apple while making it sound non-portable; I'll update the answer.
Will
Their claim that "stopping a multithreaded mutator is a complicated problem. It generally requires highly platform specific code in the runtime, and the production of carefully designed machine code at safe points" is not true. HLVM stops multiple mutators using only POSIX threads and without any custom machine code at all.
Jon Harrop
Everyone who has ever looked into the Boehm-Demers-Weiser garbage collector, which works fine and is well tested, knows that "stop the world" is a very simple thing to implement. It takes fewer than a few hundred lines of code per platform.
Lothar
+1  A: 

The quotes that Will gave are about LLVM's intrinsic support for GC, where you augment LLVM with C++ code telling it how to walk the stack, interpret stack frames, inject read and write barriers and so on. The primary goal of my HLVM project is to become useful with minimal effort and risk so I chose to use the shadow stack for an "uncooperative environment" in order to avoid hacking on immature internals of LLVM. Consequently, those statements about LLVM's intrinsic support for GC do not apply to HLVM's garbage collector because it does not use that infrastructure at all. My results are extremely compelling: you can achieve excellent performance with minimal effort (serial performance and parallel performance).

I believe HLVM already runs out-of-the-box across Unixes, including Mac OS X, because it requires only POSIX threads. I strongly disagree with the claim that writing a stop-the-world GC is difficult: it took me 5 days to write a 100-line multicore garbage collector and I barely know anything about computers. I cannot believe it would be difficult to port to Windows either.

Jon Harrop
According to LLVM's documentation, the shadow stack is not thread-safe. I'm curious: how do you handle parallelism in that context?
Yoric
Presumably that is referring to LLVM's shadow stack but I'm using my own and not theirs. However, I'm curious as to how a shadow stack could not be thread safe. Each mutator thread has its own thread-local shadow stack that it mutates freely except when all mutators are paused for a GC whereupon the GC thread (which is whichever mutator incurred the GC in my case) reads all of the shadow stacks. So it is obviously "thread safe".
Jon Harrop
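The scheme described in the last comment (one shadow stack per mutator thread, read by whichever mutator triggers the GC while all the others are paused) can be sketched roughly as follows. This is not HLVM's code: the `World` class, its method names and the `demo` function are hypothetical, and Python's `threading.Condition` stands in for the POSIX mutex and condition variable a real runtime would use.

```python
import threading


class World:
    """Registry of mutators and their shadow stacks (hypothetical design)."""

    def __init__(self):
        self.cond = threading.Condition()
        self.shadow_stacks = {}   # thread ident -> that thread's list of roots
        self.gc_requested = False
        self.paused = 0

    def register(self):
        """Create the calling thread's shadow stack and make it visible to the GC."""
        stack = []
        with self.cond:
            self.shadow_stacks[threading.get_ident()] = stack
        return stack

    def unregister(self):
        with self.cond:
            del self.shadow_stacks[threading.get_ident()]
            self.cond.notify_all()  # a waiting collector must recount mutators

    def _pause(self):
        # Caller holds self.cond. Park until the current collection finishes.
        self.paused += 1
        self.cond.notify_all()
        while self.gc_requested:
            self.cond.wait()
        self.paused -= 1

    def safepoint(self):
        """Mutators call this regularly; it blocks only while a GC is running."""
        with self.cond:
            if self.gc_requested:
                self._pause()

    def collect(self):
        """Stop the world from the calling mutator and return every root found."""
        with self.cond:
            if self.gc_requested:      # another mutator is already collecting
                self._pause()
                return None
            self.gc_requested = True
            while self.paused < len(self.shadow_stacks) - 1:
                self.cond.wait()
            # All other mutators are parked, so reading their stacks is race-free.
            roots = [r for stack in self.shadow_stacks.values() for r in stack]
            self.gc_requested = False
            self.cond.notify_all()
            return roots


def demo(workers=3):
    world = World()
    ready = threading.Barrier(workers + 1)
    stop = threading.Event()

    def mutator(n):
        stack = world.register()
        stack.append(f"root-{n}")  # no lock needed: only this thread pushes here
        ready.wait()
        while not stop.is_set():
            world.safepoint()
        world.unregister()

    threads = [threading.Thread(target=mutator, args=(n,)) for n in range(workers)]
    for t in threads:
        t.start()
    stack = world.register()       # the main thread is a mutator too
    stack.append("root-main")
    ready.wait()                   # every mutator has pushed its root
    roots = world.collect()
    stop.set()
    for t in threads:
        t.join()
    return sorted(roots)
```

Calling `demo()` returns the roots gathered from all four shadow stacks. Mutators push and pop on their own stack without locking; the only synchronization is at `safepoint()` and `collect()`, which is why the collector can read every stack safely once all other mutators are parked.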