views:

56

answers:

1

Moving garbage collectors, such as generational collectors, incur extra generated code to store and reload references across GC safe points. Has anyone quantified the performance cost of this overhead compared to a non-moving collector?

I ask because I am interested in designing a collector that can recycle short-lived values efficiently without having to move them.

+1  A: 

The moving vs. non-moving tradeoff is a complex one. I'm not aware of any particular studies of the aspect you mention - indeed, I'm not sure I understand it completely: an accurate GC always needs to know where all the pointers are, so perhaps you're talking about conservative non-moving GC? Conservative GC in my opinion is a bad choice if you have enough control over the compilation path to be able to do accurate GC.

The other aspects affecting performance of moving vs. non-moving GC are:

  • Allocation speed. Non-moving GC might have to allocate from a free list, whereas coying GC can use bump-pointer allocation. Hybrid schemes like Immix attempt to strike a better tradeoff between the two.

  • Locality and cache behaviour. There are lots of studies on this for both moving and non-moving GC; see the GC Bibliography. Generally speaking compaction is good for the cache, although copying GC is typically breadth-first which is a bad choice (access patterns tend to be depth-first) so there's a bunch of research in this area trying to rearrange objects to match the access pattern.

My own personal opinion is that to support really fast allocation you need an L2-sized nursery with copying collection and bump-pointer allocation. If you do anything else, allocation becomes more expensive which skews lots of things: optimising to reduce allocations becomes more important, so you end up spending effort there and making things more complex. I'd rather make allocation really really cheap and then not worry about it.

Simon Marlow
I see, that was my impression too. I was talking about accurate (not conservative) non-moving GCs like the one in HLVM at the moment. It knows exactly where all of the global roots and in-heap pointers are but, because they never move, it can keep them in registers across safe points and never has to iterate over them updating them after a moving collection. Hmm, there must be some way to get the best of both worlds... maybe you'd need to indirect all references to moving objects through a global LUT...
Jon Harrop