ansaurus

Question

Register allocation and spilling, the easy way?

Answer 1

+2 A:

First: There is no smart way to do it. The problem is NP-complete ;-)

How spilling is done:

You run your register allocation algorithm and get a list of variables you have to spill. Now you can allocate some space on the stack at the beginning of your function. Link every spilled variable to a place on the stack. If you want to be smart, coalesce the stack slots of variables with non-overlapping live ranges. Whenever you need to spill a register, save it to memory, and load it back when it is needed again.
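A minimal sketch of the slot assignment described above, assuming each spilled variable comes with a live range given as inclusive instruction indices. The function name `assign_stack_slots`, the `slot_size` parameter and the data layout are made up for illustration.

```python
def assign_stack_slots(live_ranges, slot_size=4):
    """Map each spilled variable to a stack offset.

    live_ranges: dict name -> (start, end), inclusive instruction indices.
    Variables whose live ranges do not overlap may share a slot.
    Returns (offsets, frame_size).
    """
    slots = []    # slots[i] = end of the range currently occupying slot i
    offsets = {}
    # Visiting ranges by start point lets a simple "first free slot" rule work.
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1]):
        for i, busy_until in enumerate(slots):
            if busy_until < start:    # slot i is free again, reuse it
                slots[i] = end
                break
        else:
            slots.append(end)         # no reusable slot, grow the frame
            i = len(slots) - 1
        offsets[name] = i * slot_size
    return offsets, len(slots) * slot_size


offsets, frame_size = assign_stack_slots({"a": (0, 3), "b": (4, 8), "c": (2, 6)})
```

Here `a` and `b` never overlap, so they end up sharing one slot, while `c` overlaps both and gets its own.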

How to handle eax:

Mark the register as filled, but do not store any variable in it (pre-allocation). This will make the code generator clear that register. To be smart, move the value into another register instead of spilling it if that is beneficial.

Easy and correct ways to handle spilling:

Just spill everything. This assumes that every variable's live range is the whole program. It can be improved by using something like LRU or a usage count to choose which registers should be freed.

The next best thing to do is probably linear scan register allocation. It should be quite easy to implement even when using pre-allocation. I suggest you look into the linked paper.
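A minimal sketch of the linear scan idea, assuming live intervals are already computed as `(start, end)` pairs and ignoring pre-allocated registers. It is not taken from the paper; the names `linear_scan`, `intervals` and `registers` are illustrative.

```python
def linear_scan(intervals, registers):
    """intervals: dict name -> (start, end); registers: list of register names.

    Returns (assignment, spilled): assignment maps name -> register,
    spilled is the set of names that must live on the stack.
    """
    free = list(registers)
    active = []    # (end, name) for intervals currently held in a register
    assignment, spilled = {}, set()
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        for old_end, old in list(active):
            if old_end < start:
                active.remove((old_end, old))
                free.append(assignment[old])
        if free:
            assignment[name] = free.pop()
            active.append((end, name))
        elif active and max(active)[0] > end:
            # Spill the active interval that ends last and take its register.
            last_end, last = max(active)
            assignment[name] = assignment.pop(last)
            spilled.add(last)
            active.remove((last_end, last))
            active.append((end, name))
        else:
            spilled.add(name)
    return assignment, spilled


assignment, spilled = linear_scan(
    {"a": (0, 9), "b": (1, 3), "c": (2, 5), "d": (4, 6)}, ["eax", "ebx"]
)
```

With two registers, the long-lived `a` is the one that gets spilled, and `d` reuses the register freed when `b` ends.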

Specific Answers

What does correctness mean for you? Even simple allocation algorithms are correct if you do not make a programming error. Proving (mathematical) correctness is a lot more difficult. A store has to be inserted after the value is created and before its register is reused; the matching load has to be inserted after that store and before the value is needed again.
Yes, if you program it that way. If your algorithm can handle a value being in multiple registers during its lifetime, you can use those optimizations.
It's again up to you to implement certain improvements. One possibility would be to only block eax when it's needed, not for the whole program.
Under certain conditions SSA does help. Interference graphs of SSA code are always chordal, meaning that every cycle of more than 3 nodes has a chord (an edge joining two nodes that are not adjacent in the cycle). This is a special case of graph coloring, in which a minimal coloring can be found in polynomial time. Converting to SSA does not necessarily mean more or less register pressure. While SSA form usually has more variables, these tend to have shorter live ranges.
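To illustrate the polynomial-time claim for chordal graphs, here is a sketch of one standard technique: order the nodes by maximum cardinality search, then color greedily in that order. It assumes the interference graph is given as a symmetric adjacency dict and really is chordal; the function name `color_chordal` is made up, and spilling is not handled.

```python
def color_chordal(graph):
    """graph: dict node -> set of neighbours (symmetric).

    Returns dict node -> colour index (0, 1, 2, ...). The colouring uses
    the minimum number of colours only if the graph is chordal.
    """
    weight = {v: 0 for v in graph}    # unvisited node -> number of visited neighbours
    colour = {}
    while weight:
        # Maximum cardinality search: visit the node with most visited neighbours.
        v = max(weight, key=weight.get)
        del weight[v]
        for n in graph[v]:
            if n in weight:
                weight[n] += 1
        # Give v the lowest colour not used by an already coloured neighbour.
        used = {colour[n] for n in graph[v] if n in colour}
        colour[v] = next(c for c in range(len(graph)) if c not in used)
    return colour


# A triangle a-b-c with an extra node d attached to c.
graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
colours = color_chordal(graph)
```

Each color index stands for one register, so the triangle forces three registers and `d` can reuse one of them.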

ebo 2009-12-25 22:09:48

Thanks ebo, that is a nicely written paper and the linear scan approach might be feasible (i.e. simple enough for me to understand!). You haven't really addressed the specific parts I'm thinking about; that's because I wasn't clear about them. I'll add more detail to the question. Allocating stack space is easy enough (and can be optimised to coalesce non-interfering variables). Likewise for eax: I can simply not consider it an available register, and only use it for return values (but this is a waste of a good register).

Edmund 2009-12-26 00:44:57

Answer 2

+1 A:

I've used a greedy approach in a JVM allocator once, which worked pretty well. Basically start at the top of a basic block with all values stored on the stack. Then just scan the instructions forward, maintaining a list of registers which contain a value, and whether the value is dirty (needs to be written back). If an instruction uses a value which is not in a register (or not in the correct register), issue a load (or move) to put it in a free register before the instruction. If an instruction writes a value, ensure it is in a register and mark it dirty after the instruction.

If you ever need a register, spill a used register by deallocating the value from it, and writing it to the stack if it is dirty and live. At the end of the basic block, write back any dirty and live registers.
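A rough sketch of this scheme for a single basic block, not the original JVM allocator. It assumes instructions are `(op, dest, srcs)` tuples, that there are more registers than any instruction has sources, and it uses a deliberately naive "evict the oldest resident value" choice; all names are hypothetical.

```python
def allocate_block(instrs, registers, live_out):
    """instrs: list of (op, dest, srcs); dest may be None.

    Returns the emitted pseudo-instructions as strings.
    Assumes len(registers) > number of sources of any instruction.
    """
    reg_of = {}      # value -> register currently holding it
    dirty = set()    # values whose register copy is newer than the stack copy
    free = list(registers)
    out = []

    def is_live(value, pos):
        # Conservative: read at or after pos, or live out of the block.
        return value in live_out or any(value in s for _, _, s in instrs[pos:])

    def get_reg(value, pos, keep):
        if value not in reg_of:
            if not free:
                # Spill a resident value the current instruction does not need.
                victim = next(v for v in reg_of if v not in keep)
                if victim in dirty and is_live(victim, pos):
                    out.append(f"store {victim} <- {reg_of[victim]}")
                dirty.discard(victim)
                free.append(reg_of.pop(victim))
            reg_of[value] = free.pop()
        return reg_of[value]

    for pos, (op, dest, srcs) in enumerate(instrs):
        for s in srcs:
            if s not in reg_of:
                out.append(f"load {get_reg(s, pos, set(srcs))} <- {s}")
        src_regs = ", ".join(reg_of[s] for s in srcs)
        if dest is not None:
            d = get_reg(dest, pos + 1, set(srcs) | {dest})
            dirty.add(dest)    # the register now holds a value the stack lacks
            out.append(f"{op} {d} <- {src_regs}")
        else:
            out.append(f"{op} {src_regs}")
    # End of block: write back everything that is dirty and still live.
    for value, reg in reg_of.items():
        if value in dirty and value in live_out:
            out.append(f"store {value} <- {reg}")
    return out


block = [("add", "t", ["a", "b"]), ("mul", "u", ["t", "c"]), ("ret", None, ["u"])]
code = allocate_block(block, ["r0", "r1", "r2"], live_out=set())
```

The correctness part is only the load/store bookkeeping; the victim choice inside `get_reg` is the place to plug in a better heuristic.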

This scheme makes it clear exactly where all the loads/stores go, you generate them as you go. It is easily adaptable to instructions which take a value in memory, or which can take either of two arguments in memory, but not both.

If you're OK with having all data on the stack at every basic block boundary, this scheme works pretty well. It should give results similar to linear scan within a basic block, as it basically does very similar things.

You can get arbitrarily complicated about how to decide which values to spill and which registers to allocate. Some lookahead can be useful, for example by marking each value with a specific register it needs to be in at some point in the basic block (e.g. eax for a return value, or ecx for a shift amount) and preferring that register when the value is first allocated (and avoiding that register for other allocations). But it is easy to separate out the correctness of the algorithm from the improvement heuristics.

I've used this allocator in an SSA compiler, YMMV.

Keith Randall 2010-01-04 22:48:18

Thanks Keith. I'm not sure quite how to adapt your scheme to my work: all my basic blocks are minimal, in that I have a CFG where each vertex is a single instruction. It would be possible to maximalise them (and then linearise the blocks) but I've gotten used to thinking about the existing model and all the other analysis phases depend on it. The greedy approach is definitely appealing -- I'm not competing for the most-efficient-compiler-ever prize :-).

Edmund 2010-01-11 04:58:48

The "everything in memory at basic block boundaries" can be translated to your situation pretty easily. Basically, arrange your basic blocks in any order you'd like, then process all of your instructions in one pass from beginning to end, flushing the register cache after every "non-local" (not from the immediately preceding instruction) in-edge, and writing back any dirty live data before every "non-local" out-edge.
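As a sketch of how those points could be found on a one-instruction-per-vertex CFG, assuming instructions are identified by ids, `order` is the chosen linear order and `successors` holds the CFG edges (all hypothetical names):

```python
def flush_points(order, successors):
    """order: list of instruction ids in emission order.
    successors: dict id -> list of successor ids in the CFG.

    Returns (flush_before, writeback_at): instructions that must start with
    an empty register cache, and instructions at which dirty live values
    must be written back before control leaves them.
    """
    next_of = dict(zip(order, order[1:]))    # id -> id emitted right after it
    flush_before, writeback_at = set(), set()
    for src in order:
        for dst in successors.get(src, []):
            if next_of.get(src) != dst:      # edge is not a plain fall-through
                writeback_at.add(src)
                flush_before.add(dst)
    return flush_before, writeback_at


# 0 -> 1 -> 2 -> 3 in a straight line, plus a back edge 2 -> 1.
flush_before, writeback_at = flush_points(
    [0, 1, 2, 3], {0: [1], 1: [2], 2: [1, 3]}
)
```

In this example only the back edge is non-local, so the cache is flushed on entry to instruction 1 and written back at instruction 2.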

Keith Randall 2010-01-12 04:59:52
