views:

199

answers:

3

How does an optimizing C++ compiler determine when a stack slot of a function (part of that function's stack frame) is no longer needed by the function, so it can reuse its memory?
By stack slot I mean a part of a function's stack frame, not necessarily the whole frame. To clarify with an example: suppose a function defines six integer variables in its scope; by the time the sixth variable comes into use, the fifth has become useless, so the compiler can use the same memory block for the fifth and sixth variables.
Any information on this subject is appreciated.
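A minimal sketch of the situation I mean (the function name and values are just for illustration): once the last use of the earlier variables has passed, a later variable could share their storage.

```cpp
#include <cassert>

// Several locals whose lifetimes do not all overlap: once 'e' is
// computed, a..d are dead, so 'f' may share their stack storage.
int compute() {
    int a = 1, b = 2, c = 3, d = 4;
    int e = a + b + c + d;   // last use of a, b, c, d
    int f = e * 2;           // compiler may place f in a dead slot
    return f;
}
```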

A: 

The easy part is: When a function exits, all local variables of that function are released. Thus, function exit indicates that the whole stack frame can be freed. That's a no-brainer, though, and you wouldn't have mentioned "optimizing compiler" if that's what you were after.

In theory, a compiler can do flow analysis on a function, find out which chunks of memory are used at what time, and perhaps even re-order stack allocation based on the order in which variables become available. Then, if new automatic variables are introduced somewhere in the middle of the function or other scope nested within the function (rather than at its beginning), those recently freed slots could be re-used.

In practice, this sounds like a lot of spinning gears, and I suspect that stack space is simply allocated whenever variables come into scope and popped off en bloc by adjusting the stack pointer when the scope finishes. But I admit I'm no expert on this topic. Someone with more authoritative knowledge may come along and correct me.
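To make the nested-scope case concrete, here's a hedged sketch (the function name and sizes are made up): the two arrays are never live at the same time, so a compiler may give them the same stack storage rather than reserving space for both.

```cpp
#include <cassert>

// Two disjoint nested scopes: 'buf1' and 'buf2' are never live at
// the same time, so the compiler may assign them the same stack slot.
int sum_two_scopes(int n) {
    int total = 0;
    {
        int buf1[64];
        for (int i = 0; i < 64; ++i) buf1[i] = n + i;
        for (int i = 0; i < 64; ++i) total += buf1[i];
    }   // buf1's lifetime ends here
    {
        int buf2[64];
        for (int i = 0; i < 64; ++i) buf2[i] = n - i;
        for (int i = 0; i < 64; ++i) total += buf2[i];
    }   // buf2's lifetime ends here
    return total;
}
```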

Carl Smotricz
Nothing wrong with the answer, except you haven't stated when the stack space is reserved. Of course it depends on the compiler but generally there's no reason not to reserve all the space immediately in the function prologue. It's not like sub esp, 0x10 erases 16 bytes or something, so there's no advantage in not reserving it all up front.
wj32
A note on the "flow analysis" part of your comment: usually slots can't be re-ordered or freed up before leaving the scope, since that might break RAII.
doc
The "flow analysis" would have to include destructors. But destructors can be called in the middle of a function (whenever any scope ends) or may be inline and not actually use the object (if trivial or simply counting live objects).
Ben Voigt
@wj32, @doc, @Ben Voigt: You're all right, of course. It's barely conceivable that some compiler writer might strive to minimize stack usage under some very constrained conditions, e.g. a rarely entered IF block in mid-function using up many megabytes of local storage might negatively affect cache locality... but I agree, in general I'd expect what wj32 describes. Especially if you're not willing to be VERY careful about what doc mentions.
Carl Smotricz
+3  A: 

EDIT: I interpreted the question to mean, "how does the compiler reuse a particular memory word in the stack?" Most of the following answers that question, and a note at the end answers the question, "how does the compiler reuse all the stack space needed by a function?".

Most compilers don't assign stack slots first. Rather, what they do, for each function body, is treat each update to a variable, and all accesses to that variable that can see that particular assignment, as a so-called variable lifetime. A variable which is assigned multiple times will thus cause the compiler to create multiple lifetimes.

(There are complications with this idea that occur when multiple assignments can reach an access through different control paths; this is solved by using a clever enhancement to this idea called static single assignment, which I'm not going to discuss here).

At any point in the code, there is a set of variable lifetimes that are valid; as you choose different code points, you have different valid variable lifetimes. The compiler's actual problem is to assign distinct registers or stack slots to each of the lifetimes. One can think of this as a graph-coloring problem: each lifetime is a node, and if two lifetimes can overlap at a point in the code, there is an "interference" arc from that node to the node representing the other lifetime. You can color the graph (or equivalently use numbers instead of colors) such that no two nodes connected by an interference arc have the same color (number); you may have to use arbitrarily large numbers to do this, but for most functions the numbers don't have to be very large. If you do this, the colors (numbers) will tell you a safe stack slot to use for the assigned value of the particular variable lifetime. (This idea is normally used in roughly two phases: once to allocate registers, and once to allocate stack slots for those lifetimes that don't fit into the registers).
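The coloring step can be sketched in a few lines. This is a toy greedy colorer, not what any particular compiler does (real allocators order nodes much more carefully); the function name and graph encoding are my own invention:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Greedy coloring of an interference graph: lifetimes i and j
// interfere (share an edge) if they can overlap at some code point.
// Returns one slot number (color) per lifetime such that no two
// interfering lifetimes get the same slot.
std::vector<int> colorLifetimes(int n,
        const std::vector<std::pair<int,int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (auto [a, b] : edges) {
        adj[a].push_back(b);
        adj[b].push_back(a);
    }
    std::vector<int> color(n, -1);
    for (int v = 0; v < n; ++v) {
        std::vector<bool> used(n, false);       // colors taken by neighbors
        for (int w : adj[v])
            if (color[w] >= 0) used[color[w]] = true;
        int c = 0;
        while (used[c]) ++c;                    // smallest free slot
        color[v] = c;
    }
    return color;
}
```

For three lifetimes where 0 overlaps 1 and 1 overlaps 2, but 0 and 2 are disjoint, this assigns lifetimes 0 and 2 the same slot, so only two slots are needed for three variables.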

By determining the largest number used as a color on the graph, the compiler knows how many slots are needed in the worst case, and can reserve that much storage at function entry time.

There are lots of complications: different values take different amounts of space, etc., but the basic idea is here. Not all compilers use the graph-coloring technique, but almost all of them figure out how to assign stack slots and registers in a way that avoids the implied interference. And thus they know the stack slot numbers and the size of the stack frame.

EDIT... while typing, it appears that the question has been interpreted as "when does the stack frame for a function vanish"? The answer is, at function exit. The compiler already knows how big it is. It has no need to push or pop onto the stack during function execution; it knows where to put everything based on the stack slot numbering determined by the graph coloring.

Ira Baxter
nice technical description though since I'm not an expert in compiler field I couldn't understand that graph algorithm well, thanks anyway.
Pooria
That graph stuff is just a way to be precise. Use a piece of paper, and simulate what this says for a 1 page algorithm. That will make the intuition clearer. Once you see that, the graph scheme will seem pretty obvious. If you really want to dig into it, go read about "Register Allocation by Graph Coloring". That'll be *really* precise, but it will have pretty graph pictures all drawn in it for you.
Ira Baxter
A: 

If I understand the question correctly, this is about call chaining, i.e. invoking function from function without allocating new stack frame.

This is possible when the call can be transformed into a tail call - the last operation before return. This way all local variables (stack) are already out of scope, so the space can be reused. The compiler then generates a jump instead of a call instruction. The original return address is still at the proper place on the stack.
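A classic shape a compiler can transform this way (a hedged sketch; whether the optimization actually fires depends on the compiler and flags):

```cpp
#include <cassert>

// The recursive call is the last operation before return, so an
// optimizing compiler can emit a jump back to the function entry
// and reuse the current stack frame instead of pushing a new one.
long factorial_acc(long n, long acc) {
    if (n <= 1) return acc;
    return factorial_acc(n - 1, acc * n);   // tail call
}
```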

I probably glossed over lots of details here, but that's the idea.

Nikolai N Fetissov