views: 113
answers: 4

I have been reading that compilers will optimize operations without side effects. I assume this is done very conservatively, but how does the compiler know? Does it have a lookup table of side-effect-free operations, or does it determine this some other way?

+4  A: 

It performs a static analysis of the code during compilation.

Remember, it can tell whether the code only does local calculations and doesn't do pointer arithmetic, use globals, or modify data passed in by reference. Any functions it calls must also be side-effect free (and this check applies recursively to their callees).
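
As a rough illustration of the distinction that analysis draws (the functions and names here are invented for the example, not taken from any particular compiler):

    int g_counter = 0;  // a global

    // Side-effect free: reads only its argument and locals, so a compiler
    // may fold it, reorder it, or merge redundant calls.
    int square_plus_one(int x) {
        int local = x * x;   // purely local calculation
        return local + 1;
    }

    // Not side-effect free: writes a global and writes through a pointer
    // (data passed in by reference), so calls cannot be removed or merged.
    int tally(int x, int *out) {
        g_counter += x;      // modifies global state
        *out = x * 2;        // modifies caller-visible memory
        return g_counter;
    }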

Paul Nathan
+1  A: 

The programmer might also give the compiler some hints: see for example noalias.

ChrisW
+1  A: 

Side effects are primarily a potential result of function calls.

In some cases, the possible side effects are encoded into parameter declarations. C even has a special qualifier called restrict, which, in addition to const and volatile, gives the compiler direct information about possible side effects.
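
A hedged sketch of what such declarations look like (restrict is standard C99; C++ compilers typically spell it __restrict as an extension, which is what is used here):

    // const: the callee promises not to modify the pointed-to data.
    // __restrict: the two pointers are promised not to alias, so stores
    // through dst cannot change what src points at.
    void scale(float *__restrict dst, const float *__restrict src,
               float factor, int n) {
        for (int i = 0; i < n; ++i)
            dst[i] = src[i] * factor;  // the compiler can keep src[i] in a
                                       // register and vectorize, since the
                                       // qualifiers rule out aliasing writes
    }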

So, if control passes outside the immediate expression being evaluated, via method or function calls, the compiler needs to either statically analyze the entire code path or make a determination based on the formal parameters.

In the worst case, the compiler may need to assume that anything in memory may have changed following a method or function call.

To get the best possible optimization, use language features in computation-intensive areas of code to make parameters as restricted as possible. Copy globals that are not modified by the expression into locals before use; this helps the compiler in several ways. Restrict the visibility of all objects as much as possible: prefer locals, then instance variables, then class variables, and avoid globals entirely.
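
A small sketch of the "copy globals to locals" advice (g_limit and process are invented for the example); the point is that after a call to an opaque function the compiler would otherwise have to assume the global may have changed and reload it:

    extern int g_limit;              // global that, as far as the programmer
                                     // knows, process() never modifies
    void process(int *data, int n);  // opaque: defined in another translation unit

    void run(int *data, int n) {
        const int limit = g_limit;   // snapshot the global in a local
        // With the local copy, 'limit' can live in a register; without it,
        // the compiler may have to re-read g_limit after every call, since
        // process() might have modified any global.
        for (int i = 0; i < n && data[i] < limit; ++i)
            process(data + i, 1);
    }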

DigitalRoss
+1  A: 

Simplistically, there is a list of side-effect-free (aka pure) operations. But most operations are only somewhat side-effect free. For example, in Java, reading from an array is pure only if you can statically tell that the index is in bounds (otherwise it may throw an exception). Many other analyses, such as Value Range Propagation, can help with this.

Typically, you want to tell whether a function call is pure. For built-in functions, the compiler might just have a list (although a function modelled by the compiler, such as sin in C, might be side-effect free in some cases and might modify errno in others).
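
To make the errno point concrete, here is a small sketch (sqrt is used because its domain error is the clearest case; whether errno is actually set depends on the implementation's math_errhandling):

    #include <cmath>
    #include <cerrno>
    #include <cstdio>

    int main() {
        errno = 0;
        double r = std::sqrt(-1.0);  // domain error: the result is NaN
        // If the implementation reports errors via errno (MATH_ERRNO is set
        // in math_errhandling), errno is now EDOM -- so the call had a side
        // effect, and the compiler cannot treat it as pure unless told
        // otherwise (e.g. GCC/Clang's -fno-math-errno).
        std::printf("result=%f errno=%d\n", r, errno);
        return 0;
    }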

To tell if a function is pure, you must know that it doesn't write to memory which escapes - that is, global variables, class variables, or any memory accessible through the return value or parameters. This is called mod-ref analysis (aka alias analysis or pointer analysis, and the same basic idea as escape analysis). A simplified algorithm (a toy sketch follows the list) is:

  • collect all names (global x, y->f, *y, **y etc) in the program
  • construct a graph of names which may point to the same memory location
  • when a name is written to (i.e. *x = 5;), get a list of the other names which might also be written to
  • if one of those escapes from the function, then the function is not pure.
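
As a toy illustration of those steps (the data model and names here are invented for the sketch; a real compiler's representation is far richer): names are grouped into may-alias sets, escaping names are marked, and a function is impure if it writes to a name whose alias set contains an escaping name.

    #include <map>
    #include <set>
    #include <string>
    #include <vector>
    #include <iostream>

    // Union-find over names: names in the same set may refer to the same
    // memory location ("may-alias").
    struct AliasSets {
        std::map<std::string, std::string> parent;
        std::string find(const std::string &n) {
            if (!parent.count(n)) parent[n] = n;
            if (parent[n] == n) return n;
            parent[n] = find(parent[n]);   // path compression
            return parent[n];
        }
        void may_alias(const std::string &a, const std::string &b) {
            parent[find(a)] = find(b);
        }
    };

    // Toy purity check: impure if any written name may alias a name that
    // escapes (a global, a parameter, or memory reachable from the return value).
    bool is_pure(AliasSets &aliases,
                 const std::set<std::string> &escaping_names,
                 const std::vector<std::string> &written_names) {
        std::set<std::string> escaping_roots;
        for (const auto &e : escaping_names)
            escaping_roots.insert(aliases.find(e));
        for (const auto &w : written_names)
            if (escaping_roots.count(aliases.find(w)))
                return false;  // the write may reach memory visible outside
        return true;
    }

    int main() {
        AliasSets a;
        a.may_alias("*p", "g");                       // *p may point at the global g
        std::set<std::string> escapes = {"g", "*p"};  // global and parameter memory
        std::cout << is_pure(a, escapes, {"tmp"}) << "\n";        // 1: only local writes
        std::cout << is_pure(a, escapes, {"tmp", "*p"}) << "\n";  // 0: writes escaping memory
        return 0;
    }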

This isn't the greatest explanation, I think; suggestions and questions are welcome.

Note that some languages, like C++, provide annotations related to purity (a const member function, not to be confused with a pure virtual function), and even allow some advanced features like mutable. I've been told that they aren't particularly useful for optimization, though, but I've not written a C++ compiler so I don't know why. I would speculate that separate compilation ruins their fun, perhaps.
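
For reference, the C++ annotations mentioned here look roughly like this (the class is invented for the example); it also shows one reason const is weaker than real purity: mutable state may still change inside a const member function.

    class Table {
    public:
        // const member function: promises not to modify the object's
        // observable state, so it may be called through a const Table.
        int lookup(int key) const {
            ++hits_;         // legal despite const, because hits_ is mutable
            return key * 2;  // stand-in for the real lookup
        }
    private:
        mutable int hits_ = 0;  // mutable: may change even in const members
    };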

Paul Biggar