I've got an idea for caching that I'm beginning to implement:

Memoizing functions and storing the return along with a hash of the function signature in Velocity. Using PostSharp, I want to check the cache and return a rehydrated representation of the return value instead of calling the function again. I want to use attributes to control this behavior.

Unfortunately, this could prove dangerous to other developers in my organization, if they fall in love with the performance gain and start decorating every method in sight with caching attributes, including some with side effects. I'd like to kick out a compiler warning when the memoization library suspects that a function may cause side effects.

How can I tell that code may cause side effects using CodeDom or Reflection?

A: 

Simply put, you can't do this with either CodeDom or Reflection.

To accurately determine whether or not a method causes side effects, you must understand what actions it is taking. For .NET, that means cracking open the IL and interpreting it in some manner.

Neither Reflection nor CodeDom gives you this capability.

  • CodeDom is a mechanism for generating code into an application and has only very limited inspection capabilities. It's essentially limited to the subset of the language understood by the various parsing engines.
  • Reflection's strength lies in its ability to inspect metadata, not the underlying IL of the method bodies. Metadata can give you only a very limited set of information about what does and does not cause side effects.
JaredPar
"MetaData can only give you a very limited set of information as to what does and does not cause side effects." Any examples off the top of your head?
Chris McCall
@Chris, looking for DllImport and a couple of other items would tell you that a method is a P/Invoke method and hence must be assumed to have side effects.
JaredPar
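To illustrate the DllImport comment above, here is a minimal sketch of a metadata-only check. It assumes that testing the `MethodAttributes.PinvokeImpl` flag is an acceptable way to spot P/Invoke methods; the class `PInvokeCheck`, the helper `IsPInvoke`, and the two sample methods are hypothetical names made up for the example. It says nothing about what ordinary managed methods do.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

public static class PInvokeCheck
{
    // Hypothetical sample target: a P/Invoke declaration (declared, never called).
    [DllImport("kernel32.dll")]
    private static extern uint GetTickCount();

    // Hypothetical sample target: an ordinary managed method.
    private static int Add(int a, int b) => a + b;

    // True when the method's metadata marks it as implemented via P/Invoke.
    // Such a method should be assumed to have side effects.
    public static bool IsPInvoke(MethodBase method) =>
        (method.Attributes & MethodAttributes.PinvokeImpl) != 0;

    public static void Main()
    {
        const BindingFlags flags = BindingFlags.NonPublic | BindingFlags.Static;
        MethodInfo native = typeof(PInvokeCheck).GetMethod("GetTickCount", flags);
        MethodInfo managed = typeof(PInvokeCheck).GetMethod("Add", flags);

        Console.WriteLine(IsPInvoke(native));   // the extern declaration
        Console.WriteLine(IsPInvoke(managed));  // the managed method
    }
}
```

A check like this can only ever say "definitely suspicious"; a method that passes it may still mutate state freely.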
A: 

Reflection in itself won't do it, because the metadata doesn't have any such attributes.

CodeDom may not be powerful enough to inspect all IL instructions.

So you'd have to use the very low-level pieces of the reflection API that let you get a byte[] containing the raw IL of each method, and analyze that. So it's possible in principle, but not easy.

You'd have to analyze all the instructions and observe what effects they have, and whether those effects are going to survive outside of some significant scope (e.g. do they modify the fields of objects that can leak out through return values or out parameters, or do they just modify transient objects that are guaranteed to be unreachable outside the method?).
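As a rough idea of what "get the raw IL and analyze it" might look like, here is a hedged sketch that walks a method body and reports whether it contains a direct field store (`stfld` or `stsfld`). The class `FieldWriteScanner` and its sample methods are hypothetical. It is deliberately naive: it does not follow calls into other methods, ignores array and indirect stores, and makes no attempt at the escape analysis described above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class FieldWriteScanner
{
    // Lookup from encoded opcode value to OpCode, built from the public OpCodes fields.
    private static readonly Dictionary<ushort, OpCode> OpCodeTable = BuildTable();

    private static Dictionary<ushort, OpCode> BuildTable()
    {
        var table = new Dictionary<ushort, OpCode>();
        foreach (FieldInfo f in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
        {
            if (f.GetValue(null) is OpCode op)
                table[unchecked((ushort)op.Value)] = op;
        }
        return table;
    }

    // Number of operand bytes that follow the opcode in the IL stream.
    private static int OperandSize(OpCode op, byte[] il, int operandStart)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            case OperandType.InlineSwitch: // case count, then one 4-byte target per case
                return 4 + 4 * BitConverter.ToInt32(il, operandStart);
            default: return 4; // tokens, 32-bit ints, long branch targets, float32
        }
    }

    // True if the method body directly stores to an instance or static field.
    public static bool WritesToField(MethodBase method)
    {
        byte[] il = method.GetMethodBody()?.GetILAsByteArray();
        if (il == null) return false; // abstract or extern: no IL to inspect

        int i = 0;
        while (i < il.Length)
        {
            ushort code = il[i++];
            if (code == 0xFE) code = (ushort)(0xFE00 | il[i++]); // two-byte opcode
            OpCode op = OpCodeTable[code]; // throws on an opcode missing from the table
            if (op == OpCodes.Stfld || op == OpCodes.Stsfld) return true;
            i += OperandSize(op, il, i);
        }
        return false;
    }

    // Hypothetical sample targets.
    private static int counter;
    private static int Pure(int x) => x * 2;
    private static int Impure(int x) { counter += x; return counter; }

    public static void Main()
    {
        const BindingFlags flags = BindingFlags.NonPublic | BindingFlags.Static;
        Console.WriteLine(WritesToField(typeof(FieldWriteScanner).GetMethod("Pure", flags)));
        Console.WriteLine(WritesToField(typeof(FieldWriteScanner).GetMethod("Impure", flags)));
    }
}
```

Even this toy shows where the difficulty lies: a `false` result only means "no direct field store in this one body", which is a long way from "pure".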

Sounds pretty complicated!

Daniel Earwicker
It does sound pretty complicated, doesn't it :( I'd settle for string parsing, looking for keywords or other text constructs with regular expressions if I had to. It doesn't have to be perfect, but I think it would be a good idea to at least try to warn developers that they may be making a bad choice by memoizing the function.
Chris McCall
I seriously doubt regular expressions will be powerful enough to distinguish pure C# functions from side-effecting methods, to any useful extent. Remember, you need to know what APIs in the BCL have side-effects. `Console.WriteLine` does. Then there are the situations where a function returns a different value given the same arguments (or none), e.g. `Random.Next` or `DateTime.Now` - you don't want to be caching their results. How many other examples are there? I've no idea... I think this is THE major hole in the .NET framework and BCL today, and hopefully it will be a focus for version 5.
Daniel Earwicker
A: 

This is an extremely hard problem, both in practice and in theory. We're thinking hard about ways to either prevent or isolate side effects for precisely your scenarios -- memoization, automatic parallelization, and so on -- but it's difficult and we are still far from a workable solution for C#. So, no promises. (Consider switching to Haskell if you really want to eliminate side effects.)

Unfortunately, even if a miracle happened and you found a way to prevent memoization of methods with side effects, you've still got some big problems. Consider the following:

1) What if you memoize a function that is itself calling a memoized function? That's a good situation to be in, right? You want to be able to compose memoized functions. But memoization has a side effect: it adds data to a cache! So immediately you have a meta-problem: you want to tame side effects, but only "bad" side effects. The "good" ones you want to encourage, the bad ones you want to prevent, and it is hard to tell them apart.

2) What are you going to do about exceptions? Can you memoize a method which throws an exception? If so, does it always throw the same exception, or does it throw a new exception every time? If the former, how are you going to do it? If the latter, now you have a memoized function which has two different results on two different calls because two different exceptions are thrown. Exceptions can be seen as a side effect; it is hard to tame exceptions.

3) What are you going to do about methods which do not have a side effect but are nevertheless impure methods? Suppose you have a method GetCurrentTime(). That doesn't have a side effect; nothing is mutated by the call. But this is still not a candidate for memoization because any two calls are required to produce different results. You don't need a side-effects detector, you need a purity detector.
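Points 1 and 3 can be seen in a few lines. The following is only a sketch, assuming a naive dictionary-backed memoizer; `NaiveMemo`, `Memoize`, and `StaleReadDemo` are hypothetical names, and a captured `clock` variable stands in for `GetCurrentTime()`. The wrapped function mutates nothing, yet the memoized version keeps returning the first value it saw, while the memoizer itself quietly mutates its cache.

```csharp
using System;
using System.Collections.Generic;

public static class NaiveMemo
{
    // Hypothetical helper: caches results by argument and assumes f is pure.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> f)
    {
        var cache = new Dictionary<TArg, TResult>();
        return arg =>
        {
            if (!cache.TryGetValue(arg, out TResult value))
            {
                value = f(arg);      // if f throws, nothing is cached
                cache[arg] = value;  // the memoizer's own ("good") side effect
            }
            return value;
        };
    }

    // Returns (first memoized read, second memoized read, actual current value).
    public static (int First, int Second, int Actual) StaleReadDemo()
    {
        int clock = 0;
        // Stand-in for GetCurrentTime(): it only reads state and mutates nothing.
        Func<string, int> getCurrentTime = _ => clock;
        Func<string, int> memoized = Memoize(getCurrentTime);

        clock = 1;
        int first = memoized("now");
        clock = 2;
        int second = memoized("now"); // served from the cache

        return (first, second, getCurrentTime("now"));
    }

    public static void Main()
    {
        var (first, second, actual) = StaleReadDemo();
        Console.WriteLine($"memoized: {first}, {second}; actual: {actual}");
    }
}
```

A side-effects detector would pass `getCurrentTime` without complaint, which is why the check that is really needed is for purity.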

I think your best bet is to solve the human problem via education and code reviews, rather than trying to solve the hard technical problem.

Eric Lippert
F# also can be written in a fairly pure fashion.
Paul Nathan
@Paul, but F# can't *ensure* purity either. As Eric says, consider switching to Haskell...
Benjol