ansaurus

Question

What kinds of functions are candidates for memoization?

Answer 1

+2 A:

Well, any function, theoretically, is a candidate for memoization. However, remember that memoization is all about trading space for speed -

In general, this means that, the more state a function requires or depends on in order to compute an answer, the higher the space cost, which reduces the desirability to memoize the method.

Both of your examples are basically cases where more state would need to be saved. This has two side effects.

First, this will require a lot more memory space in order to memoize the function, since more information will need to be saved.

Second, this will potentially slow down the memoized function, since the larger the space, the higher the cost of the lookup of the answer, as well as the higher the cost in finding whether or not a result was previously saved.

In general, I tend to only consider functions that have few possible inputs and low storage requirements, unless there is a very high cost in calculating the answer.

I acknoledge that this is vague, but this is part of the "artistry" in architecture. There's no "right" answer without implementing both options (memozied and non-memoized functions), profiling, and measuring.

Reed Copsey 2009-07-29 17:55:50

Can you point me to a document or library that discusses memoizing big-data functions? That is, functions with large parameters or required instance variables?

Chris McCall 2009-07-29 19:53:23

I haven't seen any - although I've done this. The key to making this effective is having a very good (fast and low conflict) hash generation and using a Dictionary<parameters,result> on your required parameters. This makes the lookup and storage easier, but keeps it somewhat efficient. If you do that, it's pretty easy to memoize any function - just keep track of the information and the answer, and check your results.

Reed Copsey 2009-07-29 20:15:21

...or, in the case of an instance method that requires data from inside it, Dictionary<instance, parameters, result>?

Chris McCall 2009-07-29 20:25:15

Chris: I'd put all of the information into a custom "parameters" structure, including the instance info. You need to have value types for everything that the function requires, and a good, single hash code to work from there. This should include instance data + paramter data.

Reed Copsey 2009-07-29 20:44:32

Answer 2

+1 A:

You can only memoize a function if all of its inputs are value types or immutable reference types, if it returns either a value type or a new instance of a reference type, and if it has no side effects. Period.

Memoization depends on a deterministic mapping between inputs and outputs. Every call to F(a, b, c) in which a, b, and c contain the same values must return the same result in order for memoization to be possible.

If a parameter's a reference type, then even though its value doesn't change, multiple calls to the function using it may produce a different result. A trivial example:

public int MyFunction(MyType t)
{
   return t.Value;
}

Console.WriteLine(MyFunction(t));
t.Value++;
Console.WriteLine(MyFunction(t));

Similarly, if a function depends on a value external to it, then multiple calls to that function with the same parameters can return different results:

int Value = 0;

public int MyFunction(int input)
{
   return Value;
}

Console.WriteLine(MyFunction(1));
Value++;
Console.WriteLine(MyFunction(1));

And heaven help you if your memoized function does something other than return a value or a new reference type:

int Value = 0;

public int MyFunction(int input)
{
   Value++;
   return input;
}

If you call that function 10 times, Value will be 10. If you refactor it to use memoization and then call it 10 times, Value will be 1.

You can start going down the path of figuring out how to memoize state, so that you can phony up a function that memoizes a reference type. But what you're really memoizing is the set of values that the function works on. You can similarly hack a memoized function that has side effects so that its side effects occur before the memoization. But this is all begging for trouble.

If you want to implement memoization into a function that takes a reference type, the proper approach is to refactor out the part of the function that only works on value types, and memoize that function, e.g.:

public int MyFunction(MyType t)
{
   return t.Value + 1;
}

to this:

public int MyFunction(MyType t)
{
   return MyMemoizableFunction(t.Value);
}

private int MyMemoizableFunction(int value)
{
   return value + 1;
}

Any other approach to implementing memoization that you take either a) does the same thing, through more obscure means, or b) won't work.

Robert Rossney 2009-07-29 20:27:09

This answer could use some clarification: you can certainly memoize a function if it deals only in *immutable* reference types. The downer is that there is no hard concept of an immutable reference type in .NET (yet). But it is certainly possible to construct them; hence while your answer does have a valid insight behind it, the opening sentence is basically untrue.

Daniel Earwicker 2009-07-29 20:32:59

I'm gonna take this answer anyway. In my head, I changed "can" to "should" in the first sentence.

Chris McCall 2009-07-29 21:38:51

Quite right. Also, I should have mentioned the issue of side effects in the first sentence, since memoizing a function that has side effects is a Very Bad Thing.

Robert Rossney 2009-07-29 21:59:41

This isn't exactly true - you can memoize a function that works with reference types, even mutable reference types, as long as you only work with value types contained within those reference types. This is basically teh same as refactoring out the values from your references, but technically, it works fine. You need to make sure that your memoization takes this into account, though.

Reed Copsey 2009-07-29 23:17:01

That's what I meant by "does the same thing, through more obscure means."

Robert Rossney 2009-07-29 23:43:26

@Robert - at least remove the word "period"!

Daniel Earwicker 2009-07-30 08:25:07

Answer 3

A:

You already have thought of a way to provide an AOP solution to provide memoization around function Foo, so what is there left to figure out?

Yes, you can pass an object of arbitrary complexity as a parameter to a memoized function, as long as it is immutable, as are all the things it depends upon. Again, this is not at all easy to discover statically at the moment.

Are you still wedded to the idea that you can statically examine the code in order to advise your users on the question "Is it a good idea to apply memoization to function Foo?"

If you make that one of your requirements, you will be joining a global research effort that has lasted many years so far. Depends on how ambitious you are, I guess.

Daniel Earwicker 2009-07-29 20:43:29

ansaurus

tags:

views:

answers:

What kinds of functions are candidates for memoization?

related questions