views:

168

answers:

3

I wonder if the objective function of a general dynamic programming problem can always be formulated as in the dynamic programming article on Wikipedia, where the objective function is a sum of terms for the action and state at every stage? Or is that just a special case, and what is the general formulation?


EDIT:

By "dynamic programming problem", I mean a problem that can be solved by dynamic programming technique. Such kind of problems possess the property of optimal problem and optimal structure.

But at least for me, it is sometimes not easy to identify such problems, perhaps because I have not become used to that kind of verbal description. When I came across the Wikipedia page for the Bellman equation, I did feel that a mathematical formulation of the cost function would help somehow. I suspect the overall cost/gain function can always be represented as an accumulation of the cost/gain from all the stages? And can the accumulation be additive, multiplicative, or something else?
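
For reference, the formulation I have in mind from that page looks roughly like this (my transcription of the standard form, with F the per-stage payoff, T the state transition, and β a discount factor):

  \max_{\{a_t\}} \sum_{t=0}^{\infty} \beta^t F(x_t, a_t) \quad \text{subject to} \quad x_{t+1} = T(x_t, a_t)

with the corresponding Bellman equation

  V(x) = \max_a \left\{ F(x, a) + \beta \, V(T(x, a)) \right\}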

When I posted my question, I did realize that it would be more proper to discuss dynamic programming somewhere more oriented toward mathematical optimization. But there is quite a lot of discussion of computer algorithms on stackoverflow.com, so I did not feel it improper to ask my question here either.

+2  A: 
Jason Orendorff
Thanks, Jason! I just ignore the β^t, as if you set it to 1. I interpret F as the gain/cost from one stage to the next. In hindsight, I think the overall cost/gain function can always be represented as an accumulation of the cost/gain from all the stages? And the accumulation can be additive, multiplicative, or something else?
Tim
Yes, there are many ways you could break the objective function down into per-stage pieces. The formula you cited breaks it into additive pieces; this has the advantage that it can describe every possible objective function. You could break an objective function down into per-stage factors instead, but that cannot describe *every* objective function: if one move puts the objective at 0 and the next move puts it at 1, there is no possible value of *F* for that second move.
Jason Orendorff
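
To make that last point concrete (a small sketch of my own, not part of the answer above): any overall objective J can be broken into additive per-stage pieces by telescoping, whereas a multiplicative decomposition breaks down as soon as an intermediate value hits zero:

  J_n = \sum_{t=1}^{n} F_t \quad \text{with} \quad F_t := J_t - J_{t-1}, \; J_0 := 0 \quad \text{(always possible)}

  J_n = \prod_{t=1}^{n} G_t \quad \text{would need} \quad G_t = J_t / J_{t-1}, \quad \text{undefined when } J_{t-1} = 0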
+1  A: 

Hi,

In computer science, dynamic programming denotes building an algorithm by recursively splitting a problem into subproblems, where the same subproblems appear many times in the recursive expansion. A simple textbook example: Fibonacci numbers can be calculated using dynamic programming.

From the recurrence F(n) = F(n-1) + F(n-2), with base cases F(0) = F(1) = 1, you could implement the following algorithm:

def fibonacci(n):
    # Base cases: F(0) = F(1) = 1
    if n < 2:
        return 1
    # Recompute both subproblems on every call -- exponential time
    return fibonacci(n - 1) + fibonacci(n - 2)

Now this is of course not efficient at all, because it creates a huge number of recursive calls, e.g.

F(8) = F(7) + F(6) = [F(6) + F(5)] + [F(5) + F(4)] = ... 

So here we already see that fibonacci(5) is computed twice by the implementation. The dynamic programming paradigm is now to "memoize" or "cache" the results, like this:

store = {}  # cache: n -> F(n)

def memofibo(n):
    if n < 2:
        return 1
    if n in store:  # already computed?
        return store[n]
    f = memofibo(n - 1) + memofibo(n - 2)
    store[n] = f    # cache the result before returning
    return f

This implementation ensures that the recursive step is executed at most once for every argument value of n, so it calculates the nth Fibonacci number in O(n) additions when 'store' is a hash map with expected O(1) lookups (or in O(n log n) time with a standard O(log n) tree-based associative array).
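As a quick illustration (my own addition, using the Python version above):

print(memofibo(30))   # -> 1346269
print(len(store))     # -> 29 cached entries (n = 2 .. 30), each computed once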

So from the computer science perspective, the link you provided describes the operations research / optimization version of the same idea (dividing a problem into subproblems), but in general computer science the idea has been abstracted in practice into this recursion + memoization pattern. I hope this helps to clear things up.

antti.huima
Thanks, antti! I appreciate the detail of the implementation in your reply. Yes, I agree the question has more of an operations research and mathematical background.
Tim
+1  A: 

Folks,

There's a new(ish) website that concentrates on operations research questions here, but the volume of traffic there is low, so you may not get a good answer very quickly.

Soapbox time:

For those who care to debate what's appropriate for Stack Overflow, let us note that an algorithm is an algorithm regardless of who claims it as part of their field. The simplex method, Dijkstra's algorithm, branch and bound, and Lagrangian relaxation are all algorithms or methods for solving certain types of problems. Many of these are taught and applied in both fields, so the border between OR and CS is pretty blurry in this area.

For instance (and a very strong instance it is), the undergrad course in algorithms at MIT includes all of the following: Randomized and Competitive Algorithms, Dynamic Programming, Greedy Algorithms, Minimum Spanning Trees, Shortest Paths, Dijkstra's Algorithm, Bellman-Ford, Linear Programming, Depth-first Search, Topological Sort, and All-pairs Shortest Paths, among other topics. I'll defer to MIT in this case.

I like Stack Overflow because many programmers recognize an optimization problem when they encounter it, but often they just need a little help deciding how to formulate the problem, or even learning what the problem is called.

Grembo