Given n integers, is there an O(n) or O(n log n) algorithm that can compute the maximum value of a mathematical expression that can be obtained by inserting the operators -, +, * and parentheses between the given numbers? Assume only binary variants of the operators, so no unary minus, except before the first element if needed.

For example, given -3 -4 5, we can build the expression (-3) * (-4) * 5, whose value is 60 and is the maximum possible.
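For small inputs, a brute-force search over every operator choice and parenthesization (with and without the leading unary minus) can serve as a reference oracle for validating faster algorithms. Here is a sketch in Python; `best` is just an illustrative name, and the search is exponential:

```python
def best(nums):
    """Maximum value over all operator insertions and parenthesizations.

    Exponential brute force -- only for checking faster algorithms on
    small inputs. A unary minus is allowed on the first element only.
    """
    def vals(seq):
        # set of every value achievable from the subsequence `seq`
        if len(seq) == 1:
            return {seq[0]}
        out = set()
        for k in range(1, len(seq)):
            for x in vals(seq[:k]):
                for y in vals(seq[k:]):
                    out.update((x + y, x - y, x * y))
        return out

    nums = tuple(nums)
    return max(vals(nums) | vals((-nums[0],) + nums[1:]))

print(best([-3, -4, 5]))   # 60, from (-3) * (-4) * 5
```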

Background:

I stumbled upon this problem some time ago while studying genetic algorithms, and learned that it can be solved fairly simply with a classical genetic algorithm. That runs slowly, however, and it's only simple in theory; the code gets rather ugly in practice (evaluating the expression, checking for correct placement of brackets, etc.). What's more, we're not guaranteed to find the absolute maximum either.

All these shortcomings of genetic algorithms got me wondering: since we don't have to worry about division, is there a way to do this efficiently with a more classical approach, such as dynamic programming or a greedy strategy?

Update:

Here's an F# program that implements the DP solution proposed by @Keith Randall, together with the improvement I described in a comment on his post. It is very inefficient, but I maintain that it's polynomial, with cubic complexity. It runs in a few seconds for arrays of ~50 elements. It would probably be faster if written in a fully imperative manner, as a lot of time is likely wasted on building and traversing lists.

open System
open System.IO
open System.Collections.Generic

let Solve (arr : int array) =
    let memo = new Dictionary<int * int * int, int>()

    // Inner st dr last = best value for arr.[st..dr] where the last
    // (top-level) operator applied is last: 0 = *, 1 = +, 2 = -
    let rec Inner st dr last =
        if st = dr then
            arr.[st]
        else
            match memo.TryGetValue((st, dr, last)) with
            | true, v -> v
            | _ ->
                let op a b =
                    match last with
                    | 0 -> a * b
                    | 1 -> a + b
                    | _ -> a - b
                let best =
                    [ for i in st .. dr - 1 do
                        for j in 0 .. 2 do
                            for k in 0 .. 2 do
                                yield op (Inner st i j) (Inner (i + 1) dr k) ]
                    |> List.max
                memo.Add((st, dr, last), best)
                best

    let noFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max

    // try again with a unary minus in front of the first element
    arr.[0] <- -arr.[0]
    memo.Clear()
    let yesFirst = [ for i in 0 .. 2 do yield Inner 0 (arr.Length - 1) i ] |> List.max

    // undo the mutation so the caller's array is left unchanged
    arr.[0] <- -arr.[0]

    [noFirst; yesFirst] |> List.max

let _ = 
    printfn "%d" <| Solve [|-10; 10; -10|]
    printfn "%d" <| Solve [|2; -2; -1|]
    printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6|]
    printfn "%d" <| Solve [|-5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6; -5; -3; -2; 0; 1; -1; -1; 6;|]

Results:

1000
6
540
2147376354

The last result is most likely wrong due to overflow; I'm just trying to show that a relatively big test runs too fast for this to be exponential.

+1  A: 

You should be able to do this with dynamic programming. Let x_i be your input numbers. Then let M(a,b) be the maximum value you can get with the subsequence x_a through x_b. You can then compute:

M(a,a) = x_a
M(a,b) = max_i(max(M(a,i)*M(i+1,b), M(a,i)+M(i+1,b), M(a,i)-M(i+1,b)))
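A quick memoized sketch of this recurrence (with a hypothetical name, `dp_max_only`) is easy to write, and it also shows why tracking only maxima is not enough, as the comments below point out: on [-10, 10, -10] it returns 10, while the true maximum is (-10) * 10 * (-10) = 1000.

```python
from functools import lru_cache

def dp_max_only(a):
    # M(i, j) = maximum value of a[i..j], combining only subproblem maxima
    a = tuple(a)

    @lru_cache(maxsize=None)
    def M(i, j):
        if i == j:
            return a[i]
        return max(op(M(i, k), M(k + 1, j))
                   for k in range(i, j)
                   for op in ((lambda x, y: x * y),
                              (lambda x, y: x + y),
                              (lambda x, y: x - y)))

    return M(0, len(a) - 1)

print(dp_max_only([-10, 10, -10]))  # 10 -- misses the optimum 1000
```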

edit:

I think you need to compute both the max and min computable value using each subsequence. So

Max(a,a) = Min(a,a) = x_a
Max(a,b) = max_i(max(Max(a,i)*Max(i+1,b),
                     Max(a,i)*Min(i+1,b),
                     Min(a,i)*Max(i+1,b),
                     Min(a,i)*Min(i+1,b),
                     Max(a,i)+Max(i+1,b),
                     Max(a,i)-Min(i+1,b)))
...similarly for Min(a,b)...
Keith Randall
That looks quadratic however. +1 anyway, looks good.
IVlad
In fact this is cubic. Sorry, I missed your required time bound.
Keith Randall
I don't think this works, since the problem doesn't have optimal substructure. Consider the input [-10, 10, -10], the maximum is 1000, but it doesn't come from maximums of subsequences.
Sheldon L. Cooper
Actually guys that's `O(3^n)`, not `O(n^3)`. Since you have `n - 1` operator slots and 3 values for each, you have `3^(n-1)` possible solutions, and this approach checks every single one of them.
Dave Aaron Smith
@Sheldon - true, but it should be fixable by adding a third dimension : `M(a,b,c) = maximum value obtainable with the subsequence x_a through x_b, and the last operation is c`. @Dave - this would be implemented in the same way as the optimal matrix multiplication. It's cubic. http://en.wikipedia.org/wiki/Matrix_chain_multiplication - although I'll concede that I'm no longer 100% sure of its correctness. I still think there's a polynomial solution though.
IVlad
Ok, matrix chain multiplication is different because the answer space is polynomial. In this algorithm, the answer space is exponential. The point of dynamic programming is to pre-compute sub-problems to avoid unnecessary recursion. That doesn't help though if you have exponentially many sub-problems.
Dave Aaron Smith
@Dave - the way the algorithm is written, it will be cubic. Whether or not it will give the correct answer all the time, I'm not sure. I think that if we add a third dimension like I explained in my last comment, it will work. How do you know the answer space here is exponential and there's no approach (maybe other than the one this solution uses) that avoids unnecessary recursion?
IVlad
The answer space is exponential because there are `3^(n-1)` possible answers. I'm saying you ARE avoiding unnecessary recursion, but it won't help because you're setting out to compute exponentially many answers and compare them all. Here's a small example to hopefully help your intuition. With 3 matrices A, B, and C, there are 2 possible answers: (A * B) * C and A * (B * C). With 3 numbers [a, b, c], there are 9 possible answers: a + b + c, a + b - c, a + b * c, and so on. Try it for 4 matrices and 4 numbers, and see how they grow.
Dave Aaron Smith
The number of ways n factors can be completely parenthesized is also exponential. More precisely it's C_{n-1}, where C is the sequence of Catalan numbers: http://en.wikipedia.org/wiki/Catalan_number#Applications_in_combinatorics
Sheldon L. Cooper
@Dave - okay, but that is an informal argument. Also informally, I have posted an implementation of this method that, in my opinion, runs too fast for big tests to be exponential. @Sheldon - that doesn't mean anything by itself. Basically, you have to prove that you actually need to check ALL possibilities to get the optimum.
IVlad
@IVlad: I was responding to Dave's claim that the number of ways to parenthesize matrices does not grow exponentially. That's false.
Sheldon L. Cooper
@IVlad: the algorithm you proposed is obviously polynomial, since the number of subproblems is 3*n*(n+1)/2. But I'm not convinced about its correctness.
Sheldon L. Cooper
@Sheldon - I am not entirely convinced myself. Hopefully more people will share their expertise.
IVlad
@IVlad: I think your algorithm will break with input [0, 1, 2, -2]. The answer is ((0 - (1 + 2)) * (-2)) = 6, but your algorithm will respond 5.
Sheldon L. Cooper
@Sheldon, you're right, this doesn't work with [-10,10,-10]. I think you might be able to fix it by computing both the max and min of subsequences. And it is definitely cubic, you need to compute O(n^2) M values and each one takes O(n) time. You can order them to avoid any repeat calculation, for instance do the M(a,b) in order from smallest b-a to largest b-a.
Keith Randall
@Keith: Yes, that's what I implemented in my answer.
Sheldon L. Cooper
@Sheldon, I stand corrected about the growth in the ways you can parenthesize matrices. Good catch.
Dave Aaron Smith
@Sheldon - true, it's wrong. Good catch.
IVlad
+1  A: 

Work this in reverse Polish notation - that way you don't have to deal with parentheses. Next, put a - in front of every negative number (thereby making it positive). Finally, multiply them all together. Not sure about the complexity, probably about O(N).

EDIT: forgot about 0. If it occurs in your input set, add it to the result.

High Performance Mark
Watch out for numbers with `|x| < 2`, though. They need to be added, not multiplied.
KennyTM
@Kenny -- I was just editing the question while you were pointing out my mistake. Thanks.
High Performance Mark
I'd _hope_ that the problems disallows placing a - in front of every negative value - otherwise it becomes a trivial problem, not that it seems that complex from the outset.
Will A
I can see reverse polish helping in a genetic algorithm approach, but I don't see how it would help in a deterministic approach. Also, you wouldn't multiply a string of all 1s, so multiplying them all after they're positive isn't going to work.
IVlad
@Will A - true. Let's assume no unary minus, except for the first element. I'll edit this in.
IVlad
A: 

This feels NP-complete to me, though I haven't yet figured out how to do a reduction. If I'm right, then I could say:

  • Nobody in the world knows if any polynomial algorithm exists, let alone O(n log n), but most computer scientists suspect there isn't.
  • There are poly time algorithms to estimate the answer, such as the genetic algorithm you describe.
  • In fact, I think the question you mean to ask is, "Is there a reasonably useful O(n) or O(n log n) algorithm to estimate the maximum value?"
Dave Aaron Smith
No offense or anything, I know you're trying to help, but how can you say that when @Keith Randall posted a polynomial solution just a while ago? Polynomial algorithms that give exact answers definitely exist for this problem, I'm just looking for something efficient.
IVlad
Hey @IVlad, I believe your runtime analysis of @Keith's solution is incorrect. It's exponential, not polynomial. Check my logic in his comments and see if I've made a mistake.
Dave Aaron Smith
+3  A: 
KennyTM
+1, that's pretty nice. Your last two examples also serve as counterexamples however. We can get `6` for the last: `-2 * (-2 + (-1))`. Also, for `-5 -3 -2 0 1 -1 -1 6` we can do `-5 * (-3 + -2 + 0 - 1 + -1 + -1) * 6 = -30 * -8 = 240`.
IVlad
@IVlad: Ha right. The addition/subtraction of ±1s should be done on the largest integer instead of the final product. I'll try to improve the algorithm tomorrow.
KennyTM
Apparently we can get `540` for the second example. It works by multiplying the first three numbers, adding up all those until the `6` and multiplying with the first 3, then multiplying with the `6`.
IVlad
+4  A: 

Here's a proposed solution:

def max_result(a_):
  memo = {}
  a = list(a_)
  a.insert(0, 0)  # leading zero makes the optional unary minus expressible
  return min_and_max(a, 0, len(a)-1, memo)[1]

def min_and_max(a, i, j, memo):
  if (i, j) in memo:
    return memo[i, j]
  if i == j:
    return (a[i], a[i])
  min_val = max_val = None
  for k in range(i, j):
    left = min_and_max(a, i, k, memo)
    right = min_and_max(a, k+1, j, memo)
    for op in "*-+":
      for x in left:
        for y in right:
          val = apply_op(x, y, op)
          if min_val is None or val < min_val: min_val = val
          if max_val is None or val > max_val: max_val = val
  ret = (min_val, max_val)
  memo[i, j] = ret
  return ret

def apply_op(x, y, op):
  if op == '*': return x*y
  if op == '+': return x+y
  return x-y

max_result is the main function, and min_and_max is auxiliary. The latter returns the minimum and maximum results that can be achieved by sub-sequence a[i..j].

It assumes that the maximum and minimum results of a sequence are composed of the maximum and minimum results of its sub-sequences. Under this assumption, the problem has optimal substructure and can be solved with dynamic programming (or memoization). Run time is O(n^3).

I haven't proved correctness, but I have verified its output against a brute force solution with thousands of small randomly generated inputs.

It handles the possibility of a leading unary minus by inserting a zero at the beginning of the sequence.

EDIT

Been thinking a bit more about this problem, and I believe it can be reduced to a simpler problem in which all values are (strictly) positive and only operators * and + are allowed.

Just remove all zeroes from the sequence and replace negative numbers by their absolute value.

Furthermore, if there are no ones in the resulting sequence, the result is simply the product of all numbers.

After this reduction, the simple dynamic programming algorithm would work.
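A minimal sketch of this reduction (with an illustrative helper name, `reduce_problem`); under the claim above, the maximum of the original sequence equals the maximum of the reduced one using only * and +:

```python
from math import prod

def reduce_problem(a):
    # drop zeros; signs are absorbed by the optional leading unary minus
    return [abs(x) for x in a if x != 0]

b = reduce_problem([-5, -3, -2, 0, 6])
print(b)        # [5, 3, 2, 6]
print(prod(b))  # 180: no 1s remain, so the answer is the plain product
```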

EDIT 2

Based on the previous insights I think I found a linear solution:

def reduce_seq(a):
  # drop zeros and take absolute values; signs can be fixed
  # afterwards with the leading unary minus
  return [x for x in map(abs, a) if x > 0]

def max_result(a):
  b = reduce_seq(a)
  if len(b) == 0: return 0
  return max_result_aux(b)

def max_result_aux(b):
  best = [1] * (len(b) + 1)
  for i in range(len(b)):
    j = i
    total = 0
    while j >= 0 and i - j <= 2:
      total += b[j]
      best[i + 1] = max(best[i + 1], best[j] * total)
      j -= 1
  return best[len(b)]

best[i] is the maximum result that can be achieved by sub-sequence b[0..(i-1)].

EDIT 3

Here's an argument in favor of the O(n) algorithm based on the following assumption:

You can always achieve the maximum result with an expression of the form

+/- (a_1 +/- ... +/- a_i) * ... * (a_j +/- ... +/- a_n)

That is: a product of factors, each of which is an algebraic sum of terms (including the case of a single factor).

I will also use the following lemmas which are easy to prove:

Lemma 1: x*y >= x+y for all x,y such that x,y >= 2

Lemma 2: abs(x_1) + ... + abs(x_n) >= abs(x_1 +/- ... +/- x_n)

Here it goes.

The sign of each factor doesn't matter, since you can always make the product positive by using the leading unary minus. Hence, to maximize the product we need to maximize the absolute value of each factor.

Setting aside the trivial case in which all numbers are zeroes, in an optimal solution no factor will be composed only of zeroes. Therefore, since zeroes have no effect inside each sum of terms, and each factor will have at least one non-zero number, we can remove all zeroes. From now on, let's assume there are no zeroes.

Let's concentrate on each sum of terms separately:

(x_1 +/- x_2 +/- ... +/- x_n)

By Lemma 2, the maximum absolute value each factor can achieve is the sum of the absolute values of its terms. This can be achieved in the following way:

If x_1 is positive, add all positive terms and subtract all negative terms. If x_1 is negative, subtract all positive terms and add all negative terms.

This implies that the sign of each term does not matter: we can take the absolute value of each number and use only the operator + inside factors. From now on, let's assume all numbers are positive.

The crucial step, that leads to an O(n) algorithm, is to prove that the maximum result can always be achieved with factors that have at most 3 terms.

Suppose we have a factor of more than 3 terms. By Lemma 1, we can break it into two smaller factors of 2 or more terms each (hence each sums to 2 or more) without reducing the total result. We can break factors down repeatedly until no factor of more than 3 terms is left.

That completes the argument. I still haven't found a complete justification of the initial assumption. But I tested my code with millions of randomly generated cases and couldn't break it.
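The at-most-3-terms claim can also be checked empirically on the reduced problem (positive integers, contiguous groups, product of group sums). Here's a sketch with an illustrative `best_grouping` helper; restricting groups to 3 terms matches unrestricted grouping on all the small cases tried:

```python
from itertools import product

def best_grouping(nums, max_group):
    # best[i] = max product of contiguous-group sums covering nums[:i],
    # where each group contains at most max_group terms
    n = len(nums)
    best = [1] + [0] * n
    for i in range(1, n + 1):
        for g in range(1, min(max_group, i) + 1):
            best[i] = max(best[i], best[i - g] * sum(nums[i - g:i]))
    return best[n]

# exhaustive check over small positive sequences
for nums in product(range(1, 5), repeat=6):
    assert best_grouping(list(nums), 3) == best_grouping(list(nums), 6)

print(best_grouping([4, 1, 1, 1, 1, 4], 3))  # 64, i.e. 4 * (1+1) * (1+1) * 4
```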

Sheldon L. Cooper
Could you explain the logic behind the linear solution a bit? I also saw Nabb's post and was about to ask him the same thing, but apparently he deleted it. In the meantime I'll try to run some tests myself.
IVlad
The key insight is that you'll never need to add groups of more than 3 consecutive numbers. That's why the inner while loop executes at most 3 times. And actually, you only need to form sums when at least one 1 is involved. I'll try to write a longer explanation later.
Sheldon L. Cooper
@Sheldon - are you accounting for no unary minus? Why are you only working with the absolute value of every number?
IVlad
@IVlad: because you can always make the product of groups positive by using the leading unary minus. Hence, you can reduce the problem by considering absolute values and using operators * and + only.
Sheldon L. Cooper
@Sheldon L. Cooper Never more than 3 numbers? False! See 4 1 1 1 1 4. The maximum is 4*4*4 = 64; you need to sum the 1s...
Loïc Février
@Loïc - No, it's not necessary: 4 * (1 + 1) * (1 + 1) * 4 = 64.
Sheldon L. Cooper
@Sheldon L. Cooper Indeed...my mistake.
Loïc Février
@Sheldon - thanks for the clarification.
IVlad
A: 

This is my first post on stackoverflow, so I apologize in advance for missing any preliminary etiquette. Also, in the interest of full disclosure, Dave brought this problem to my attention.

Here's an O(N^2 log N) solution, mostly because of the repeated sorting step in the for loop.

  1. Absolute values: Remove zero elements and sort by absolute value. Since you are allowed to place a negative sign in front of your final result, it does not matter whether your answer is negative or positive. Only the absolute values of all numbers in the set matter.

  2. Multiplication only for numbers > 1: We make the observation that for any set of positive integers greater than 1, (e.g. {2,3,4}), the largest result comes from a multiplication. This can be shown by an enumerative technique or a contradiction argument over permitted operations + and -. e.g. (2+3)*4 = 2*4 + 3*4 < 3*4 + 3*4 = 2*(3*4). In other words, multiplication is the most "powerful" operation (except for the 1s).

  3. Addition of the 1s to the smallest non-1 numbers: For the 1s, since multiplication is a useless operation, we are better off adding. Here again we show a complete ordering on the results of addition. For illustration, consider again the set {2,3,4}. We note that: 2*3*(4+1) <= 2*(3+1)*4 <= (2+1)*3*4. In other words, we get the most "mileage" from a 1 by adding it to the smallest existing non-1 element in the set. Given a sorted set, this can be done in O(N^2 log N).

Here's what the pseudo-code looks like:

S = input set of integers;

S.absolute();
S.sort();

//delete all the 0 elements
S.removeZeros();

//remove all 1 elements from the sorted list, and store them
ones = S.removeOnes();

//now S contains only integers > 1, in ascending order S[0] ... S[end]
for each 1 in ones:
   S[0] = S[0] + 1; 
   S.sort();
end

max_result = Product(S);
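A direct Python translation of the pseudo-code above (a sketch; a min-heap replaces the repeated re-sort, since only the smallest element changes after each addition, and `max_product` is an illustrative name):

```python
import heapq
from math import prod

def max_product(nums):
    vals = [abs(x) for x in nums if x != 0]  # step 1: absolute values, zeros removed
    ones = [x for x in vals if x == 1]
    heap = [x for x in vals if x > 1]        # step 2: numbers > 1 get multiplied
    if not heap:
        if not ones:
            return 0
        heap = [ones.pop()]                  # all-ones input: seed with one of the 1s
    heapq.heapify(heap)
    for _ in ones:                           # step 3: add each 1 to the current smallest
        heapq.heappush(heap, heapq.heappop(heap) + 1)
    return prod(heap)

print(max_product([2, 3, 4]))  # 24
print(max_product([1, 2, 3]))  # 9 = (1+2) * 3
```

Note, though, that on `4 1 1 1 1 4` (the example discussed in the comments on the previous answer) this greedy yields (4+2) * (4+2) = 36, while 4 * (1+1) * (1+1) * 4 = 64, so the ordering claim in step 3 does not always produce the optimum.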
snakebeater