Given an n×n matrix of real numbers, you are allowed to erase any number (from 0 to n) of rows and any number (from 0 to n) of columns; after that, the sum of the remaining entries is computed. Come up with an algorithm that determines which rows and columns to erase in order to maximize that sum.

+9  A: 

Well the brute force method goes something like this:

  • For n rows there are 2^n subsets.
  • For n columns there are 2^n subsets.
  • For an n x n matrix there are 2^(2n) subsets.

0 elements is a valid subset, but obviously if you have 0 rows or 0 columns the total is 0, so there are really 2^(2n) - 2^(n+1) + 1 subsets, but that's no different.

So you can work out each combination by brute force as an O(a^n) algorithm (here a = 4, since 2^(2n) = 4^n). Fast. :)

It would be quicker to work out what the maximum possible value is and you do that by adding up all the positive numbers in the grid. If those numbers happen to form a valid sub-matrix (meaning you can create that set by removing rows and/or columns) then there's your answer.

Implicit in this is that if none of the numbers are negative then the complete matrix is, by definition, the answer.

Also, knowing what the highest possible maximum is possibly allows you to shortcut the brute force evaluation since if you get any combination equal to that maximum then that is your answer and you can stop checking.

Also, if all the numbers are non-positive, the answer is the maximum single value, as you can reduce the matrix to a 1 x 1 matrix containing just that value, by definition.

Here's an idea: construct 2^n - 1 n x m matrices, where 1 <= m <= n (one for each non-empty subset of the columns). Process them one after the other. For each n x m matrix you can calculate:

  1. The highest possible maximum sum (as per above); and
  2. Whether no numbers are positive allowing you to shortcut the answer.

If (1) is below the currently calculated highest maximum sum then you can discard this n x m matrix. If (2) is true then you just need a simple comparison to the current highest maximum sum.

This is generally referred to as a pruning technique.

What's more you can start by saying that the highest number in the n x n matrix is the starting highest maximum sum since obviously it can be a 1 x 1 matrix.

I'm sure you could tweak this into a (slightly more) efficient recursive tree-based search algorithm with the above tests effectively allowing you to eliminate (hopefully many) unnecessary searches.
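
For concreteness, here is a minimal Python sketch of the brute force with the positive-sum shortcut described above (the function name and return format are illustrative, not from the answer):

from itertools import product

def brute_force(m):
    # Enumerate every combination of kept rows and kept columns. The sum of
    # all positive entries is an upper bound, so stop early if it's reached.
    n = len(m)
    upper = sum(v for row in m for v in row if v > 0)
    best, best_keep = 0, ((False,) * n, (False,) * n)  # erasing everything gives 0
    for rows in product((False, True), repeat=n):
        for cols in product((False, True), repeat=n):
            s = sum(m[i][j]
                    for i in range(n) if rows[i]
                    for j in range(n) if cols[j])
            if s > best:
                best, best_keep = s, (rows, cols)
                if best == upper:
                    return best, best_keep  # hit the upper bound: done
    return best, best_keep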

cletus
I wonder if there is any smarter way to do it without brute forcing it... can't think of anything yet.
Dani
Your answer is correct, btw ;-) See http://stackoverflow.com/questions/1720737/from-an-interview-removing-rows-and-columns-in-n-by-n-matrix-to-maximize-sum/1721084#1721084
Pavel Shved
From the question I conclude that if all numbers are negative, removing all rows and columns yields a maximum value of 0, which is better than a 1x1 matrix with a negative number :-)
rsp
It depends on if a matrix with a dimension of 0 is legal but yes this had occurred to me too. :) I assumed it wasn't valid.
cletus
Building on some excellent observations regarding the nature of the problem, we can think of an algorithm that looks something like this: sum all non-negative elements; for every subset of those rows and columns that contain negative values, sum the matrix formed by removing those, until a max is found (short-circuit on any that equals the previously computed optimum). This is potentially much more efficient than a brute force of all subsets (if the number of rows/columns containing negative numbers is small). Obviously we could invert this based on the ratio of negative-containing rows/columns.
Jim Dennis
+1  A: 
  • Create an n-by-1 vector RowSums, and an n-by-1 vector ColumnSums. Initialize them to the row and column sums of the original matrix. O(n²)
  • If any row or column has a negative sum, remove (edit: the one with the minimum such sum) and update the sums in the other direction to reflect their new values. O(n)
  • Stop when no row or column has a sum less than zero.

This is an iterative variation improving on another answer. It operates in O(n²) time, which is the complexity limit for this problem (there are n² entries in the matrix, and even finding the minimum requires examining each cell once), but it fails for some cases mentioned in other answers.
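
A minimal Python sketch of the three steps above (it recomputes the sums each pass rather than doing the O(n) incremental update, and it inherits the failure cases shown below):

def greedy_prune(m):
    # Repeatedly delete the row or column whose sum is the most negative,
    # until no remaining row or column has a negative sum.
    rows, cols = set(range(len(m))), set(range(len(m[0])))
    while rows and cols:
        sums = [(sum(m[i][j] for j in cols), 'row', i) for i in rows]
        sums += [(sum(m[i][j] for i in rows), 'col', j) for j in cols]
        s, kind, idx = min(sums)
        if s >= 0:
            break  # stop: every remaining row and column sums to >= 0
        (rows if kind == 'row' else cols).discard(idx)
    return rows, cols  # indices of the kept rows and columns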

Edit: The following matrix has no negative rows or columns, but is also not maximized, and my algorithm doesn't catch it.

1     1     3        goal     1    3
1   -89   101        ===>     1  101
1   102   -99

The following matrix does have negative rows and columns, but my algorithm selects the wrong ones for removal.

 -5     1    -5      goal     1
  1     1     1      ===>     1
-10     2   -10               2

                     mine
                     ===>     1     1     1
280Z28
I will keep it downvoted until a proof that it works appears. Your answer without a proof is totally useless.
Pavel Shved
@Pavel: It's an interview question, not a math class.
280Z28
The essence of this interview question is math. Read it carefully: "**in order to maximize that sum**". Why does your algorithm maximize the sum? It's unclear why it should. You just remove some rows and columns, but you can't tell that you'll reach the goal of maximizing.
Pavel Shved
This will not work. The counter-example you gave in another answer holds true. The problem is that removing rows can turn columns that are candidates for removal into columns that should not be removed (by taking away the most negative terms), and vice versa. This algorithm does not take this fact into account as a criterion for early removal of a column/row.
David Rodríguez - dribeas
Same as in Alon's answer: it doesn't always work, because in a certain step you may delete "good" values (see the example in Pavel's comment on Alon's answer). It has to be improved in order to check the consequences of each step and the possibility to revoke a step.
Kai
@dribeas, kai, Pavel: I was working on proving it and I realized that "it" in my proof was not properly described in my algorithm.
280Z28
@Pavel: When I read "maximizing the sum", I read it like "we should get the highest possible value but it's ok if it fails for a few edge cases"
RCIX
@RCIX: Right now I can't find a counterexample to this finding the true maximum, but proving it is ... proving challenging.
280Z28
@RCIX when I read "maximize the sum", I read it as "maximize the sum". However, the problem in question is NP-complete (http://stackoverflow.com/questions/1720737/from-an-interview-removing-rows-and-columns-in-n-by-n-matrix-to-maximize-sum/1721084#1721084), so a technique that fails in some cases may be acceptable anyway. But you should then elaborate on which cases it fails at and how bad the result is.
Pavel Shved
'[-5 a -5] [b c d] [-10 e -10]' Assuming all letters are non-negative, to remove all negative numbers you can remove rows 1 and 3 or columns 1 and 3. The difference between row removal and column removal is (a+e)-(b+d). If a==b==d and e==d+1, then the solution is removing columns, but the step 2 heuristic (most negative sum) will tell you to remove by rows as -10 + e -10 is the most negative result.
David Rodríguez - dribeas
Upvoted for concise example where it doesn't work
Moishe
A: 

I cannot really produce an algorithm off the top of my head, but to me it 'smells' like dynamic programming, if that serves as a starting point.

David Rodríguez - dribeas
+11  A: 

The problem is NP-hard. (So you should not expect a polynomial-time algorithm for solving this problem. There could still be (non-polynomial time) algorithms that are slightly better than brute-force, though.) The idea behind the proof of NP-hardness is that if we could solve this problem, then we could solve the clique problem in a general graph. (The maximum-clique problem is to find the largest set of pairwise connected vertices in a graph.)

Specifically, given any graph with n vertices, let's form the matrix A with entries a[i][j] as follows:

  • a[i][j] = 1 for i == j (the diagonal entries)
  • a[i][j] = 0 if the edge (i,j) is present in the graph (and i≠j)
  • a[i][j] = -n-1 if the edge (i,j) is not present in the graph.

Now suppose we solve the problem of removing some rows and columns (or equivalently, keeping some rows and columns) so that the sum of the entries in the matrix is maximized. Then the answer gives the maximum clique in the graph:

  1. Claim: In any optimal solution, there is no row i and column j kept for which the edge (i,j) is not present in the graph. Proof: Since a[i][j] = -n-1 and the sum of all the positive entries is at most n, picking (i,j) would lead to a negative sum. (Note that deleting all rows and columns would give a better sum, of 0.)

  2. Claim: In (some) optimal solution, the set of rows and columns kept is the same. This is because starting with any optimal solution, we can simply remove all rows i for which column i has not been kept, and vice-versa. Note that since the only positive entries are the diagonal ones, we do not decrease the sum (and by the previous claim, we do not increase it either).

All of which means that if the graph has a maximum clique of size k, then our matrix problem has a solution with sum k, and vice-versa. Therefore, if we could solve our initial problem in polynomial time, then the clique problem would also be solved in polynomial time. This proves that the initial problem is NP-hard. (Actually, it is easy to see that the decision version of the initial problem — is there a way of removing some rows and columns so that the sum is at least k — is in NP, so the (decision version of the) initial problem is actually NP-complete.)
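
For concreteness, here is a small Python sketch that builds the reduction matrix from a graph (the input format, a vertex count plus a collection of 0-based vertex pairs, is an assumption):

def clique_matrix(n, edges):
    # a[i][i] = 1; a[i][j] = 0 if (i,j) is an edge; a[i][j] = -n-1 otherwise.
    a = [[-(n + 1)] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1
    for i, j in edges:  # edges: pairs of 0-based vertices (assumed format)
        a[i][j] = a[j][i] = 0
    return a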

Pavel Shved
Sounds smart, but it's not an answer... In fact it has not so much to do with the question. What you do is construct one special matrix of the class mentioned in the initial problem that represents a graph. The positive entries of this matrix represent the *edges* of the graph (not vertexes with a special property!). If you have a look at the matrix constructed from a simple square (corners = vertexes) you'll see there is not much connection between the solution matrices for our initial problem (there are two) and the maximal cliques (there are four).
Kai
Sorry, I failed.
Pavel Shved
With your edits I can now see an analogy. However, your values seem a bit arbitrary to me. Why not use 1 for 'being connected' (or 'edge') (and assuming every vertex is connected to itself) and -1 for 'not connected' (or 'no edge')?
Kai
The values need to be cumbersome because we need the maximal sub-matrix to be a sub-graph (i.e. address the issue in your first comment).
Pavel Shved
@Pavel, can you show that it's possible to verify a maximized matrix in P time?
280Z28
Great work. I was about to post a proof, but you fixed yours in the meantime. :-) The reduction (or at least the wording) could be greatly simplified... is it ok if I edit your answer, or post another version of the same proof? (If not, a couple of suggestions: you can just put a[i][j]=0 for (i,j) not an edge, and a[i][i]=n*n. Also, the word is "struck" not "stricken".)
ShreevatsaR
@280Z28, wwww, sorry, I completely forgot about it. A usual and shameful mistake. I'll just weaken my claims to "NP-hard" and modify if I come up with a solution.
Pavel Shved
@280Z28: That's not relevant: it's the *decision version* of the clique problem ("is there a clique of size ≥ k?") that is NP-complete, so the equivalent here would be "does there exist a way of removing some rows and columns to get a sum ≥ s?" And given a certificate (which rows and columns to remove), this is trivial to check in polynomial time, so the problem is indeed in NP (and NP-complete).
ShreevatsaR
If the problem is NP-hard, then there really isn't a solution all that much better than brute force (at least, not one that wouldn't surprise a whole lot of people), which simplifies the answer considerably.
David Thornley
@ShreevatsaR, of course you can, since you surely know what's going on in the domain. You can mark it as a CW as well.
Pavel Shved
@David Thornley, solving NP-complete problems with heuristics (and the aptitude to pick one that's good enough and prove its efficiency) is a valuable skill even in commercial development.
Pavel Shved
@Pavel: Coming up with good approximations to solutions to NP-hard problems by whatever means is a valuable skill, and is often doable. Coming up with actual solutions to NP-hard problems would be an even more valuable skill, but it isn't doable unless P=NP. The problem statement was to find an algorithm that would maximize the sum, not a heuristic to find a large sum.
David Thornley
Oh, yeah....it's possible that proving the problem NP-hard would be the optimum answer, but an interview question that requests a proof of NP-completeness on the spot seems perhaps a bit rigorous.
David Thornley
@David: I thought about it a bit and then created a question about NP-interviews: http://stackoverflow.com/questions/1724673/is-it-correct-to-ask-to-solve-an-np-complete-problem-on-a-job-interview
Pavel Shved
@Pavel: I made some gratuitous changes, but feel free to revert whatever you feel you preferred earlier. :-)
ShreevatsaR
@ShreevatsaR, awesome! And seems correct. I won't change anything, because, well, the reason why I came to SO was actually to learn how to word my thoughts better. But after two months I still learned nothing...
Pavel Shved
Good work. By the way, even approximation algorithms won't help much - there's no polynomial time approximation scheme for this problem, unless P=NP. (one can embed MAX-3SAT using similar trick, so this is APX-complete.)
sdcvvc
Note that as described it's still a bit more complicated than necessary. You can use -2 instead of -n-1. Any row or column containing a -2 must have a negative sum, since the only positive number is the 1 on the diagonal. Thus, removing such rows and columns increases the overall sum. (And then claim 2 isn't really needed.)
RD1
@RD1, hm, seems you're right about `-2`. But claim 2 is about something else: it's important to prove we can build a square sub-matrix, because only square sub-matrices map to graphs.
Pavel Shved
@Pavel Yes, you're right, I spoke too soon. I'd replace Claim 1 with: any optimal solution has the reduced matrix equal to a diagonal matrix with 0 and 1 entries. Proof: including any -2 means the row and column containing it have a negative sum, since their only positive entry is the 1 on the diagonal. This contradicts optimality. This makes Claim 2 more obvious, but yes, still worth stating.
RD1
+2  A: 

To try it in a simple way:

We need the valid subset of the set of entries {A00, A01, A02, ..., A0n, A10, ..., Ann} with the maximum sum.

First compute all subsets (the power set).

A valid subset is a member of the power set that, for each two contained entries Aij and A(i+x)(j+y), also contains the elements A(i+x)j and Ai(j+y) (which are the remaining corners of the rectangle spanned by Aij and A(i+x)(j+y)).

Aij ...
 .       .   
 .       .
    ... A(i+x)(j+y)

By that you can eliminate the invalid ones from the power set and find the one with the biggest sum among the remaining.

I'm sure it can be improved by adapting the power-set generation algorithm to generate only valid subsets, thereby avoiding step 2 (filtering the power set).
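
For illustration, a short Python sketch of the validity rule above (the function name is illustrative): a subset of kept cells is valid exactly when it is closed under completing rectangles, or equivalently, when it equals (its rows) x (its columns).

def is_valid_subset(cells):
    # cells: a set of (i, j) index pairs of the kept entries.
    rows = {i for i, j in cells}
    cols = {j for i, j in cells}
    # Valid iff every (row, column) combination from the kept cells is present.
    return all((i, j) in cells for i in rows for j in cols)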

Kai
Ouch. Not only is this awful efficiency-wise, it's not even correct. (The rows and columns don't have to form contiguous rectangles.) Consider just the efficiency: this goes through 2^(n^2) subsets (without the "improvement" part), which is very quickly much larger than 2^(2n). Even for 6x6 matrices, can you imagine going through all 2^36 subsets of the 36 entries? Compare the 64x64=4096 (subsets of rows)x(subsets of columns).
ShreevatsaR
No, it's not efficient, but it is still correct. I don't say they have to form *contiguous* rectangles; all I'm saying is that if two opposite corners of a rectangle are kept in a subset, the remaining corners have to be kept too.
Kai
And for efficiency: I still don't get your argument. How does your algorithm work with only the comparison of the 4096 subsets of the initial matrix?
Kai
Take every subset X of the rows (there are 2^6=64 of them), and every subset Y of the columns (there are 2^6=64 of them), and look at the sum of entries (x,y) for x in X, y in Y. That's the more direct approach too, don't you think?
ShreevatsaR
@ShreevatsaR: And once you have the sums, what's next? Different examples in comments on different approaches here show that you're never done until you've compared all valid subsets of entries, i.e. subsets that can be constructed by removing rows and columns. The only question is an efficient way to calculate those subsets. With my approach there are ways to make it greedy: let's say a subset is represented by a binary number where each bit shows whether the element is in or not. For a 3x3 matrix the first is 000000001 (only the element in the lower right corner remains). The first invalid one is 000001111. ->
Kai
I don't understand what the confusion is. By generating all subsets of rows and all subsets of columns, you *are* looking at all valid subsets of entries, so nothing is missed. Your approach (after your clarification) also generates all valid subsets, but it does incredibly more work: it's a question of 2^(2n) v/s 2^(n^2).
ShreevatsaR
-> From here no change in the upper 4 bits can heal that (xxxx01111 is invalid for every x in {0,1}), meaning you can leave 2^4 subsets behind. So whenever you come to an invalid one during your search, you can skip whole sets of subsets. By that, if you provide the right (elegant) tree search, you can increase efficiency a lot.
Kai
@ShreevatsaR: How do you create all valid subsets, and how do you know which one's best without comparing them (but only rS and cS)?
Kai
Of course you compare them! `best=0; for each subset X of the rows: for each subset Y of the columns: best = max(best, sum of elements in rows X and columns Y)`. See the answer of cletus, for example. (From your other comments, it seems that you're trying to discuss the *best-case* running time of your algorithm; I refuse to be drawn into such absurdity. :p Either worst-case running time, or average-case running with an analysis of the input distribution, are worth discussing.)
ShreevatsaR
If you don't want to get drawn into something you should be a little more careful with your ouches and it's-wrongs and a little more specific in your explanations (referring earlier to cletus' answer would have helped). And to be nice again: You did an impressive proof by modifying Pavel's idea.
Kai
Thanks. In my defence, I wasn't trying to be rude; I find the 2^(n^2) staggeringly large (it is impossible even for n=7!) so the "ouch" was a literal physiological reaction on reading this. :-) Just for future reference, what was missing from the first explanation? Was it the word "compare"?
ShreevatsaR
No, I just didn't get it ... When I looked at cletus' answer again I realized that those subsets (rows x columns) are simply the valid subsets, constructed in a much more efficient way. But I'm also sure that by using my rule above it is possible as well to find a way (algorithm) to construct the valid ones efficiently.
Kai
BTW: It wasn't really rude, it was just the goad that made it impossible for me to stop arguing (and maybe disabled my senses for the obvious...)
Kai
+2  A: 

Since nobody asked for an efficient algorithm, use brute force: generate every possible matrix that can be created by removing rows and/or columns from the original matrix, and choose the best one. A slightly more efficient version, which most likely can be proved to still be correct, is to generate only those variants where the removed rows and columns contain at least one negative value.

ammoQ
+3  A: 

We can improve on Cletus's generalized brute-force solution by modelling this as a directed graph. The initial matrix is the start node of the graph; its children are all the matrices missing one row or column, and so forth. It's a graph rather than a tree, because the node for the matrix without both the first column and row will have two parents - the nodes with just the first column or row missing.

We can optimize our solution by turning the graph into a tree: There's never any point exploring a submatrix with a column or row deleted that comes before the one we deleted to get to the current node, as that submatrix will be arrived at anyway.

This is still a brute-force search, of course - but we've eliminated the duplicate cases where we remove the same rows in different orders.

Here's an example implementation in Python:

# Simple matrix helpers used by the search below.
def matrix_sum(m):
  return sum(sum(row) for row in m)

def matrix_size(m):
  # (width, height); an empty matrix has size (0, 0)
  return (len(m[0]), len(m)) if m else (0, 0)

def delete_row(m, y):
  return m[:y] + m[y+1:]

def delete_column(m, x):
  return [row[:x] + row[x+1:] for row in m]

def maximize_sum(m):
  # Depth-first search over the deduplicated tree: columns are deleted in
  # increasing index order first, then rows (cols_done marks the switch).
  frontier = [(m, 0, False)]
  best = None
  best_score = 0

  while frontier:
    current, startidx, cols_done = frontier.pop()
    score = matrix_sum(current)
    if score > best_score or not best:
      best = current
      best_score = score
    w, h = matrix_size(current)
    if not cols_done:
      for x in range(startidx, w):
        frontier.append((delete_column(current, x), x, False))
      startidx = 0
    for y in range(startidx, h):
      frontier.append((delete_row(current, y), y, True))
  return best_score, best

And here's the output on 280Z28's example matrix:

>>> m = ((1, 1, 3), (1, -89, 101), (1, 102, -99))
>>> maximize_sum(m)
(106, [(1, 3), (1, 101)])
Nick Johnson
I think you can first decrease in a step and then, later on that graph, come to a solution. Imagine ((-3,10),(11,-8)): the whole matrix sums to 10; removing any row or column makes it lower (8, 7, 3 or 2), but you have to keep going, because the solution is 11.
Kai
@kai1968: that is completely true. No greedy algorithm (or heuristic thereof) will provide a maximal answer. That is why dynamic programming / memoization techniques are needed.
David Rodríguez - dribeas
Good counter-example. I removed my claim of hill-climbing. Turns out I forgot to build it into my example code in the first place, so that's fine. ;)
Nick Johnson
+1  A: 

I think there are some angles of attack that might improve upon brute force.

  1. memoization, since there are many distinct sequences of edits that will arrive at the same submatrix.
  2. dynamic programming. Because the search space of matrices is highly redundant, my intuition is that there would be a DP formulation that can save a lot of repeated work
  3. I think there's a heuristic approach, but I can't quite nail it down:

    if there's one negative number, you can either take the matrix as it is, remove the column of the negative number, or remove its row; I don't think any other "moves" result in a higher sum. For two negative numbers, your options are: remove neither, remove one, remove the other, or remove both (where the act of removal is either by axing the row or the column).

    Now suppose the matrix has only one positive number and the rest are all <=0. You clearly want to remove everything but the positive entry. For a matrix with only 2 positive entries and the rest <= 0, the options are: do nothing, whittle down to one, whittle down to the other, or whittle down to both (resulting in a 1x2, 2x1, or 2x2 matrix).

    In general this last option falls apart (imagine a matrix with 50 positives & 50 negatives), but depending on your data (few negatives or few positives) it could provide a shortcut.

Ron
Sometimes the maximum still contains negatives ((-1,2),(2,1))
Kai
I agree (that's why I admit the 2x2 ultimate matrix for the case where there are only 2 positive entries)
Ron
A: 

Big Edit: I honestly don't think there's a way to assess a matrix and determine it is maximized, unless it is completely positive.

Maybe it needs to branch, and fathom all elimination paths. You never know when a costly elimination will enable a number of better eliminations later. We can short-circuit if we've found the theoretical maximum, but otherwise any algorithm would have to be able to step forward and back. I've adapted my original solution to achieve this behaviour with recursion.

Double Secret Edit: It would also make great strides toward reducing the complexity if each iteration didn't need to find all negative elements. Considering that they don't change much between calls, it makes more sense to just pass their positions to the next iteration.

Takes a matrix, the list of current negative elements in the matrix, and the theoretical maximum of the initial matrix. Returns the matrix's maximum sum and the list of moves required to get there. The move list contains a list of moves denoting the row/column removed from the result of the previous operation.

I.e.: r1, r1

would translate to:

-1  1  0            1  2  4
-4  1 -4    ===>    5  7  1
 1  2  4
 5  7  1

  1. Return if sum of matrix is the theoretical maximum

  2. Find the positions of all negative elements unless an empty set was passed in.

  3. Compute sum of matrix and store it along side an empty move list.

  4. For each negative element:

    1. Calculate the sum of that element's row and column.

    2. Clone the matrix and eliminate whichever collection (row or column) has the minimum sum from that clone, and note that action in the move list.

    3. Clone the list of negative elements and remove any that are affected by the action taken in the previous step.

    4. Recursively call this algorithm providing the cloned matrix, the updated negative element list and the theoretical maximum. Append the moves list returned to the move list for the action that produced the matrix passed to the recursive call.

    5. If the returned value of the recursive call is greater than the stored sum, replace it and store the returned move list.

  5. Return the stored sum and move list.

I'm not sure if it's better or worse than the brute force method, but it handles all the test cases now. Even those where the maximum contains negative values.
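
A simplified Python sketch of the branching idea in the steps above (my simplification: it branches on deleting both the row and the column of each negative element rather than only the cheaper one, and it returns just the best sum, not the move list):

def best_sum(m):
    # Branch at every negative entry: try deleting its row, and separately
    # its column, and recurse; keep the best sum seen. Exponential, but it
    # handles the cases where the optimum still contains negative values.
    if not m or not m[0]:
        return 0
    best = sum(sum(row) for row in m)
    negatives = [(i, j) for i, row in enumerate(m)
                 for j, v in enumerate(row) if v < 0]
    for i, j in negatives:
        without_row = [row for k, row in enumerate(m) if k != i]
        without_col = [row[:j] + row[j + 1:] for row in m]
        best = max(best, best_sum(without_row), best_sum(without_col))
    return best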

EmFi
See 280Z28's answer for an example of a matrix with no negative rows or columns that is nevertheless not maximized.
Nick Johnson
Doesn't work for ((-3,10,0),(11,-5,-8),(0,-6,0)). You end up with the 10, but 11 is the solution.
Kai
And it has different solutions for different orders of the array of negatives.
Kai
I forgot about the sort, and optimized the solution. My algorithm now finds the correct answer for those problem cases.
EmFi
((-5,0,2),(0, -10, 100), (2, 100, 0)) still gives the wrong answer with your updated algorithm. Removing the most negative element is not always the solution.
Nick Johnson
Sometimes the maximum still contains negatives ((-1,2),(2,1))
Kai
I was wrong with my first example - the solution is not 11, it's ((-3,10),(11,-5)).
Kai
@Nick, @kai, et al: I've come to the conclusion that it's got to branch, and redefined the algorithm to handle those cases with recursion.
EmFi
A: 

Take each row and each column and compute the sum. For a 2x2 matrix this will be:

2     1
3   -10

Row(0) = 3, Row(1) = -7, Col(0) = 5, Col(1) = -9

Compose a new matrix:

Cost to take row      Cost to take column
       3                      5
      -7                      -9

Take out whatever you need to, then start again.

You just look for negative values in the new matrix. Those are the values that actually subtract from the overall matrix value. It terminates when there are no more negative "SUMS" values to take out (therefore all remaining columns and rows SUM something to the final result).

In an n x n matrix that would be O(n² log n), I think.

Jorge Córdoba
I didn't get it. Do I only take negative columns? When does it terminate?
Kai
Yeah, you just look for negative values in the new matrix. Those are the values that actually subtract from the overall matrix value. It terminates when there are no more negative "SUMS" to take.
Jorge Córdoba
The problem with this algorithm is the hidden complexity in the sentence: *Take out whatever you need to, then start again*. In more complex situations the solution will require removing lines (rows/columns) that impose a maximal cost, but in some other situations the optimal solution requires that those lines are not removed (and that some orthogonal lines that remove the most negative numbers are removed instead).
David Rodríguez - dribeas
Dribeas, the algorithm removes the row or column that, in sum, provides the major benefit. As you can only remove entire rows or entire columns, you can judge their weight by summing up how much you are taking away. On each pass you're taking away the row or column with the values that subtract the most from the entire matrix.
Jorge Córdoba
Oh, and when I say "take out what you need", that's just getting the minimum value of the entire composed matrix (which is n x 2) and taking out that column or row (e.g.: row 6, column 2 means remove column 6; row 4, column 1 means remove row 4).
Jorge Córdoba
+1  A: 

Compute the sum of each row and column. This can be done in O(m) (where m = n^2)

While there are rows or columns that sum to negative, remove the row or column that has the lowest sum that is less than zero, then recompute the sum of each row/column.

The general idea is that as long as there is a row or a column that sums to negative, removing it will result in a greater overall value. You need to remove them one at a time and recompute, because in removing that one row/column you are affecting the sums of the other rows/columns, and they may or may not have negative sums any more.

This will produce an optimally maximal result. The runtime is O(mn), or O(n³).

James Conigliaro
Exactly what I was thinking. However, I'm not sure that I can prove that the greedy approach will lead to a globally optimal answer. +1 anyways for the mind-meld :)
Drew Hall
This is exactly the algorithm I posted - see my post for two cases where it doesn't work.
280Z28
Yeah, this won't work.
A: 

This is an optimization problem and can be solved approximately by an iterative algorithm based on simulated annealing:

Notation: C is the number of columns.

For J iterations:

  1. Look at each column and compute the absolute benefit of toggling it (turn it off if it's currently on or turn it on if it's currently off). That gives you C values, e.g. -3, 1, 4. A greedy deterministic solution would just pick the last action (toggle the last column to get a benefit of 4) because it locally improves the objective. But that might lock us into a local optimum. Instead, we probabilistically pick one of the three actions, with probabilities proportional to the benefits. To do this, transform them into a probability distribution by putting them through a Sigmoid function and normalizing. (Or use exp() instead of sigmoid()?) So for -3, 1, 4 you get 0.05, 0.73, 0.98 from the sigmoid and 0.03, 0.42, 0.56 after normalizing. Now pick the action according to the probability distribution, e.g. toggle the last column with probability 0.56, toggle the second column with probability 0.42, or toggle the first column with the tiny probability 0.03.

  2. Do the same procedure for the rows, resulting in toggling one of the rows.

Iterate for J iterations until convergence.

We may also, in early iterations, make each of these probability distributions more uniform, so that we don't get locked into bad decisions early on. So we'd raise the unnormalized probabilities to a power 1/T, where T is high in early iterations and is slowly decreased until it approaches 0. For example, 0.05, 0.73, 0.98 from above, raised to 1/10 results in 0.74, 0.97, 1.0, which after normalization is 0.27, 0.36, 0.37 (so it's much more uniform than the original 0.05, 0.73, 0.98).

Note: This needs to be modified so that there is also the option of not toggling anything which results in a benefit of 0. So in the example, the benefits are really -3, 0, 1, 4.
A variant is to just pick one randomly selected column (row) per iteration and decide probabilistically if we toggle it or not.
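
A rough Python sketch of one column pass as described above, including the "toggle nothing" option from the note (the function name and the overflow clamp are illustrative; the row pass is symmetric):

import math
import random

def column_pass(matrix, row_on, col_on, T):
    # row_on / col_on are lists of booleans marking which rows/columns are on.
    def kept_sum():
        return sum(matrix[i][j]
                   for i in range(len(matrix)) if row_on[i]
                   for j in range(len(matrix[0])) if col_on[j])

    base = kept_sum()
    benefits = []
    for j in range(len(col_on)):
        col_on[j] = not col_on[j]            # tentatively toggle column j
        benefits.append(kept_sum() - base)
        col_on[j] = not col_on[j]            # undo
    benefits.append(0.0)                      # the "toggle nothing" option

    # Sigmoid squashing, flattened by the temperature T (clamped for safety);
    # random.choices normalizes the weights itself.
    weights = [(1.0 / (1.0 + math.exp(-max(min(b, 50.0), -50.0)))) ** (1.0 / T)
               for b in benefits]
    choice = random.choices(range(len(weights)), weights=weights)[0]
    if choice < len(col_on):
        col_on[choice] = not col_on[choice]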
A: 

It's clearly NP-Complete (as outlined above). Given this, if I had to propose the best algorithm I could for the problem:

  • Try some iterations of quadratic integer programming, formulating the problem as: SUM_ij a_ij x_i y_j, with the x_i and y_j variables constrained to be either 0 or 1. For some matrices I think this will find a solution quickly, for the hardest cases it would be no better than brute force (and not much would be).

  • In parallel (and using most of the CPU), use an approximate search algorithm to generate increasingly better solutions. Simulated annealing was suggested in another answer, but having done research on similar combinatorial optimisation problems, my experience is that tabu search would find good solutions faster. This is probably close to optimal in terms of wandering between distinct "potentially better" solutions in the shortest time, if you use the trick of incrementally updating the costs of single changes (see my paper "Graph domination, tabu search and the football pool problem"), as sketched below.

  • Use the best solution so far from the second above to steer the first by avoiding searching possibilities that have lower bounds worse than it.
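
For the incremental-update trick mentioned in the second bullet, something like the following Python sketch computes the cost of a single change in O(n) instead of re-evaluating the whole O(n²) objective (names are illustrative):

def row_toggle_delta(a, x, y, i):
    # Change in SUM_ij a[i][j] * x[i] * y[j] caused by flipping x[i],
    # where x and y are 0/1 indicator lists for kept rows and columns.
    contribution = sum(a[i][j] for j in range(len(y)) if y[j])
    return -contribution if x[i] else contribution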

Obviously this isn't guaranteed to find the maximal solution. But, it generally would when this is feasible, and it would provide a very good locally maximal solution otherwise. If someone had a practical situation requiring such optimisation, this is the solution that I'd think would work best.

Stopping at identifying that a problem is likely to be NP-Complete will not look good in a job interview! (Unless the job is in complexity theory, but even then I wouldn't.) You need to suggest good approaches - that is the point of a question like this: to see what you can come up with under pressure, because the real world often requires tackling such things.

RD1
A: 
function pruneMatrix(matrix) {
  max = -inf;
  bestRowBitField = null;
  bestColBitField = null;
  // Try every combination of kept rows and columns
  // (bit i of the bitfield set = row/column i kept).
  for(rowBitField=0; rowBitField<2^matrix.height; rowBitField++) {
    for (colBitField=0; colBitField<2^matrix.width; colBitField++) {
      sum = calcSumForCombination(matrix, rowBitField, colBitField);
      if (sum > max) {
        max = sum;
        bestRowBitField = rowBitField;
        bestColBitField = colBitField;
      }
    }
  }
  return removeFieldsFromMatrix(matrix, bestRowBitField, bestColBitField);
}

function calcSumForCombination(matrix, rowBitField, colBitField) {
  // Sum only the entries whose row and column are both kept.
  sum = 0;
  for(i=0; i<matrix.height; i++) {
    for(j=0; j<matrix.width; j++) {
      if (rowBitField & 1<<i && colBitField & 1<<j) {
        sum += matrix[i][j];
      }
    }
  }
  return sum;
}
bsdfghsrh
A: 

Yes, it's an NP-complete problem.

It's hard to find the best sub-matrix directly, but we can easily find some better sub-matrices.

Assume that we place m random points in the matrix as "seeds", then let them extend automatically by a rule like:

only add a new row or column to the seed matrix if doing so increases the sum.

Then we can compare the m sub-matrices to find the best one.
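
A possible Python sketch of this idea (the seeding and growth rule are interpreted loosely; names are illustrative):

import random

def grow_from_seeds(m, num_seeds=10):
    # Each seed starts as one random cell; rows/columns are added greedily
    # only while they increase the kept sum. The best of num_seeds runs wins.
    n = len(m)
    best_total, best_pick = float('-inf'), None
    for _ in range(num_seeds):
        rows, cols = {random.randrange(n)}, {random.randrange(n)}
        while True:
            gains = [(sum(m[i][j] for j in cols), 'row', i)
                     for i in range(n) if i not in rows]
            gains += [(sum(m[i][j] for i in rows), 'col', j)
                      for j in range(n) if j not in cols]
            if not gains:
                break
            gain, kind, k = max(gains)
            if gain <= 0:
                break  # no extension increases the sum
            (rows if kind == 'row' else cols).add(k)
        total = sum(m[i][j] for i in rows for j in cols)
        if total > best_total:
            best_total, best_pick = total, (sorted(rows), sorted(cols))
    return best_total, best_pick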

ken