tags:

views:

90

answers:

2

I have a two-dimensional array of ints, for example:

int [][] board=
{
{23,17,3,29,12,10},
{17,4,11,12,10,19},
{32,33,25,25,28,35},
{27,29,24,25,23,37},
{29,40,34,26,24,39},
{23,37,29,36,31,3}
}

I don't want to change the columns of this array at all; however, I would like to swap the rows so that the most similar rows are grouped together. Similar in this case means most number of equal elements.

Edit: Similar rows means, if one row has 1,2,3,4,5,6 and another has 1,2,3,4,9,10 They have 4 similarities.

What's the best way to do this?

Note: the most number of rows I will have in my array is around 100 and the most number of elements in each row will be 10 so the complexity does matter as pointed out!

+5  A: 

This question reduces to the traveling salesman problem. If you think of each row as being a city and then define some distance function which computes the distance between two rows. The question is how to order the rows so that the distance is minimized. This problem is NP-Complete and cannot be solved in a reasonable amount of time for 100 rows. The brute-force solution for this would require O(N!) computations. There are heuristic algorithms (algorithms that get close to the best answer) that will solve this in a reasonable time.

Traveling Salesman Problem (Wikipedia)

One example is to use a greedy algorithm. Choose one row at random, this is row 1. Then choose the closest row to row 1 as row 2. Then choose the closest row to row 2 as row 3. Run until all the rows are chosen. This is not a very optimal solution.

Jon Snyder
Can you provide one such heuristic example algorithm?
Sev
I would like to point out that it is not the case that "the problem cannot be solved in a reasonable amount of time" -- rather *we don't know whether this can be solved in a reasonable amount of time or not*
Sev
Looks like this is the best answer I'm going to get here. Thanks.
Sev
Reading up on the wikipedia page, it looks like "branch and bound" type solutions could work in this situation and provide a solution within a reasonable amount of time.
Jon Snyder
@Sev: good point, but it boils down to "cannot be solved in a reasonable amount of time *unless you're smarter than all the other mathematicians in the world so far*". :-) @Jon, +1 for reducing to Traveling Salesman.
LarsH
@LarsH: true, but we must get our english verbiage correct or else we convey the wrong information :) -- saying that it cannot be solved, gives an impression of absolution.
Sev
A: 

I'm not an expert on proving algorithms, but I'll take a shot at helping. Also, I haven't tested this solution or given it more than 15 minutes of thought, but I think it will work or at least get you close. Remember heuristic algorithms are not 100% correct :) I'll take the risk of being down below 3K:

Sort each row, so the table you pasted after sorting looks like:

3,  10, 12, 17, 23, 29
4,  9,  10, 11, 12, 17
25, 25, 28, 32, 33, 35
23, 24, 25, 27, 29, 37
24, 26, 29, 34, 39, 40
3,  23, 29, 31, 36, 37

Now sort each row by the values in the first column of each row, so the result looks like:

3,  10, 12, 17, 23, 29
3,  23, 29, 31, 36, 37
4,  9,  10, 11, 12, 17
23, 24, 25, 27, 29, 37
24, 26, 29, 34, 39, 40
25, 25, 28, 32, 33, 35
Amir Afghani
Unfortunately, if you sort the row values, you are modifying the relative positions.
Sev
Sev, you did you see my prior comments about doing this on a copy table? And did you understand it? Please answer yes/no so I can respond.
Amir Afghani
You would need to map the indexes of each row.
Amir Afghani
So we'd have Table 1, and Table 1 Copy. You would do the first sort on Table 1 Copy. Then when you perform the second sort on Table 1 Copy you need to map the indices that moved back to Table 1.
Amir Afghani
Uh - oh kay. You should read why the run time complexity (not to be confused with how hard an algorithm is to understand) of the TSP is considered NP Hard.
Amir Afghani
Sev, with all due respect, if this is an assignment you're best off thinking it through. This is a tough problem, but I'm sure you can handle it.
Amir Afghani
Thanks again for the help.
Sev