I'm looking for an algorithm to segment a sequence of positive numbers into n subsequences, such that the standard deviation of the sum of the numbers in each subset is minimized.
The ordering of the numbers in each subsequence needs to be the same as the ordering in the original sequence
For example:
Suppose I have a sequence {1,1,1,1,1,1,10,1} that i wanted to segment into 2 subsequences.
I believe the optimal solution would be {1,1,1,1,1,1}, {10,1} .
The sum of the 1st subsequence is 6, the sum of the 2nd subsequence is 11
The standard deviation of the two numbers is ~3.5, which i believe is the lowest possible.
Suppose I have a sequence {4,1,1,1,1,6} that i wanted to segment into 3 subsequences.
I believe the optimal solution would be {4}, {1,1,1,1}, {6}
The sum of the subsequences is 4, 4, and 6.
The standard deviation of the 3 numbers is ~1.15, which i believe is the lowest possible.
The best algorithm i was able to come up with was to find the cumulative sum of each of the numbers in the sequence, and segment the sequence at each interval of [totalSum/numSubsequences].
For example, given the sequence {4,1,1,1,1,6} , the cumulative sums of the numbers of each sequence is {4,5,6,7,8,14}. The total of all numbers in the sequence is 14, so, given that i want 3 subsequences, i should segment the sequence when the total reaches 14/3 = 4.66 and 2 * 14/3 = 9.333333.
However, there is no actual place in the sequence where the cumulative total is equal to 4.66 - the first cumulative total is 4, and next cumulative total is 5. So should i round up or should i round down? In this case, rounding down to 4 gives the optimal solution, but that isn't always the case. The best I can think of is to try every combination of rounding up and down, but that results in O(2^numSubsequences) complexity.
This seems to be the type of thing that would have a preexisting algorithm to apply, however my Googling has failed me. I am aware of the Partition Problem, which is NP-complete, but that deals with unordered sets, and not ordered sequences.
Any help would be appreciated.