EDIT: Wow, many great responses. Yes, I am using this as a fitness function for judging the quality of a sort performed by a genetic algorithm. So cost-of-evaluation is important (i.e., it has to be fast, preferably O(n)
.)
As part of an AI application I am toying with, I'd like to be able to rate a candidate array of integers based on its monotonicity, aka its "sortedness". At the moment, I'm using a heuristic that calculates the longest sorted run, and then divides that by the length of the array:
public double monotonicity(int[] array) {
if (array.length == 0) return 1d;
int longestRun = longestSortedRun(array);
return (double) longestRun / (double) array.length;
}
public int longestSortedRun(int[] array) {
if (array.length == 0) return 0;
int longestRun = 1;
int currentRun = 1;
for (int i = 1; i < array.length; i++) {
if (array[i] >= array[i - 1]) {
currentRun++;
} else {
currentRun = 1;
}
if (currentRun > longestRun) longestRun = currentRun;
}
return longestRun;
}
This is a good start, but it fails to take into account the possibility that there may be "clumps" of sorted sub-sequences. E.g.:
{ 4, 5, 6, 0, 1, 2, 3, 7, 8, 9}
This array is partitioned into three sorted sub-sequences. My algorithm will rate it as only 40% sorted, but intuitively, it should get a higher score than that. Is there a standard algorithm for this sort of thing?