Idea
A greedy approach is the way to go:
- If the current text is empty, you're done.
- Take the first N characters. If any of them is a digit then this is a new substring. Chop it off and go to beginning.
- Otherwise, extend the digitless segment to at most M characters. This is a new substring. Chop it off and go to beginning.
Proof
Here's a reductio-ad-absurdum proof that the above yields an optimal solution.
Assume there is a better split than the greedy split. Let's skip to the point where the two splits start to differ and remove everything before this point.
Case 1) A digit among the first N characters.
Assume that there is an input for which chopping off the first N characters cannot yield an optimal solution.
Greedy split: |--N--|...
A better split: |---|--...
^
+---- this segment can be shortened from the left side
However, the second segment of the putative better solution can be always shortened from the left side, and the first one extended to N characters, without altering the number of segments. Therefore, a contradiction: this split is not better than the greedy split.
Case 2) No digit among the first K (N < K <= M) characters.
Assume that there is an input for which chopping off the first K characters cannot yield an optimal solution.
Greedy split: |--K--|...
A better split: |---|--...
^
+---- this segment can be shortened from the left side
Again, the the "better" split can be transformed, without altering the number of segments, to the greedy split, which contradicts the initial assumption that there is a better split than the greedy split.
Therefore, the greedy split is optimal. Q.E.D.
Implementation (Python)
import sys
m, n, text = int(sys.argv[1]), int(sys.argv[2]), sys.argv[3]
textLen, isDigit = len(text), [c in '0123456789' for c in text]
chunks, i, j = [], 0, 0
while j < textLen:
i, j = j, min(textLen, j + n)
if not any(isDigit[i:j]):
while j < textLen and j - i < m and not isDigit[j]:
j += 1
chunks += [text[i:j]]
print chunks
Implementation (Java)
public class SO {
public List<String> go(int m, int n, String text) {
if (text == null)
return Collections.emptyList();
List<String> chunks = new ArrayList<String>();
int i = 0;
int j = 0;
while (j < text.length()) {
i = j;
j = Math.min(text.length(), j + n);
boolean ok = true;
for (int k = i; k < j; k++)
if (Character.isDigit(text.charAt(k))) {
ok = false;
break;
}
if (ok)
while (j < text.length() && j - i < m && !Character.isDigit(text.charAt(j)))
j++;
chunks.add(text.substring(i, j));
}
return chunks;
}
@Test
public void testIt() {
Assert.assertEquals(
Arrays.asList("asdas", "d332", "4asd", "fsdxf", "23"),
go(5, 4, "asdasd3324asdfsdxf23"));
}
}