A first thought. You can determin the length n
of the string in O(log2(n))
.
- Check
Z*
where Z
represents k
question marks starting with 0, then 1, and then doubling the number of question marks with every check until no match occurs. n
must be between k / 2
and k
- Find the exact length using the same pattern changing
k
in the same way as binary search does.
Knowing the exact length might help to perform a kind of divide-et-impera in the spatial domain.
UPDATE
If you know the length, you can use the same pattern to correctly locate a symbol.
Example:
..X. ..XX (spaces added for readability)
+ symbol may be X
- symbol is not X
X symbol is X
*X* => MATCH ++++ ++++
*X* ???? => MATCH ++++ ++++
*X*?? ???? => NO MATCH --++ ++++
??X? ???? => MATCH --X+ ++++
??XX ???? => NO MATCH --X- ++++
??X? *X*?? => NO MATCH --X- --++
??X? ??X? => MATCH --X- --X+
??X? ??XX => MATCH --X- --XX
For string length n
and alphabet size m
this will take about O(log2(n))
to find the length of the string, about O(n • log2(n))
to correctly place n
symbols, and O(m)
to find the used symbols - summing all together yields O(n • log2(n) + m)
.
I could imagine that it is possible to speed this up by merging several steps - maybe test for used symbols while determining the string length or simultaneously locating two (or even more?) symbols in the first and second half of the string. This will require to recheck the merged steps in isolation if the check fails in order to determine which check faild. But as long as the merged check succeeds, you gain information on both.
Maybe I will calculate that tomorrow in order to see if it will really speed the thing up.