Hi,
I'm developing a software to generate a Turing Machine from a regular expression.
[ EDIT: To clarify, the OP wants to take a regular expression as input, and programmatically generate a Turing Machine to perform the same task. OP is seeking to perform the task of creating a TM from a regular expression, not using a regular expression. ]
First I'll explain a bit what I have done and then what is my specific problem:
I've modeled the regular expression as follows:
RegularExpression (interface): the classes below implements this interface
Simple (ie: "aaa","bbb","abcde"): this is a leaf class it does not have any subexpressions
ComplexWithoutOr (ie: "a(ab)*","(a(ab)c(b))*"): this class contains a list of RegularExpression.
ComplexWithOr (ie: "a(a|b)","(a((ab)|c(b))"): this class contains an Or operation, which contains a list of RegularExpression. It represents the "a|b" part of the first example and the "(ab)|c(b)" of the second one.
Variable (ie: "awcw", where w E {a,b}*): this is not yet implemented, but the idea is to model it as a leaf class with some different logic from Simple. It represents the "w" part of the examples.
It is important that you understand and agree with the model above. If you have questions make a comment, before continue reading...
When it comes to MT generation, I have different levels of complexity:
Simple: this type of expression is already working. Generates a new state for each letter and moves right. If in any state, the letter read is not the expected, it starts a "rollback circuit" that finishes with the MT head in the initial position and stops in a not final state.
ComplexWithoutOr: here it comes my problem. Here, the algorithm generates an MT for each subexpression and concat them. This work for some simple cases, but I have problems with the rollback mechanism.
Here is an example that does not work with my algorithm:
"(ab)abac": this is a ComplexWithoutOr expression that contains a ComplexWithOr expression "(ab)" (that has a Simple expression inside "ab") and a Simple expression "abac"
My algorithm generates first an MT1 for "ab". This MT1 is used by the MT2 for "(ab)*", so if MT1 succeed it enters again in MT1, otherwise MT1 rollbacks and MT2 finishes right. In other words, MT2 cannot fail.
Then, it generates an MT3 for "abac". The output of MT2 it is the input of MT3. The output of MT3 is the result of the execution
Now, let suppose this input string: "abac". As you can see it matches with the regular expression. So let see what happens when the MT is executed.
MT1 is executed right the first time "ab". MT1 fails the second time "ac" and rollback, putting the MT head in the 3rd position "a". MT2 finishes right and input is forwarded to MT3. MT3 fails, because "ac"!="abac". So MT does not recognize "abac".
Do you understand the problem? Do you know any solution for this?
I'm using Java to develop it, but the language it is not important, I'd like to discuss the algorithm.