Hi,
I am trying to do a sintax text corrector for my compilers' class. The idea is: I have some rules, which are inherent to the language (in my case, Portuguese), like "A valid phrase is SUBJECT VERB ADJECTIVE", as in "Ruby is great".
Ok, so first I have to tokenize the input "Ruby is great". So I have a text file "verbs", with a lot of verbs, one by line. Then I have one text "adjectives", one "pronouns", etc.
I am trying to use Ragel to create a parser, but I don't know how I could do something like:
%%{
machine test;
subject = <open-the-subjects-file-and-accept-each-one-of-them>;
verb = <open-the-verbs-file-and-accept-each-one-of-them>;
adjective = <open-the-adjective-file-and-accept-each-one-of-them>;
main = subject verb adjective @ { print "Valid phrase!" } ;
}%%
I looked at ANTLR, Lex/Yacc, Ragel, etc. But couldn't find one that seemed to solve this problem. The only way to do this that I could think of was to preprocess Ragel's input file, so that my program reads the file and writes its contents at the right place. But I don't like this solution either.
Does anyone knows how I could do this? There's no problem if it isn't with Ragel, I just want to solve this problem. I would like to use Ruby or Python, but that's not really necessary either.
Thanks.