Hello all :)

I've been looking to recognise a language which does not fit the general Flex/Bison paradigm. It has completely different token rules depending on semantic context. For example:

main() {
    batchblock
    {
echo Hello World!
set batchvar=Something
echo %batchvar%
    }
}

Bison apparently supports these kinds of grammars, but it needs "lexical tie-ins" to handle them effectively. It provides an interface for doing this, but I'm confused as to how exactly I can supply different flex regexes depending on the context, if this is even possible.

Thanks in advance :)

+2  A: 
I'm confused as to how exactly I can supply different flex regexes depending on the context

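Flex has a state mechanism (the manual calls these "start conditions") whereby you can switch it between different sets of regexes. You declare a state with

%x name_of_state

in the definitions section at the top of the file (before the first %%), and then prefix your matching rules (after the first %%) with the state name, with no space between the prefix and the regex:

<name_of_state>regex_goes_here

%x declares an exclusive state, in which only rules carrying that prefix are active. %s declares an inclusive state, in which rules with no prefix stay active as well.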

Then this regex is only matched when in that state. The special prefix <*> is not a state of its own; a rule written as <*>regex is active in every state.

There is more than one way to change states. BEGIN(name_of_state) switches directly to a state, and BEGIN(INITIAL) goes back to the initial state. If you want to keep a stack of states, use yy_push_state and yy_pop_state instead; these are only available when the scanner is built with %option stack.
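
To illustrate the stack variant, here is a minimal sketch, not a definitive implementation. The state names batch and str and the printed messages are made up for this example. A quoted string can start either at top level or inside a batch block, and yy_pop_state() returns to whichever state was active before, which a plain BEGIN cannot do without remembering the previous state yourself.

```lex
%option noyywrap stack

%{
#include <stdio.h>
%}

%x batch
%x str

%%

"batchblock"[ \t\n]*"{"   { yy_push_state(batch); /* enter the batch rules */ }

<batch>"}"                { yy_pop_state();       /* leave the batch block */ }
<batch>[A-Za-z]+          { printf("batch word: %s\n", yytext); }

<INITIAL,batch>\"         { yy_push_state(str);   /* remember where we came from */ }
<str>[^"\n]+              { printf("string: %s\n", yytext); }
<str>\"                   { yy_pop_state();       /* back to INITIAL or batch */ }

<*>.|\n                   { /* ignore everything else, in every state */ }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Assuming the file is saved as scanner.l, it should build with something like "flex scanner.l && cc lex.yy.c -o scanner". Flex may warn about an unused yy_top_state function, which is harmless here.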

Kinopiko
Both good answers. Checking this one because it was faster.
Billy ONeal
+2  A: 

If your special block is always introduced by 'batchblock {', this can be handled entirely inside flex. On the Bison side (or byacc, if you want to make your life at least a little easier), you'd just see different tokens coming from the lexer, something like 'BATCH_ECHO'.

To handle it inside of flex, you'd use its start conditions capability:

%x batchblock
/* assumed definition: optional whitespace */
ws      [ \t\n]*
%%

"batchblock"{ws}\{   { BEGIN(batchblock); }


<batchblock>echo     { return BATCH_ECHO; }
<batchblock>set      { return BATCH_SET;  }
    /* ... */
<batchblock>\}       { BEGIN(INITIAL);    }

The patterns that start with <batchblock> can only match in the "batchblock" state, which is entered by the BEGIN(batchblock); action. The closing brace switches back to the initial state, where the ordinary rules apply again.

Jerry Coffin
Here you need to put the `<batchblock>` between backticks otherwise it looks like 'The patterns that start with *""*'.
Kinopiko
Thanks -- I believe I've got it fixed.
Jerry Coffin