Wikipedia's Interpolation Definition

I am just learning flex / bison and I am writing my own shell with it. I am trying to figure out a good way to do variable interpolation. My initial approach was to have flex scan for something like ~ for my home directory, or $myVar, and then set yylval.string to what a lookup function returns. My problem is that this doesn't help when the text to expand is only part of one token:

kbsh:/home/kbrandt% echo ~
/home/kbrandt
kbsh:/home/kbrandt% echo ~/foo
/home/kbrandt /foo
kbsh:/home/kbrandt%

The lex definition I have for variables:

\$[a-zA-Z/0-9_]+    {
    yylval.string = return_value(&variables, yytext + 1);  /* skip the '$' */
    return WORD;
}

Then in my Grammar, I have things like:

chdir_command:
    CD WORD { change_dir($2); }
    ;

Anyone know of a good way to handle this sort of thing? Am I going about this all wrong?

+1  A: 

Looks generally OK


I'm not sure what return_value is doing; hopefully it strdup(3)s the result, because yytext is just a buffer that flex will reuse.

If you are asking about the division of labor between the lexer and the parser, it's perfectly reasonable to push the macro processing and parameter substitution into the scanner and just have your grammar deal with WORDs, lists, commands, pipelines, redirections, etc. After all, it would be reasonable enough, albeit kind of out of style and possibly defeating the point of your exercise, to do everything in hand-written code.

I do think that making cd or chdir a terminal symbol and using that in a grammar production is...not the best design decision. Just because a command is a built-in doesn't mean it should appear as a rule. Go ahead and parse cd and chdir like any other command. Check for built-in semantics as an action, not a production.

After all, what if it's redefined as a shell procedure?

DigitalRoss
+2  A: 

The way 'traditional' shells deal with things like variable substitution is difficult to handle with lex/yacc. What they do is more like macro expansion, where AFTER expanding a variable, they then re-tokenize the input, without expanding further variables. So for example, an input like "xx${$foo}" where 'foo' is defined as 'bar' and 'bar' is defined as '$y' will expand to 'xx$y' which will be treated as a single word (and $y will NOT be expanded).

You CAN deal with this in flex, but you need a lot of supporting code. You need to use flex's yy_buffer_state mechanism to redirect expanded text into a buffer that the scanner then reads from, and use start states carefully to control when variables can and can't be expanded.
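A sketch of the buffer-switching idea, using flex's documented yy_scan_string / yy_switch_to_buffer / yy_delete_buffer calls. The expand() helper is hypothetical, and this handles only one level of nesting; a real shell would keep a stack of saved buffers (and start states to suppress re-expansion, as described above):

```lex
%{
/* expand() is a hypothetical lookup returning the variable's value. */
static YY_BUFFER_STATE saved_buf;   /* one level deep, for illustration */
%}
%%
\$[A-Za-z_][A-Za-z0-9_]*   {
    /* Remember where we were, then re-scan the expansion so it is
       re-tokenized; yy_scan_string switches to a new buffer itself. */
    saved_buf = YY_CURRENT_BUFFER;
    yy_scan_string(expand(yytext + 1));
}
<<EOF>>  {
    if (saved_buf) {
        yy_delete_buffer(YY_CURRENT_BUFFER);
        yy_switch_to_buffer(saved_buf);   /* resume the original input */
        saved_buf = NULL;
    } else
        yyterminate();
}
```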

It's probably easier to use a very simple lexer that returns tokens like ALPHA (one or more alphabetic chars), NUMERIC (one or more digits), or WHITESPACE (one or more spaces or tabs), and have the parser assemble them appropriately, ending up with rules like:

simple_command: wordlist NEWLINE ;

wordlist: word | wordlist WHITESPACE word ;

word: word_frag
    | word word_frag { $$ = concat_string($1, $2); }
;

word_frag: single_quote_string
         | double_quote_string
         | variable
         | ALPHA
         | NUMERIC
        ...more options...
;

variable: '$' name { $$ = lookup($2); }
        | '$' '{' word '}' { $$ = lookup($3); }
        | '$' '{' word ':' ....

As you can see, this gets complex quite fast.

Chris Dodd