views:

190

answers:

3

Hello, I'm writing a compiler for a shading engine and every worked fine until I reached the statements parsing part.

I used an abstract syntax tree defined with classes to do all the work (to simplify typechecking and intermediate code generation).. so I have an ancestor class ASTNode and all descending classes like ASTFloat, ASTExpression, ASTIdentifier and so on..

In .y file I'm able to build up the AST in the common way:

nexp:
T_LPAR nexp T_RPAR { $$ = $2; }
| nexp OP_PLUS nexp { $$ = new ASTBExpression('+', (ASTExpression*)$1, (ASTExpression*)$3); }
| nexp OP_MINUS nexp { $$ = new ASTBExpression('-', (ASTExpression*)$1, (ASTExpression*)$3); }
| nexp OP_TIMES nexp { $$ = new ASTBExpression('*', (ASTExpression*)$1, (ASTExpression*)$3); }

and it works quite fine but then I tried to generate statements of a scope (for example the body of an if statement) in this way: I have used a class ASTStatements which has a list of ASTNode* that must be filled by parser with each statement encountered.

So the approach would be something similar to this:

statements:
statement { if ($$ == null) $$ = new ASTStatements(); ((ASTStatements*)$$)->addStatement($1); } statements { $$->generateASM(); }
;

The problem is that the item should be initialized just once per block of statements but I don't know how to do it. Using if ($$ == null) is a hack I tried but it doesn't work because yylval can contain whatever up to that point.

Which is the normal/best way to handle this kind of situations using Bison?

A: 

Try an augmented grammar like the following:

statements: statement { $$ = new ASTStatements();
                       ((ASTStatements*)$$)->addStatement($1); }      
 | statements statement { ((ASTStatements*)$$)->addStatement($2); }

Not sure if this will help.

Sean A.O. Harney
+1  A: 

There are various reasons to prefer left-recursive rules for yacc, for one thing you can then reduce as early as possible in the input.

In any case, when you do that, you can then use a pattern like this:

statements:                { $$ = new ... }
    | statements statement { /* now $1 and $2 do just what you want */ }
    ;
DigitalRoss
A: 

I solved this issue by generating not a list of statements but a degenerated tree. So the class object involved is:

ASTStatements
{
    ASTStatements *m_next;
    ASTStatement *m_statement;

    ....

    public:
        ASTStatements(ASTStatement *statement) // used for last one
        ASTStatements(ASTStatement *stat, ASTStatements *next) // used with a next one
}

using rules in .y in the following way:

statements: /* empty */ { $$ = null; }
| statements statement { if ($1 == null) $$ = new ASTStatements($2); else $$ = new ASTStatements($2, (ASTStatements*)$1); }

And indeed this is left-recursive, allowing statements to be reduced as soon as possible without cluttering the stack. I followed the same approach for any other kind of "list of symbols" involved in my language..

Jack