I've been given the following yacc file. How do I make a parser out of it? Do I have to make a scanner first?

/* C-Minus BNF Grammar */

%token ELSE
%token IF
%token INT
%token RETURN
%token VOID
%token WHILE

%token ID
%token NUM

%token LTE
%token GTE
%token EQUAL
%token NOTEQUAL
%%

program : declaration_list ;

declaration_list : declaration_list declaration | declaration ;

declaration : var_declaration | fun_declaration ;

var_declaration : type_specifier ID ';'
                | type_specifier ID '[' NUM ']' ';' ;

type_specifier : INT | VOID ;

fun_declaration : type_specifier ID '(' params ')' compound_stmt ;

params : param_list | VOID ;

param_list : param_list ',' param
           | param ;

param : type_specifier ID | type_specifier ID '[' ']' ;

compound_stmt : '{' local_declarations statement_list '}' ;

local_declarations : local_declarations var_declaration
                   | /* empty */ ;

statement_list : statement_list statement
               | /* empty */ ;

statement : expression_stmt
          | compound_stmt
          | selection_stmt
          | iteration_stmt
          | return_stmt ;

expression_stmt : expression ';'
                | ';' ;

selection_stmt : IF '(' expression ')' statement
               | IF '(' expression ')' statement ELSE statement ;

iteration_stmt : WHILE '(' expression ')' statement ;

return_stmt : RETURN ';' | RETURN expression ';' ;

expression : var '=' expression | simple_expression ;

var : ID | ID '[' expression ']' ;

simple_expression : additive_expression relop additive_expression
                  | additive_expression ;

relop : LTE | '<' | '>' | GTE | EQUAL | NOTEQUAL ;

additive_expression : additive_expression addop term | term ;

addop : '+' | '-' ;

term : term mulop factor | factor ;

mulop : '*' | '/' ;

factor : '(' expression ')' | var | call | NUM ;

call : ID '(' args ')' ;

args : arg_list | /* empty */ ;

arg_list : arg_list ',' expression | expression ;
+2  A: 

You'll need a scanner indeed (also called a tokenizer or lexer). That is typically done with lex if you are using yacc, or with flex (for the .l file) if you are using bison (for the .y file).

The scanner needs to return every token that your yacc file uses: the names declared with %token directives as well as the single-character literals written in single quotes (';', '(', '+' and so on).

To get you started, put the following in a .l file and run flex on it. Have a look at the generated files to see what was produced, then compile the generated .c files.

%{
#include "y.tab.h"
// other stuff for your program
%}

%%

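else return ELSE;  /* these correspond to the %token's in your yacc file */
if   return IF;
int  return INT;
  /* other ones at the top of your yacc file (keep this comment indented:
     in column 1 the scanner generator would read it as a pattern) */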

%%

// any c-code helper functions to do fancy scanning.

This is a pretty simple example. Real scanners almost always get more complicated, but it should get you started.
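To make the scanner's job concrete, here is a minimal hand-written sketch in C of what a scanner for this grammar has to hand back. It is not what lex or flex generates, just an illustration of the contract: named tokens return their %token code, single-quoted literals return the character itself, and end of input returns 0. The enum values, the helper keyword_or_id() and the function scan_token() are all assumptions made up for this example. The real codes come from y.tab.h, and a real yylex() takes no arguments and reads from its input stream. It also assumes identifiers consist of letters only.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Stand-ins for the codes yacc writes to y.tab.h. The real values come from
   that generated header, so these numbers are assumptions for illustration. */
enum { ELSE = 258, IF, INT, RETURN, VOID, WHILE, ID, NUM,
       LTE, GTE, EQUAL, NOTEQUAL };

/* Hypothetical helper: map a word to its keyword code, or ID otherwise. */
static int keyword_or_id(const char *s, size_t n)
{
    static const struct { const char *text; int code; } kw[] = {
        {"else", ELSE}, {"if", IF}, {"int", INT},
        {"return", RETURN}, {"void", VOID}, {"while", WHILE},
    };
    for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++)
        if (strlen(kw[i].text) == n && strncmp(kw[i].text, s, n) == 0)
            return kw[i].code;
    return ID;
}

/* Hypothetical scan_token(): returns the next token code found at *cursor
   and advances *cursor past it. */
int scan_token(const char **cursor)
{
    const char *p = *cursor;
    assert(p != NULL);

    while (isspace((unsigned char)*p))   /* skip whitespace */
        p++;

    if (*p == '\0') {                    /* end of input: the parser expects 0 */
        *cursor = p;
        return 0;
    }

    if (isalpha((unsigned char)*p)) {    /* keyword or identifier */
        const char *start = p;
        while (isalpha((unsigned char)*p))
            p++;
        *cursor = p;
        return keyword_or_id(start, (size_t)(p - start));
    }

    if (isdigit((unsigned char)*p)) {    /* number */
        while (isdigit((unsigned char)*p))
            p++;
        *cursor = p;
        return NUM;
    }

    if (p[1] == '=') {                   /* two-character %token operators */
        int code = 0;
        switch (*p) {
        case '<': code = LTE;      break;
        case '>': code = GTE;      break;
        case '=': code = EQUAL;    break;
        case '!': code = NOTEQUAL; break;
        }
        if (code) {
            *cursor = p + 2;
            return code;
        }
    }

    /* Everything else (';', '(', '+', ...) is returned as its own character
       code, matching the single-quoted literals in the grammar. */
    *cursor = p + 1;
    return (unsigned char)*p;
}
```

Calling scan_token() repeatedly on the text "if (x <= 10) return y;" yields IF, '(', ID, LTE, NUM, ')', RETURN, ID, ';' and finally 0. A lex or flex specification produces the same kind of stream from a handful of pattern rules, which is why it is the usual choice.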

For a tutorial, see http://dinosaur.compilertools.net/

Have fun!

tim
Where does y.tab.h come from? Is it automatically generated by something?
Phenom
Yep, it is generated by yacc when you run it with the -d option.
tim