I'm writing a C parser using PLY, and recently ran into a problem. This code:

    typedef int my_type;
    my_type x;

Is correct C code, because my_type is defined as a type before it is used as one. I handle this by filling a type symbol table in the parser, which the lexer consults to differentiate between type names and plain identifiers.

However, although the type declaration rule ends with SEMI (the ';' token), PLY shifts the token my_type from the second line before deciding it's done with the first one. Because of this, I get no chance to pass the type-table update to the lexer, and it sees my_type as an identifier rather than a type.
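To make the timing concrete, here is a minimal, PLY-free sketch of the feedback scheme (hypothetical names, not pycparser's actual code): a tokenizer that consults a type table, driven with one token of lookahead the way an LALR parser such as PLY's is. The lookahead token is classified before the parser gets to run the typedef action:

```python
# PLY-free sketch (hypothetical, not pycparser's code): a tokenizer that
# consults a type table, driven with one token of lookahead.
type_table = set()          # names the parser registers on 'typedef ... ;'

def classify(name):
    # The lexer's decision point: TYPEID if the name is a known typedef.
    return "TYPEID" if name in type_table else "ID"

words = ["typedef", "int", "my_type", ";", "my_type", "x", ";"]

stream = []
lookahead = None
for word in words:
    # The right-hand side runs first: the next token is classified BEFORE
    # the parser acts on the current one.
    current, lookahead = lookahead, (classify(word), word) if word.isidentifier() else (word, word)
    if current is None:
        continue
    stream.append(current)
    if current[0] == ";":
        # The typedef declaration only reduces here -- too late: the
        # lookahead ('my_type' from line 2) was already lexed as an ID.
        type_table.add("my_type")   # hard-coded for the sketch
stream.append(lookahead)

print(stream[4])  # ('ID', 'my_type') -- classified before the table update
```

The second occurrence of my_type comes out as a plain ID even though the table does get updated, which is exactly the symptom described above.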

Any ideas for a fix?

The full code is at http://code.google.com/p/pycparser/source/browse/trunk/src/c_parser.py. I'm not sure how to create a smaller example from this.

Edit:

Problem solved. See my solution below.

A: 

I think you need to move the check for whether an ID is a TYPEID from c_lexer.py to c_parser.py.

As you said, since the parser is looking ahead 1 token, you can't make that decision in the lexer.

Instead, alter your parser to check IDs in declarations to see if they are TYPEIDs, and, if they aren't, generate an error.
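A sketch of what that parser-side check could look like (hypothetical names and a standalone function rather than a real PLY p_* rule): the lexer emits every name as a plain ID, and the declaration action decides whether it names a known type.

```python
# Hypothetical sketch of the parser-side check: the lexer emits every name
# as a plain ID, and the declaration action decides whether it names a type.
known_types = {"int", "char", "my_type"}   # filled in by typedef actions

def resolve_specifier(token_type, value):
    # Promote an ID to TYPEID if it was registered as a type name;
    # otherwise raise the error the lexer can no longer detect.
    if token_type == "ID":
        if value not in known_types:
            raise SyntaxError(f"{value!r} does not name a type")
        return ("TYPEID", value)
    return (token_type, value)

print(resolve_specifier("ID", "my_type"))  # ('TYPEID', 'my_type')
```

The trade-off is that the grammar can no longer distinguish ID from TYPEID positionally, so ambiguous rules may need restructuring.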

As Pax Diablo said in his excellent answer, the lexer/tokenizer's job isn't to make those kinds of decisions about tokens. That's the parser's job.

Mike G.
+2  A: 

Not sure why you're doing that level of analysis in your lexer.

Lexical analysis should probably be used to separate the input stream into lexical tokens (numbers, line changes, keywords and so on). It's the parsing phase that should be doing that level of analysis, including table lookups for typedefs and such.

That's the way I've always separated duties between lex and yacc, my tools of choice (too old to change :-).

paxdiablo
I agree. Whenever I've had these types of problems it's usually because I'm trying to have the lexer do too much. But I use ANTLR now instead of LEX and YACC (**not** too old to change).
David G
+2  A: 

With some help from Dave Beazley (PLY's creator), my problem was solved.

The idea is to use special sub-rules and do the actions in them. In my case, I split the declaration rule to:

def p_decl_body(self, p):
    """ decl_body : declaration_specifiers init_declarator_list_opt
    """
    # <<Handle the declaration here>>        

def p_declaration(self, p):
    """ declaration : decl_body SEMI 
    """
    p[0] = p[1]

decl_body is always reduced before the token following SEMI is shifted in, so my action gets executed at the correct time.
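A PLY-free simulation of why the split works (a hypothetical driver, not PLY itself): because decl_body reduces as soon as ';' becomes the lookahead, the type table is updated before any token after the ';' is classified.

```python
# PLY-free simulation of the split rule (hypothetical driver, not PLY):
# decl_body reduces while ';' is still the lookahead, so the type table
# is updated before any token AFTER the ';' is classified.
type_table = set()

def classify(name):
    return "TYPEID" if name in type_table else "ID"

words = ["typedef", "int", "my_type", ";", "my_type", "x", ";"]

stream = []
in_typedef = False
last_name = None
for word in words:
    if word == ";":
        if in_typedef and last_name:
            # The p_decl_body action runs here, with ';' as the lookahead --
            # one token earlier than the old 'declaration : ... SEMI' rule.
            type_table.add(last_name)
        in_typedef = False
        stream.append((";", ";"))
        continue
    stream.append((classify(word), word))
    if word == "typedef":
        in_typedef = True
    else:
        last_name = word

print(stream[4])  # ('TYPEID', 'my_type') -- line 2 now lexes correctly
```

Compared with the original grammar, the only change is where the reduction boundary falls; the token stream after the fix classifies the second my_type as TYPEID.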

Eli Bendersky