Hello,

I am having trouble with a very simple yacc/lex program. I may have forgotten some basic steps (it has been a long time since I last used these tools).

In my lex program I define some basic tokens like:

word    [a-zA-Z][a-zA-Z]*
%%
":"    return(PV);
{word}  { 
            yylval = yytext;
            printf("yylval = %s\n",yylval);
            return(WORD);
       }
"\n"    return(ENDLINE);

In my yacc program, the beginning of my grammar is (where TranslationUnit is my %start):

TranslationUnit:
               /* Nothing */
              | InfoBlock Data
              ;

InfoBlock:
           /* Nothing */
         | InfoBlock InfoExpression {}
         ;

InfoExpression:
             WORD PV WORD ENDLINE {
                                    printf("$1 = %s\n",$1);
                                    printf("$2 = %s\n",$2);
                                    printf("$3 = %s\n",$3);
                                    printf("$4 = %s\n",$4);
                                  }
            | ... /* other things */
            ;

Data:
    ... /* other things */

When I run my program with the input:

keyword : value

I thought that I would get at least:

$1 = keyword
$2 = keyword // yylval not changed for token PV
$3 = value
$4 = value // yylval not changed for token ENDLINE

Actually, I get:

$1 = keyword : value
$2 = keyword : value
$3 = value
$4 = value

I do not understand this result. I studied grammars some time ago, and even though I don't remember everything right now, I do not see any obvious mistake...

Thanks in advance for your help.

+3  A: 

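The trouble is that yytext points into the lexer's own buffer, so unless you save a copy of the token, Lex/Yacc goes on to overwrite that space, or point to a different space, as it reads the following tokens. Your lexer stores only the pointer in yylval, so by the time the grammar action runs, every saved value refers to storage that has since changed. You need to stash the information that's crucial to you (for example, by copying the string) before it gets modified. Your printing in the Lex code should have shown you that the yylval values were accurate at the point when the lexer (lexical analyzer) was called.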
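As a rough illustration of the point above, here is a small C model of a scanner that hands out pointers into its own buffer and only NUL-terminates the current token temporarily. It is a simplified sketch, not Lex's actual implementation: ToyScanner, toy_init, toy_next and copy_token are hypothetical names invented for this example. It shows that a saved pointer ends up reading "keyword : value" once later tokens have been scanned, while a copy made at scan time keeps "keyword".

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy scanner: tokens are returned as pointers into buf. */
typedef struct {
    char  *buf;      /* input buffer, modified in place */
    size_t pos;      /* where scanning resumes */
    size_t hold_at;  /* index of the temporary NUL terminator */
    char   hold;     /* character that the NUL replaced */
    int    holding;  /* non-zero while a terminator is in place */
} ToyScanner;

void toy_init(ToyScanner *s, char *input)
{
    s->buf = input;
    s->pos = 0;
    s->hold_at = 0;
    s->hold = '\0';
    s->holding = 0;
}

/* Return the next token (a word or a single character), or NULL at end.
   The returned pointer is only reliable until the next call. */
char *toy_next(ToyScanner *s)
{
    if (s->holding) {                      /* undo the previous terminator */
        s->buf[s->hold_at] = s->hold;
        s->holding = 0;
    }
    while (s->buf[s->pos] == ' ' || s->buf[s->pos] == '\t')
        s->pos++;                          /* skip blanks */
    if (s->buf[s->pos] == '\0')
        return NULL;

    size_t start = s->pos;
    if (isalpha((unsigned char)s->buf[s->pos])) {
        while (isalpha((unsigned char)s->buf[s->pos]))
            s->pos++;                      /* consume a whole word */
    } else {
        s->pos++;                          /* single-character token */
    }

    s->hold_at = s->pos;                   /* terminate the token in place */
    s->hold = s->buf[s->pos];
    s->buf[s->pos] = '\0';
    s->holding = 1;
    return s->buf + start;
}

/* Make a private copy of the token text; the caller must free() it. */
char *copy_token(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, text, n);
    return copy;
}
```

In the question's lexer, the corresponding change would be to assign a copy of yytext to yylval rather than yytext itself (for instance with strdup where it is available, assuming YYSTYPE is char *), and to free that copy in the grammar once it has been used.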

See also SO 2696470 where the same basic problem was encountered and diagnosed.

Jonathan Leffler
Thank you very much for your answer. Indeed, I got the right results with the printing in the Lex code. To solve my problem, I tried changing the code by adding a buffer:

InfoExpression:
             WORD { memset(buffer,'\0',BUFFER_LENGTH); sprintf(buffer,"%s",$1); }
             PV WORD { sprintf(buffer,"%s='%s'",buffer,$1); }
             ENDLINE { printf("%s\n",buffer); }
            ;

But with this modification I still get:

keyword='keyword : value'
Elenaher
(comment continued) So I have used the following solution (maybe cleaner, but I don't know...):

InfoExpression:
             InfoLValue PV InfoRValue ENDLINE { printf("%s\n",buffer); }
            ;

InfoLValue:
             WORD { memset(buffer,'\0',BUFFER_LENGTH); sprintf(buffer,"%s",$1); }
            ;

InfoRValue:
             WORD { sprintf(buffer,"%s='%s'",buffer,$1); }
            ;
Elenaher