ansaurus

Question

Implement word boundary states in flex/lex (parser-generator)

Answer 1

A:

%%
[a-zA-Z]+a[a-zA-Z]* {printf("a in word: %s\n", yytext);}
a[a-zA-Z]+ {printf("a in word: %s\n", yytext);}
a {printf("a not in word\n");}
. ;

Testing:

user@cody /tmp $ ./a.out <<EOF
> a
> ba
> ab
> a
> EOF
a not in word

a in word: ba

a in word: ab

a not in word

2009-01-03 06:30:47

Answer 2

A:

Here's what accomplished what I wanted:

%{
#include <stdio.h>
%}

WC      [A-Za-z']
NW      [^A-Za-z']

%start      INW NIW

{WC}  { BEGIN INW; REJECT; }
{NW}  { BEGIN NIW; REJECT; }

<INW>a { printf("'a' in word\n"); }
<NIW>a { printf("'a' not in word\n"); }

This way I can do the equivalent of \B or \b at the beginning or end of any pattern. You can match at the end by doing a/{WC} or a/{NW}.

I wanted to set up the states without consuming any characters. The trick is using REJECT rather than yymore(), which I guess I didn't fully understand.

ʞɔıu 2009-01-05 15:39:58

ansaurus

tags:

views:

answers:

Implement word boundary states in flex/lex (parser-generator)

related questions