I want to write a rule in flex to consume a C-style comment like /* */.

I have the following:

c_comment "/*"[\n.]*"*/"

But it never gets matched. Any idea why? If you need more of my code, please let me know and I'll post the whole thing. Thanks to anyone who replies.

+1  A: 

I suggest you use start conditions instead.

%x C_COMMENT

%%

"/*"            { BEGIN(C_COMMENT); }
<C_COMMENT>"*/" { BEGIN(INITIAL); }
<C_COMMENT>\n   { }
<C_COMMENT>.    { }

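%x C_COMMENT defines the C_COMMENT state, and the "/*" rule enters it. Once the lexer is in that state, "*/" sends it back to the initial state (INITIAL is predefined), and every other character is simply consumed without any action. Note that . does not match a newline in flex, so a comment that spans several lines needs a rule for \n in the C_COMMENT state as well.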

The %x definition makes C_COMMENT an exclusive state, which means the lexer will only match rules that are "tagged" <C_COMMENT> once it enters the state.
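
If start conditions feel abstract, here is a rough plain-C sketch of the two-state behaviour those rules describe. It is not the code flex generates, just an illustration: the function name strip_c_comments and the ST_ enum names are made up for this example, and it assumes the whole input is already in a NUL-terminated string.

```c
#include <stddef.h>

/* Hypothetical names: the two states mirror INITIAL and C_COMMENT. */
enum state { ST_INITIAL, ST_C_COMMENT };

/*
 * Copies src to dst with C-style comments removed.
 * dst must have room for at least strlen(src) + 1 bytes.
 * Returns 0 on success, -1 if the input ends inside a comment.
 */
int strip_c_comments(const char *src, char *dst)
{
    enum state st = ST_INITIAL;
    size_t i = 0, o = 0;

    while (src[i] != '\0') {
        if (st == ST_INITIAL) {
            if (src[i] == '/' && src[i + 1] == '*') {
                st = ST_C_COMMENT;    /* plays the role of BEGIN(C_COMMENT) */
                i += 2;
            } else {
                dst[o++] = src[i++];  /* ordinary text passes through */
            }
        } else {
            /* Exclusive state: only the comment rules apply here. */
            if (src[i] == '*' && src[i + 1] == '/') {
                st = ST_INITIAL;      /* plays the role of BEGIN(INITIAL) */
                i += 2;
            } else {
                i++;                  /* anything else, newlines included, is dropped */
            }
        }
    }
    dst[o] = '\0';
    return st == ST_INITIAL ? 0 : -1;
}
```

For example, calling it on "a /* x */ b" leaves "a  b" in dst (the spaces on both sides of the comment survive) and returns 0, while input that ends inside an unterminated comment returns -1.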

zneak
I'd also like to say, on a related note, that there __must not__ be any space between the `<condition>` and the rule following it.
zneak
Thanks for the help; that is what I did, and it worked.
Silmaril89
A: 

Not sure why it's not being picked up, but I do know that a pattern of that sort can produce very large lexical elements, since the whole comment ends up as a single match. It's more efficient to detect just the start-of-comment marker and toss everything in the bit bucket until you find the end marker.

This site has code which will do that:

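"/*" {
    int c; /* int rather than char, so EOF can be told apart from real characters */

    for (;;) {
        while ((c = input()) != '*' && c != EOF)
            ; /* eat up text of comment */
        if (c == '*') {
            while ((c = input()) == '*')
                ; /* skip a run of stars */
            if (c == '/')
                break; /* found the end */
        }
        if (c == EOF) {
            error("EOF in comment"); /* error() is assumed to be defined elsewhere */
            break;
        }
    }
}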
paxdiablo
I'm not sure it's really good to consume input that way. =/ Isn't that a mix of concerns?
zneak
I usually tend towards pragmatism rather than dogmatism :-)
paxdiablo
I see only one concern here, and that is eating up the comment so you can proceed with lexing real tokens. However, you could argue that this example is not taking advantage of the abstraction mechanisms that flex offers to make what you're doing clearer.
Nate C-K
paxdiablo
@paxdiablo: IMO this is as good a way as any to solve this particular problem. C-style comments can't be expressed very cleanly in the lex/flex framework so you might as well just write some code to handle it, as you've done. This has the advantage of not requiring lex states, which I feel make a grammar harder to follow. My comment was more in response to zneak's: as long as the code here is strictly doing lexical analysis (which it is), I feel it is in the right place and does not present a problem regarding separation of concerns.
Nate C-K