views:

100

answers:

3

Suppose I have white space (WS) on the hidden channel. For one particular rule only, I want white space to be considered as well. Is it possible to bring WS to the default channel for just that rule in the parser?

A: 

Have a look at the answer to your path question and notice how I put a '\n' into the parser rule. You should be able to put ' ' in as well. The only concern is whether every white space alternative that your WS rule sends to the hidden channel needs to appear in the rule.

e.g.

rulename : Token1 ' ' Token2 ' ' Token1 {place action here};

Note that the rule name starts with a lowercase letter, which makes it a parser rule, while the "Token#" names start with an uppercase letter and are lexer rules. In this example the rule requires a space between the tokens. I suppose you could use something like (' '|'\t'|'\r'|'\n')+ instead, but I have not tried this and will leave it for you to attempt.

WayneH
A: 

You can always query the hidden token stream.

For example, in C++:

myrule: MYTOK { static_cast<antlr::CommonHiddenStreamToken*>(LT(1).get())->getHiddenAfter()->getType() == WS}? MYTOK 

The semantic predicate checks whether a whitespace token follows once the lexical token MYTOK has been matched.
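To make the idea behind the predicate concrete, here is a minimal, self-contained sketch of the same check on a toy token list. It does not use the ANTLR runtime: `Token`, `TokenType`, `Channel` and `hiddenWhitespaceFollows` are hypothetical names invented for illustration, and the real runtime exposes hidden tokens differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical token model for illustration only; this is not the ANTLR runtime API.
enum class TokenType { MYTOK, WS };
enum class Channel { Default, Hidden };

struct Token {
    TokenType type;
    Channel channel;
};

// Returns true when the token at `index` is immediately followed by a
// whitespace token that the lexer routed to the hidden channel.
bool hiddenWhitespaceFollows(const std::vector<Token>& tokens, std::size_t index) {
    if (index + 1 >= tokens.size()) {
        return false;  // nothing follows the token at `index`
    }
    const Token& next = tokens[index + 1];
    return next.channel == Channel::Hidden && next.type == TokenType::WS;
}
```

A predicate built on a helper like this would accept `MYTOK MYTOK` only when the two tokens were separated by white space in the input, even though the parser itself never sees the WS token.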

chollida
A: 

Lexer rules are prioritized in the order they are listed in your grammar file: when two rules match the same text, the one listed first wins.

This means you can have something like this:

STRING_LITERAL: '"' NONCONTROL_CHAR* '"';   


fragment NONCONTROL_CHAR: LETTER | DIGIT | UNDERSCORE |  SPACE | BACKSLASH | MINUS | COMMA;
fragment LETTER: LOWER | UPPER;
fragment LOWER: 'a'..'z';
fragment UPPER: 'A'..'Z';
fragment DIGIT: '0'..'9';
fragment SPACE: ' ' | '\t';
fragment UNDERSCORE: '_';   
fragment MINUS:  '-';
fragment BACKSLASH: '\\';

COMMA: ',';     

NEWLINE: ('\r'? '\n')+ { $channel = HIDDEN; };
TERMINATOR  : ';';


WHITESPACE: SPACE+ { $channel = HIDDEN; };

LINE_COMMENT
    :   
    '//' ~('\n'|'\r')*  ('\r\n' | '\r' | '\n') 
    {
        $channel = HIDDEN;
    }
    |   
    '//' ~('\n'|'\r')*     
    {
        $channel = HIDDEN;
    }
    ;   

As you can see, a string literal can contain spaces or tabs, whereas a standalone space or tab is sent to the HIDDEN channel.
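The effect of that grammar can be imitated with a small hand-written C++ lexer. This is only a rough sketch, not what ANTLR generates: `Kind`, `Lexeme` and `tokenize` are hypothetical names, and for brevity the string literal accepts any character up to the closing quote rather than just the NONCONTROL_CHAR set.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified lexer for illustration; not generated by ANTLR.
enum class Kind { StringLiteral, Whitespace, Other };

struct Lexeme {
    Kind kind;
    std::string text;
    bool hidden;  // true when the token would go to the HIDDEN channel
};

std::vector<Lexeme> tokenize(const std::string& input) {
    std::vector<Lexeme> out;
    std::size_t i = 0;
    while (i < input.size()) {
        const char c = input[i];
        if (c == '"') {
            // STRING_LITERAL: consume through the closing quote, spaces included.
            std::size_t end = input.find('"', i + 1);
            if (end == std::string::npos) {
                end = input.size() - 1;  // unterminated literal: take the rest
            }
            out.push_back({Kind::StringLiteral, input.substr(i, end - i + 1), false});
            i = end + 1;
        } else if (c == ' ' || c == '\t') {
            // WHITESPACE: a run of spaces or tabs outside a literal is hidden.
            std::size_t end = i;
            while (end < input.size() && (input[end] == ' ' || input[end] == '\t')) {
                ++end;
            }
            out.push_back({Kind::Whitespace, input.substr(i, end - i), true});
            i = end;
        } else {
            // Any other character becomes a one-character visible token.
            out.push_back({Kind::Other, std::string(1, c), false});
            ++i;
        }
    }
    return out;
}
```

Tokenizing `"a b" ,` with this sketch yields a visible string literal that still contains its space, a hidden whitespace token, and a visible comma.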

Darien Ford
That's a good point, Darien, but I thought he wanted white space as part of a rule, not a lexical token, i.e. `rule: rule1 WS rule2`, not `TOKEN_WITH_WS`.
chollida