ansaurus

Question

Answer 1

AFAIK, the error occurs because nestingBlockCommentCharacters can itself match +/ (the ~'/' twice).

Personally, I'd keep the nestingBlockComment as a lexer rule instead of a parser rule. You can do that by adding a little helper method in the lexer class:

public boolean openOrCloseCommentAhead() {
  // return true iff '/+' or '+/' is ahead in the character stream
  return (input.LA(1) == '+' && input.LA(2) == '/') ||
         (input.LA(1) == '/' && input.LA(2) == '+');
}

and then, in the lexer's comment rule, use a gated semantic predicate with that helper method as the boolean expression inside the predicate:

// match nested comments
Comment
  :  '/+' (Comment | {!openOrCloseCommentAhead()}?=> Any)* '+/'
  ;

// match any character
Any
  :  .
  ;
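
To make the effect of the gated predicate easier to see, here is a rough plain-Java sketch of the decisions the Comment rule makes at each position. It is not what ANTLR generates: the class NestedCommentSketch and the methods openOrCloseAhead, matchComment and comments are hypothetical names made up for this illustration, and it works on a String instead of ANTLR's character stream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not ANTLR-generated code.
public class NestedCommentSketch {

    // Same check as openOrCloseCommentAhead(), but on a String index.
    static boolean openOrCloseAhead(String s, int i) {
        return s.startsWith("/+", i) || s.startsWith("+/", i);
    }

    // Mirrors: '/+' (Comment | {!openOrCloseCommentAhead()}?=> Any)* '+/'
    // Returns the index just past the closing "+/", or -1 if no
    // complete comment starts at 'start'.
    static int matchComment(String s, int start) {
        if (!s.startsWith("/+", start)) {
            return -1;
        }
        int i = start + 2;
        while (i < s.length()) {
            if (s.startsWith("/+", i)) {
                // first alternative: a nested Comment
                int end = matchComment(s, i);
                if (end < 0) {
                    return -1; // nested comment was never closed
                }
                i = end;
            } else if (!openOrCloseAhead(s, i)) {
                // second alternative: Any, allowed only when the gate holds
                i++;
            } else {
                // "+/" is ahead: leave the loop and match the closer
                return i + 2;
            }
        }
        return -1; // input ended before the closing "+/"
    }

    // Collects every top-level comment; all other characters are skipped.
    static List<String> comments(String s) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int end = matchComment(s, i);
            if (end >= 0) {
                out.add(s.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "foo /+ comment /+ and +/ comment +/ bar /+ comment +/ baz";
        for (String c : comments(src)) {
            System.out.println("comment :: " + c);
        }
    }
}
```

Note that an unterminated "/+" is simply skipped here, one character at a time; a real ANTLR lexer would more likely report an error at that point.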

A little demo-grammar:

grammar DComments;

@lexer::members {
  public boolean openOrCloseCommentAhead() {
    return (input.LA(1) == '+' && input.LA(2) == '/') ||
           (input.LA(1) == '/' && input.LA(2) == '+');
  }
}

parse
  :  token+ EOF
  ;

token
  :  Comment {System.out.println("comment :: "+$Comment.text);}
  |  Any
  ;

Comment
  :  '/+' (Comment | {!openOrCloseCommentAhead()}?=> Any)* '+/'
  ;

Any
  :  .
  ;

and a main class to test it:

import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream(
            "foo /+ comment /+ and +/ comment +/ bar /+ comment +/ baz");
        DCommentsLexer lexer = new DCommentsLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        DCommentsParser parser = new DCommentsParser(tokens);
        parser.parse();
    }
}

Then the following commands:

java -cp antlr-3.2.jar org.antlr.Tool DComments.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar Main

(for Windows, the last command is: java -cp .;antlr-3.2.jar Main)

produce the following output:

comment :: /+ comment /+ and +/ comment +/
comment :: /+ comment +/
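
If you want to double-check that output without ANTLR on the classpath, a small depth counter should give the same two lines for this input. This is only a sanity-check sketch, not a replacement for the lexer rule; DepthCounterCheck and topLevelComments are hypothetical names, and unclosed comments are silently dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cross-check; independent of the ANTLR-generated classes.
public class DepthCounterCheck {

    // Returns the text of every outermost /+ ... +/ comment in 's'.
    static List<String> topLevelComments(String s) {
        List<String> out = new ArrayList<>();
        int depth = 0;   // current nesting level
        int start = -1;  // index where the outermost comment began
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("/+", i)) {
                if (depth == 0) {
                    start = i;
                }
                depth++;
                i += 2;
            } else if (depth > 0 && s.startsWith("+/", i)) {
                depth--;
                i += 2;
                if (depth == 0) {
                    out.add(s.substring(start, i));
                }
            } else {
                i++; // ordinary character, inside or outside a comment
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String src = "foo /+ comment /+ and +/ comment +/ bar /+ comment +/ baz";
        for (String c : topLevelComments(src)) {
            System.out.println("comment :: " + c);
        }
    }
}
```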

Bart Kiers 2010-07-18 20:10:21


An antlr problem with embedded comments
