I am trying to write a parser rule that allows zero or more occurrences of a token before a second rule. In the AST, each successive token matched by the closure should be a child of the previous token, and the result of the second rule should be a child of the last token.
It is easier to explain by example:
expression11 : ((NOT | COMPLEMENT)^)* expression12;
Given the above parser rule and the expression !!x (where x is an ID), I want the x in my AST to be the child of the second bang operator, which in turn is the child of the first.
Desired:
!
 \ child
  !
   \ child
    x
Instead, the above rule produces an AST in which the second bang operator is a child of the first, but the x is also a child of the first bang operator, making it a sibling of the second one. That is obviously not what I want for a unary operator.
Encountered behavior:
         !
 child /   \ child
      x -sib- !
If I add a third operator (as in "!!!x") the third one becomes a child of the second, as expected, and x remains a child of the first, sibling of the second.
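To make the two shapes concrete, here is a small Java model of how I understand the trees to be built. It is not ANTLR-generated code. It assumes that each ^ makes its token the new root of the rule's tree with the previous root as its first child, and that an unmarked element is added as a child of whatever the current root is. The class and method names are made up for illustration, and ~ is only an assumed spelling of COMPLEMENT.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the tree shapes; none of these names come from ANTLR.
class RootOperatorModel {
    /** Minimal AST node: a label plus ordered children. */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) { this.text = text; }

        /** LISP-style rendering, e.g. (! ! x). */
        @Override
        public String toString() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    // '~' is an assumed spelling of the COMPLEMENT token.
    static boolean isOp(char c) { return c == '!' || c == '~'; }

    /** Models ((NOT | COMPLEMENT)^)* expression12 as a flat loop (assumed semantics). */
    static Node flat(String input) {
        Node root = null;
        int i = 0;
        while (i < input.length() && isOp(input.charAt(i))) {
            Node op = new Node(String.valueOf(input.charAt(i)));
            if (root != null) op.children.add(root); // old root becomes a child of the new root
            root = op;
            i++;
        }
        Node operand = new Node(input.substring(i));
        if (root == null) return operand;   // no operators: the operand is the whole tree
        root.children.add(operand);         // operand attaches to the rule's current root
        return root;
    }

    /** Models the desired nesting: each operator's only child is the rest of the expression. */
    static Node nested(String input) {
        if (!input.isEmpty() && isOp(input.charAt(0))) {
            Node op = new Node(String.valueOf(input.charAt(0)));
            op.children.add(nested(input.substring(1)));
            return op;
        }
        return new Node(input);
    }

    public static void main(String[] args) {
        System.out.println(flat("!!x"));    // shape I am getting
        System.out.println(nested("!!x"));  // shape I want
    }
}
```

In this model flat("!!x") gives (! ! x) and flat("!!!x") gives (! (! !) x), which matches the shapes I am seeing, while nested("!!x") gives (! (! x)), which is what I want. Under the stated assumption the operator on top would be the one read last rather than the first, but since both print as ! I cannot tell them apart in the output.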
I thought perhaps I could fix this by surrounding the entire operator part with parentheses and adding another caret, like so:
expression11 : (((NOT | COMPLEMENT)^)*)^ expression12;
This was an effort to force expression12 to be a child of the entire closure of operators, in the vain hope that "the child of the entire closure" would be interpreted as "the child of the most-descended operator." That was not the case, and the change did not alter the behavior at all.
My question is "How do I get the parser to process the rule in such a way that the result of expression12 becomes the child of the most-descended 'NOT' or 'COMPLEMENT' node instead of the highest ancestor one?"
I would have thought this would be simple, but I cannot figure it out from the ANTLR resources on antlr.org or by pleading with Google. This must be done all the time. Or is there a different way to structure the rule entirely that I am overlooking?
For completeness, here are the rules that follow expression11. They are not finished yet and will be modified, but they are complete enough for testing and work as expected, since they are straightforward. 12 is for array length and method calls, 13 is for new classes and arrays, 14 is for array indexing, and 15 is for terminals and parentheses.
expression12 : expression13 (DOT (LENGTH | (ID LPAREN (expression (COMMA expression)*)? RPAREN)))?;
expression13 : expression14 | (NEW^ ((ID LPAREN RPAREN) | (INTTYPE LSQBRACK expression RSQBRACK)));
expression14 : expression15 (LSQBRACK expression RSQBRACK)*;
expression15 : (LPAREN expression RPAREN) | INTLIT | TRUE | FALSE | ID | THIS;
Thank you to anyone who can provide assistance; your time is much appreciated.