views:

283

answers:

1

I have a Lua grammar, (minor modifications to get it to output for C#, just namespace directives and a couple of option changes) and when I run it on some sample input, it gives me back a tree with a root "nil" node and as childs what looks to be a tokenized version of the input code. It looks like ANTLR's tree grammars operate on hierarchical trees and not "flat" ones, so I don't think I can use the output as-is.

Is there an easy fix for the grammar or does it need to be rewritten from the ground up?

+2  A: 

Assuming your tree is just a 1 dimensional list of nodes, here's how you can create parent/sibling hierarchy:

In ANTLR there are two operators for AST creation:

!     excludes the node (token) from the (sub)tree;
^     makes a node the root of a (sub)tree.

When no operator is provided, the nodes/tokens are added as children of the current root. This is probably what has happened to you: all you see is a one dimensional list of nodes/tokens.

An example:

grammar Exp;

options {output=AST;}

// ... some rules ...

addition
  :  Integer '+'^ Integer ';'!
  ;

Integer
  :  '0'
  |  '1'..'9' '0'..'9'*
  ;

The addition rule will create the following tree for the expression 6+9;:

   +
  / \
 /   \
6     9

As you can see: the + is the root (it has ^ after it), the numbers are tokens (they have no operator) and the semi colon is excluded (it has a ! after it).

For a detailed explanation, see chapter 7, Tree Construction, from The Definitive ANTLR Reference. I highly recommend you get a hold of a copy.

The question whether you should start from scratch is for you to decide. I'd just start with an empty grammar file and gradually add rules to it checking frequently to see if all works. Simply sprinkling some tree operators in an existing grammar may be pretty hard to debug: especially if you're not too familiar with ANTLR.

Best of luck!

Bart Kiers
Ok, that doesn't sound half as bad as i was afraid of. Thanks for the help!
RCIX