Hi there.

For a pet project I started to fiddle with ANTLR. After following some tutorials I'm now trying to create the grammar for my very own language and to generate an AST.

For now I'm mostly working in ANTLRWorks. Now that I have validated that the parse tree looks fine, I'd like to create the AST, iteratively, because I'm still learning and still need to make some decisions about the final structure of the tree. It seems that ANTLRWorks won't visualize it, or at least not with the "Interpreter" feature (Debug isn't working on any of my machines).

Bottom line: is the only way to visualize the AST the manual one, i.e. traversing it myself or printing its string representation to a console?

What I'm looking for is a simple way to go from input, grammar -> visual AST representation a la the "Interpreter" feature of ANTLRWorks. Any ideas?

A: 

You must change the target language to Java for the ANTLRWorks interpreter to work, or at least that's what I observed.

Lex Li
The interpreter works fine for me, but it shows the parse tree. Even with `options { output = AST; ... }` I get the very same tree: `Token^` and `Token!` aren't respected. Not what I want.
Benjamin Podszun
No, the interpreter in ANTLRWorks ignores the `language=XYZ` part of the `options { ... }` header from a grammar. At least, ANTLRWorks 1.3, 1.3.1 and 1.4 ignore it.
Bart Kiers
+3  A: 

Correct, the interpreter only shows what rules are used in the parsing process, and ignores any AST rewrite rules.

What you can do is use StringTemplate to create a Graphviz DOT file. Once you have such a DOT file, you can use a third-party viewer to display the tree (graph).
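For readers who haven't seen DOT before: the format is little more than a list of node declarations followed by a list of edges. The following is a minimal sketch of turning a tree into DOT text. It uses a hypothetical `Node` class and a hypothetical `DotSketch` helper, not ANTLR's tree types or its DOT generator, and it escapes only double quotes in labels.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an AST; ANTLR's own tree classes are not used here.
class DotSketch {

    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    // Emits one "nN [label=...]" line per node, then one "nA -> nB" line per edge.
    static String toDot(Node root) {
        StringBuilder nodes = new StringBuilder();
        StringBuilder edges = new StringBuilder();
        walk(root, new int[] {0}, nodes, edges);
        return "digraph {\n" + nodes + edges + "}\n";
    }

    // Pre-order walk that numbers nodes as it goes; returns the id of the visited node.
    private static int walk(Node n, int[] nextId, StringBuilder nodes, StringBuilder edges) {
        int id = nextId[0]++;
        nodes.append("  n").append(id)
             .append(" [label=\"").append(n.label.replace("\"", "\\\"")).append("\"];\n");
        for (Node child : n.children) {
            int childId = walk(child, nextId, nodes, edges);
            edges.append("  n").append(id).append(" -> n").append(childId).append(";\n");
        }
        return id;
    }

    public static void main(String[] args) {
        // Tree for the expression 5 - 6
        Node tree = new Node("-", new Node("5"), new Node("6"));
        System.out.print(toDot(tree));
    }
}
```

Any Graphviz-compatible viewer should accept text of this shape.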

Here's a quick demo in Java (I know little C#, sorry).

Take the following (overly simplistic) expression grammar that produces an AST:

grammar ASTDemo;

options { 
  output=AST; 
}

tokens {
  ROOT;
  EXPRESSION;
}

parse
  :  (expression ';')+ -> ^(ROOT expression+) // omit the semi-colon
  ;

expression
  :  addExp -> ^(EXPRESSION addExp)
  ;

addExp
  :  multExp
     ( '+'^ multExp
     | '-'^ multExp
     )*
  ;

multExp
  :  powerExp
     ( '*'^ powerExp
     | '/'^ powerExp
     )*
  ;

powerExp
  :  atom ('^'^ atom)*
  ;

atom
  :  Number
  |  '(' expression ')' -> expression // omit the parentheses
  ;

Number
  :  Digit+ ('.' Digit+)?
  ;

fragment
Digit
  :  '0'..'9'
  ;

Space
  :  (' ' | '\t' | '\r' | '\n') {skip();}
  ;

First let ANTLR generate lexer and parser files from it:

java -cp antlr-3.2.jar org.antlr.Tool ASTDemo.g 

then create a little test harness that parses the expressions "12 * (5 - 6); 2^3^(4 + 1);" and will output a DOT-file:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class MainASTDemo {
    public static void main(String[] args) throws Exception {
        // Lex and parse the sample input with the generated classes
        ANTLRStringStream in = new ANTLRStringStream("12 * (5 - 6); 2^3^(4 + 1);");
        ASTDemoLexer lexer = new ASTDemoLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        ASTDemoParser parser = new ASTDemoParser(tokens);

        // With output=AST, each rule returns an object that carries the tree
        ASTDemoParser.parse_return returnValue = parser.parse();
        CommonTree tree = (CommonTree)returnValue.getTree();

        // Turn the AST into a StringTemplate holding Graphviz DOT source
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);

        // Print the DOT source to stdout so it can be redirected to a file
        System.out.println(st);
    }
}

Compile all .java files:

# *nix & macOS
javac -cp .:antlr-3.2.jar *.java

REM Windows
javac -cp .;antlr-3.2.jar *.java

and then run the main class and pipe its output to a file named ast-tree.dot:

# *nix & macOS
java -cp .:antlr-3.2.jar MainASTDemo > ast-tree.dot

REM Windows
java -cp .;antlr-3.2.jar MainASTDemo > ast-tree.dot
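If shell redirection is inconvenient (when running from an IDE, for instance), the DOT text could also be written to the file from Java. A minimal sketch using only the standard library; the class name `DotFileWriter` is hypothetical, and the DOT string in `main` is a placeholder for what would be `st.toString()` in the test harness.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of ANTLR.
class DotFileWriter {

    // Writes the DOT source to the target path as UTF-8, replacing any existing file.
    static Path write(String dotSource, Path target) throws IOException {
        return Files.write(target, dotSource.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Placeholder DOT text; in MainASTDemo this would be st.toString().
        String dot = "digraph {\n  n0 [label=\"ROOT\"];\n}\n";
        Path out = write(dot, Paths.get("ast-tree.dot"));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```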

The file ast-tree.dot now contains:

digraph {

    ordering=out;
    ranksep=.4;
    bgcolor="lightgrey"; node [shape=box, fixedsize=false, fontsize=12, fontname="Helvetica-bold", fontcolor="blue"
        width=.25, height=.25, color="black", fillcolor="white", style="filled, solid, bold"];
    edge [arrowsize=.5, color="black", style="bold"]

  n0 [label="ROOT"];
  n1 [label="EXPRESSION"];
  n1 [label="EXPRESSION"];
  n2 [label="*"];
  n2 [label="*"];
  n3 [label="12"];
  n4 [label="EXPRESSION"];
  n4 [label="EXPRESSION"];
  n5 [label="-"];
  n5 [label="-"];
  n6 [label="5"];
  n7 [label="6"];
  n8 [label="EXPRESSION"];
  n8 [label="EXPRESSION"];
  n9 [label="^"];
  n9 [label="^"];
  n10 [label="^"];
  n10 [label="^"];
  n11 [label="2"];
  n12 [label="3"];
  n13 [label="EXPRESSION"];
  n13 [label="EXPRESSION"];
  n14 [label="+"];
  n14 [label="+"];
  n15 [label="4"];
  n16 [label="1"];

  n0 -> n1 // "ROOT" -> "EXPRESSION"
  n1 -> n2 // "EXPRESSION" -> "*"
  n2 -> n3 // "*" -> "12"
  n2 -> n4 // "*" -> "EXPRESSION"
  n4 -> n5 // "EXPRESSION" -> "-"
  n5 -> n6 // "-" -> "5"
  n5 -> n7 // "-" -> "6"
  n0 -> n8 // "ROOT" -> "EXPRESSION"
  n8 -> n9 // "EXPRESSION" -> "^"
  n9 -> n10 // "^" -> "^"
  n10 -> n11 // "^" -> "2"
  n10 -> n12 // "^" -> "3"
  n9 -> n13 // "^" -> "EXPRESSION"
  n13 -> n14 // "EXPRESSION" -> "+"
  n14 -> n15 // "+" -> "4"
  n14 -> n16 // "+" -> "1"

}

which can be viewed with one of the many viewers around. There are even online viewers. Take this one for example: http://graph.gafol.net/

When feeding it the contents of ast-tree.dot, the following image is produced:

(image: the AST rendered as a graph)
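As an alternative to an online viewer, the Graphviz command-line tool can render the file locally, e.g. `dot -Tpng ast-tree.dot -o ast-tree.png`. Below is a minimal sketch of calling it from Java. It assumes Graphviz is installed and the `dot` executable is on the PATH; the class name `GraphvizRender` is hypothetical.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes the Graphviz "dot" executable is on the PATH.
class GraphvizRender {

    // Builds the command line: dot -Tpng <dotFile> -o <pngFile>
    static List<String> command(String dotFile, String pngFile) {
        return Arrays.asList("dot", "-Tpng", dotFile, "-o", pngFile);
    }

    // Runs Graphviz and returns its exit code (0 means success).
    static int render(String dotFile, String pngFile) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command(dotFile, pngFile))
                .inheritIO() // show any Graphviz error messages on this console
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        int exit = render("ast-tree.dot", "ast-tree.png");
        System.out.println("dot exited with code " + exit);
    }
}
```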

Bart Kiers
Whoa, thanks for the detailed answer. +1 for that (it seems to do what I want). I'll give it a try during the day and will probably accept it afterwards. Sadly, a faster way (no recompilation for every change, with a "live" preview against a sample input/stream) doesn't seem to exist.
Benjamin Podszun
You're welcome @Benjamin. If you're using Eclipse, you might want to try the plugin `ANTLR IDE`: http://antlrv3ide.sourceforge.net/ I believe it has the option to create the image of an AST on the fly. But I have no personal experience with it.
Bart Kiers