I'm trying to use ANTLR for some IDE-like text functions: specifically, parsing a file to identify the points for code folding and to apply syntax highlighting.
First question: is ANTLR suitable for this requirement, or is it overkill? This could be achieved using regex and/or a hand-rolled parser, but it seems that ANTLR is out there to do this work for me.
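For comparison, the hand-rolled route for folding can be sketched as plain brace matching. This is only a sketch under assumptions: `BraceFolds` and `find` are hypothetical names, nothing here uses ANTLR, and braces inside string literals or comments would be miscounted, which is part of why a real lexer looks attractive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of ANTLR: finds foldable regions by matching braces.
class BraceFolds {

    /** Returns {startLine, endLine} pairs (1-based) for each brace pair spanning more than one line. */
    static List<int[]> find(String source) {
        List<int[]> folds = new ArrayList<int[]>();
        Deque<Integer> openLines = new ArrayDeque<Integer>(); // lines of still-unmatched '{'
        int line = 1;
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            if (ch == '\n') {
                line++;
            } else if (ch == '{') {
                openLines.push(line);
            } else if (ch == '}' && !openLines.isEmpty()) {
                int start = openLines.pop();
                if (line > start) {
                    // Only multi-line blocks are worth folding.
                    folds.add(new int[] { start, line });
                }
            }
        }
        return folds;
    }

    public static void main(String[] args) {
        String source = "package com.example\n"
                + "public class Foo {\n"
                + "String myString = \"Hello World\"\n"
                + "// etc\n"
                + "}\n";
        for (int[] fold : find(source)) {
            System.out.println("fold lines " + fold[0] + "-" + fold[1]);
        }
    }
}
```

On the small sample embedded in `main`, this reports a single fold from line 2 to line 5. Nested blocks fall out of the stack naturally, but syntax highlighting would still need real token types, which this approach does not give.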
I've had a look through the ... and the excellent tutorial resource here.
I've managed to get a Java grammar built (using the standard grammar), and get everything parsed neatly into a tree. However, I'd have expected to see elements nested within the tree. In actual fact, everything is a child of the very top element.
For example, given:
package com.example
public class Foo {
    String myString = "Hello World"
    // etc
}
I'd have expected the tree node for Foo to be a child of the node for the package declaration. Likewise, myString would be a child of Foo.
Instead, I'm finding that Foo and myString (and everything else, for that matter) are all children of package.
Here's the relevant excerpt doing the parsing:
public void init() throws Exception {
    CharStream c = new ANTLRFileStream(
            "src/com/inversion/parser/antlr/Test.code");
    Lexer lexer = new JavaLexer(c);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    JavaParser parser = new JavaParser(tokens);
    parser.setTreeAdaptor(adaptor);
    compilationUnit_return result = parser.compilationUnit();
}

static final TreeAdaptor adaptor = new CommonTreeAdaptor() {
    public Object create(Token payload) {
        if (payload != null) {
            // Log each token as its tree node is created.
            System.out.println("Create " + JavaParser.tokenNames[payload.getType()]
                    + ": L" + payload.getLine()
                    + ":C" + payload.getCharPositionInLine()
                    + " " + payload.getText());
        }
        return new CommonTree(payload);
    }
};
Examining result.getTree() returns a CommonTree instance, whose children are the result of the parsing.
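To see how deeply the tree is actually nested, an indented dump is handy. The following is a minimal sketch, not ANTLR code: `Node` and `TreeDump` are hypothetical stand-ins so the example runs on its own. I'd assume the same recursion carries over to the ANTLR 3 `Tree` interface via `getChildCount()`, `getChild(int)` and `getText()`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parse-tree node; ANTLR's own classes are not used here.
class Node {
    final String text;
    final List<Node> children = new ArrayList<Node>();

    Node(String text) {
        this.text = text;
    }

    // Appends a child and returns this node so calls can be chained.
    Node add(Node child) {
        children.add(child);
        return this;
    }
}

class TreeDump {

    /** Renders one node per line, indented two spaces per level of nesting. */
    static String dump(Node node) {
        StringBuilder out = new StringBuilder();
        dump(node, 0, out);
        return out.toString();
    }

    private static void dump(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.text).append('\n');
        for (Node child : node.children) {
            dump(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Flat shape, as I am seeing: every token hangs directly off the root.
        Node flat = new Node("root")
                .add(new Node("package"))
                .add(new Node("class"))
                .add(new Node("Foo"));
        // Nested shape, as I expected.
        Node nested = new Node("package")
                .add(new Node("class Foo").add(new Node("myString")));
        System.out.print(dump(flat));
        System.out.print(dump(nested));
    }
}
```

In the flat case every line below the root sits at the same indentation, which matches what I get; in the nested case each level steps in by two spaces.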
Expected value (perhaps incorrectly)
package com.example (4 tokens)
|
+-- public class Foo (3 tokens)
    |
    +--- String myString = "Hello World" (4 tokens)
    +--- Comment "// etc"
(or something similar)
Actual value (all values are children of the root node of result.getTree())
package
com
.
example
public
class
Foo
String
myString
=
"Hello World"
Is my understanding of how this should be working correct?
I'm a complete noob at ANTLR so far, and I'm finding the learning curve quite steep.