views:

142

answers:

3

ANTLR users usually create a parser that generates the AST (abstract syntax tree) and a walker that walks the AST and produces the desired output. As we know, Java (or C++, Python, etc.) code has to be injected into the .g walker files to carry out the execution. However, when the target gets complicated, say a walker that processes the Java language, we have to pass a lot of context information between the walker's rules through scopes, parameters, or global variables, and this makes the walker ugly and hard to maintain.

So my question is: given a finished parser and walker (without any Java code yet), what is the common practice for organizing the Java (or other) code in the walker and the rest of the Java code?

+1  A: 

Many open source parser generators in Java are available. JavaCC is widely used, as discussed in this Wikipedia article. As a relatively simple example, it is used in SourceMeter, as modified to count Java constructors, fields and methods.

trashgod
@Bart: Thank you!
trashgod
No problem, it sure looked a bit awkward! :)
Bart Kiers
@trashgod, I know how to use JavaCC. I think one of its best parts is that the parser and the handling code can be totally separated, which makes software engineering easy. However, ANTLR is more powerful in terms of its supportability, and I have to use ANTLR in my projects. That's why I am looking for a way to make it more organized.
Winston Chen
@WCC: Ah, I misunderstood the ANTLR requirement; my experience was with JavaCC. I'm sure you've seen the Java grammars for ANTLR: http://www.antlr.org/grammar/list
trashgod
@Bart: Eye knead two yews my spell chequer wright! http://en.wikipedia.org/wiki/Spell_checker
trashgod
@trashgod, thank you for the link. Yes, I have seen it. However, one bad thing about ANTLR is that we always have to put some Java code in the .g grammar file, and this makes software engineering hard.
Winston Chen
+1  A: 

Use the Strategy pattern.

Write an interface that describes what you want to do action-wise (createSymbol(), pushScope(), defineType()..) and pass an implementation of it to the grammar. This keeps the code in the grammar minimal, and allows you to pass in different (or Decorated) implementations.

Your implementation can keep track of the data you need rather than passing it around in the grammar. Think of it as managing all the state you need, and all you do is call its methods from the grammar. The only state in the grammar is a pointer to the strategy implementation.
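A minimal sketch of the idea, under some assumptions: the interface name `WalkerActions`, the class `SymbolTableActions`, and the `simulateWalk` helper are all hypothetical, and the method names are borrowed from the examples above. In an ANTLR 3 tree grammar the strategy would typically be stored in a field declared in `@members` and called from rule actions; here a plain method stands in for the generated walker so the example runs without the ANTLR runtime.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical strategy interface: the only thing the grammar knows about.
interface WalkerActions {
    void pushScope(String name);
    void popScope();
    void createSymbol(String name);
}

// One possible implementation. It owns all the state (scope stack, symbols),
// so nothing has to be passed between rules in the grammar.
class SymbolTableActions implements WalkerActions {
    private final Deque<String> scopes = new ArrayDeque<>();
    private final List<String> symbols = new ArrayList<>();

    @Override public void pushScope(String name) { scopes.addLast(name); }

    @Override public void popScope() { scopes.removeLast(); }

    @Override public void createSymbol(String name) {
        // Qualify the symbol with the enclosing scopes, outermost first.
        String prefix = String.join(".", scopes);
        symbols.add(prefix.isEmpty() ? name : prefix + "." + name);
    }

    List<String> symbols() { return symbols; }
}

class StrategyDemo {
    // Stand-in for the calls a generated walker would make from its actions,
    // e.g. (illustrative only): ^(CLASS ID { actions.pushScope($ID.text); } ...)
    static void simulateWalk(WalkerActions actions) {
        actions.pushScope("Foo");      // entering class Foo
        actions.createSymbol("count"); // field
        actions.pushScope("bar");      // entering method bar
        actions.createSymbol("tmp");   // local variable
        actions.popScope();
        actions.popScope();
    }

    public static void main(String[] args) {
        SymbolTableActions actions = new SymbolTableActions();
        simulateWalk(actions);
        System.out.println(actions.symbols()); // prints [Foo.count, Foo.bar.tmp]
    }
}
```

Because the walker only sees the interface, a different implementation (a pretty-printer, a metrics counter, a logging decorator) can be swapped in without touching the .g file.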

Does that help? -- Scott

Scott Stanchfield
Great insight, Scott. The state idea is really cool! It actually shifts the focus from the walker to your Java data structures. Thanks a lot.
Winston Chen
+1  A: 

You have some basic choices:

  1. Pass around parameters, keep the code in the walker.
  2. Member variables in the parser/tree walker; essentially add state to the parse as suggested by Scott Stanchfield. The code stays in the walker.
  3. Member variables and methods in the AST nodes; derive from the AST node class and create your own class(es) as required. Code in the AST node implementation and the walker.

Of course you can combine them if you want to.
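As a rough illustration of option 3, here is a sketch with state and behaviour living on derived node classes. The `Node`, `MethodNode`, and `ParamNode` classes are hypothetical stand-ins so the example runs on its own; with ANTLR 3 you would more likely derive from `CommonTree` and supply a custom `TreeAdaptor` to create your node types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser's AST node base class.
class Node {
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String text) { this.text = text; }

    Node add(Node child) {
        children.add(child);
        return this; // allow chained adds
    }
}

// Derived node for a parameter: carries its own extra data (the type).
class ParamNode extends Node {
    final String type;

    ParamNode(String type, String name) {
        super(name);
        this.type = type;
    }
}

// Derived node for a method: the logic lives here, not in the walker.
class MethodNode extends Node {
    MethodNode(String name) { super(name); }

    int parameterCount() {
        int n = 0;
        for (Node c : children) {
            if (c instanceof ParamNode) n++;
        }
        return n;
    }
}

class AstNodeDemo {
    // Builds the tree a parser might produce for: add(int a, int b)
    static MethodNode buildSample() {
        MethodNode m = new MethodNode("add");
        m.add(new ParamNode("int", "a")).add(new ParamNode("int", "b"));
        return m;
    }

    public static void main(String[] args) {
        MethodNode m = buildSample();
        System.out.println(m.text + " takes " + m.parameterCount() + " parameter(s)");
    }
}
```

The walker then only needs to call methods such as `parameterCount()` on the nodes it visits instead of threading that information through rule parameters.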

janm
Actually, I was suggesting creating a separate Strategy class that tracked the state - the only state in the parser would be a pointer to the strategy instance.
Scott Stanchfield