I have a relatively simple DSL that I would like to handle more robustly than with a bunch of manually coded java.util.regex.Pattern statements plus parsing logic.
The most-quoted tool seems to be ANTLR. I'm not familiar with it, and I am willing to give it a try. However, I get a little leery when I look at the examples (e.g. the ANTLR expression evaluator example, or Martin Fowler's HelloAntlr, or this other question on Stack Overflow). The reason is that the grammar files seem to be a hodgepodge of grammar definitions interspersed with imperative fragments of the implementation language (e.g. Java).
What I would really prefer is to separate out the imperative / evaluation part of the parser. Is there a way to use ANTLR (or some other tool) to define a grammar and produce a set of Java source files that compile into classes I can use to parse input into a structure, without acting upon that structure?
For example, if I wanted to use expression evaluation with just the +, * and () operators, and I had the input
3 * (4 + 7 * 6) * (3 + 7 * (4 + 2))
then what I would like to do is write a grammar to convert that to a hierarchical structure like
Product
  Term(3)
  Sum
    Term(4)
    Product
      Term(7)
      Term(6)
  Sum
    Term(3)
    Product
      Term(7)
      Sum
        Term(4)
        Term(2)
where I can use classes like
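import java.util.List;

interface Expression<T> {
    T evaluate();
}

// Leaf node: holds a single numeric value.
class Term implements Expression<Double> {
    private final double value;
    Term(double value) { this.value = value; }
    @Override public Double evaluate() { return value; }
}

// Multiplies the values of all child expressions.
class Product implements Expression<Double> {
    private final List<Expression<Double>> terms;
    Product(List<Expression<Double>> terms) { this.terms = terms; }
    @Override public Double evaluate() {
        double result = 1;
        for (Expression<Double> ex : terms)
            result *= ex.evaluate();
        return result;
    }
}

// Adds the values of all child expressions.
class Sum implements Expression<Double> {
    private final List<Expression<Double>> terms;
    Sum(List<Expression<Double>> terms) { this.terms = terms; }
    @Override public Double evaluate() {
        double result = 0;
        for (Expression<Double> ex : terms)
            result += ex.evaluate();
        return result;
    }
}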
and use ANTLR to construct the structure. I would really rather pursue this approach, as it lets me (and other software engineers) edit and visualize complete Java classes without having those classes fragmented into weird pieces in ANTLR grammar files.
Is there a way to do this?
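To make the goal concrete, here is a rough sketch of the tree I would like the parser to hand back for the example input, built by hand instead. The constructors, the ExpressionTreeDemo class and the buildExample method are only assumptions for illustration (they need Java 8 or later for the type inference on Arrays.asList); ideally the generated parser would make the equivalent calls for me.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: builds by hand the tree a parser would ideally return.
public class ExpressionTreeDemo {
    // Tree for the input: 3 * (4 + 7 * 6) * (3 + 7 * (4 + 2))
    static Expression<Double> buildExample() {
        Expression<Double> left = new Sum(Arrays.asList(
                new Term(4),
                new Product(Arrays.asList(new Term(7), new Term(6)))));
        Expression<Double> right = new Sum(Arrays.asList(
                new Term(3),
                new Product(Arrays.asList(
                        new Term(7),
                        new Sum(Arrays.asList(new Term(4), new Term(2)))))));
        return new Product(Arrays.asList(new Term(3), left, right));
    }

    public static void main(String[] args) {
        // Evaluation happens only here, after the structure already exists.
        System.out.println(buildExample().evaluate());
    }
}

interface Expression<T> {
    T evaluate();
}

// Leaf node: holds a single numeric value.
class Term implements Expression<Double> {
    private final double value;
    Term(double value) { this.value = value; }
    @Override public Double evaluate() { return value; }
}

// Multiplies the values of all child expressions.
class Product implements Expression<Double> {
    private final List<Expression<Double>> terms;
    Product(List<Expression<Double>> terms) { this.terms = terms; }
    @Override public Double evaluate() {
        double result = 1;
        for (Expression<Double> ex : terms)
            result *= ex.evaluate();
        return result;
    }
}

// Adds the values of all child expressions.
class Sum implements Expression<Double> {
    private final List<Expression<Double>> terms;
    Sum(List<Expression<Double>> terms) { this.terms = terms; }
    @Override public Double evaluate() {
        double result = 0;
        for (Expression<Double> ex : terms)
            result += ex.evaluate();
        return result;
    }
}
```

The point of the sketch is that nothing in these classes knows about ANTLR: the grammar would only decide which constructor gets called with which children.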
Clarification: I want to spend as much of my effort as possible in two ways: defining the grammar itself, and writing ANTLR-independent Java (e.g. my Product/Sum/Term classes). I want to minimize the amount of time and experience I have to spend learning ANTLR syntax, quirks and API. I don't know how to create and manipulate an AST from an ANTLR grammar. Since this is only a small part of a large Java project, this affects not just me but anyone on my team who has to review or maintain my code.
(I don't mean to sound impertinent: I'm willing to make the investment of time and energy to use a tool, but only if the tool becomes useful and does not keep being a stumbling block.)