views:

122

answers:

1

Is it possible to generate a parse tree at the same time as I use recursive descent parser to check if the data matches grammar?

If so, what approach would I use to build a tree as I recursively descent?

Thanks, Boda Cydo.

Note: I am new to parsing. (Asked several questions on SO already, and I am getting better with it.)

+1  A: 

Yes, it's possible. How to do so will depend on the implementation you want. Here's a sample that might work for you:

First, define your node:

class ParseTreeNode {
  private final String name;
  private final List<ParseTreeNode> children = /* new */;
  public ParseTreeNode(String name) {
    this.name = name;
  }
  public void addChild(ParseTreeNode child) {
    children.add(child);
}

Next, you'll need to integrate that into your recursive descent functions:

class RDParser {
  ParseTreeNode parse(Input input) {
    ParseTreeNode root = createParseTreeNodeNamed("Root")
    switch (input.nextToken()) {
      case OPT1:
        root.addChild(createParseTreeNodeNamed("Opt1"));
        break;
      case OPT2:
        while (/*someCondition*/) {
          root.addChild(createParseTreeNodeNamed("Opt2-sibling" + /* i */));
        }
      case SUBTREE:
        ParseTreeNode subtree = createParseTreeNodeNamed("Subtree");
        root.addChild(subtree);
        parseSubtree(subtree, input);
        break;
      default:
        error("Input %s was not in the expected first/follow sets", input.nextToken());
    }
  }
  void parseSubtree(ParseTreeNode node, Input input) {
    node.addChild(createParseTreeNodeNamed("subtree-child"));
    /* ... */
  }

  /* and other functions do similarly */
  ParseTreeNode createParseTreeNodeNamed(String name) {
    return new ParseTreeNode(name);
  }
}

As you descend down your parse tree, you'll probably want to send whatever the new "root" node is, so that children can be added to it. Alternatively, parseSubtree could create and return a node, which would then be added to the root node.

You could build either the parse tree or a simple abstract tree using the above process. Since the parse function returns the root node, which will reference any and all children nodes, you'll have full access to the parse tree after parsing.

Whether you use a heterogeneous or homogeneous parse tree, you'll need a way to store sufficient information to make it useful.

Kaleb Pederson
Excellent answer, Kaleb. It got me going instantly, and I think I will be able to write it now myself! But can you please clarify on difference between "`parse tree`" and "`abstract tree`", and "`heterogeneuous`" and "`homogeneous`" parse trees? (I don't know the difference yet but I would love to know!)
bodacydo
homogeneous - A tree that consists of the same type of nodes. heterogeneous - A tree consisting of nodes of different types. A parse tree represents the structure of the input data and usually contains unnecessary syntax. An abstract syntax tree is one that maintains the essential structure and information but eliminates unnecessary structure or syntax. I modified my post to show how the tree grows deeper -- I hope that clarifies.
Kaleb Pederson
Thanks for explaining! I am already implementing. :) I'll ask if I get stuck. My tree is going to be abstract, heterogeneous tree. :)
bodacydo