I am reading the book Language Implementation Patterns (http://tiny.cc/235xs) amongst a few others mixed in to clarify concepts as well as the occasional website. I am trying to make a tool that reads a trivial programming language and performs some basic analysis on it.
I am getting stuck in the design phase of this tool. I have constructed a simple handwritten recursive decent parser that validates a source file just fine. However, to perform source manipulations having a CodeDom tree would be useful.
The questions:
1) Are the logical steps a tool like this takes: Parse and build a textual tree and matching symbol table and then convert this to a CodeDom?
2) When building a textual tree, the most convenient would be a AST, easier to convert to a CodeDom .. but do Refactoring tools maintain a list of all embedded tokens in a statement in order to preserve inline comments and how do they track this in their tree?