ansaurus

Question

What should my AST look like for easy transforms?

Answer 1

A:

I use ANLTR.org which I found very easy.

kenny 2009-01-06 15:05:22

Answer 2

A:

Okay, I won't recommend the Dragon book, since you already have it. Can I recommend the Appel books? There is even some source code demonstrating the concepts, among them using the Visitor pattern for translating Abstract Syntax Trees to Concrete Syntax Trees. Good luck, and enjoy!

Alan 2009-01-07 03:26:06

Answer 3

+1 A:

As Alan mentioned, the Appel books are great. I had Modern Compiler Implementation in ML for an undergraduate course on compilers.

I would personally avoid doing many transformations on an AST simply because of the number of different constructs you can have and the number of ways the same thing can be expressed. You will frequently have to write code that handles a large number of cases and sub-cases, and as you said, it gets unwieldly very quickly.

It is better to transform the AST to a more minimal representation such as basic blocks in a control flow graph. Each operation can then be written as a simple statement in a basic block. The set of possible operations should be kept small. Just make sure to keep enough information that you can still do all the transformations you want (in particular, don't throw away types). You can also use Single Static Assignment form, where each program variable gets assigned only once. This provides an invariant which simplifies a lot of transformations and analyses.

Jay Conrod 2009-01-13 19:12:49

Answer 4

A:

ASTs represent the structure of program. For complex languages, your AST will necessarily be complicated. So don't assume this should be "easy".

Many folks assume that having an AST, life will be easier. But parsing is just the foothills of the Himalayas. ASTs don't represent the common inferences, such as the meaning of an identifier, what statement executes next, where this data is consumed. Unless you have all these available, you're not going to be able to do much with a real language, let alone do it easily.

It is best to make those inference results cached or explicit: symbol tables, control flows, data flows, ...

One can add pattern matching langauges to enable the recognition of syntax structures, or even to write transformation rules:

optimize_increment(x:exp):statement
= " \x=\x+1; " -->  " \x++ " if no_side_effects(x);

Such rules need to draw on the cached inferences (eg., "side_effects").

Trying to build all this is really hard. One way to make this practical is to amortize the infrastructure cost across many lanuages and transformation applications. Our DMS Software Reengineering Toolkit has all this machinery to varying degree for a wide variety of langauges, including C, C++, Java, C# and COBOL.

Ira Baxter 2009-07-12 10:09:51

ansaurus

tags:

views:

answers:

What should my AST look like for easy transforms?

related questions