views:

196

answers:

7

Does anyone out there know about examples and the theory behind parsers that will take (maybe) an abstract syntax tree and produce code, instead of vice-versa. Mathematically, at least intuitively, I believe the function of code->AST is reversible, but I'm trying to find work/examples of this... besides the usual resources like the Dragon book and such. Any ideas?

+1  A: 

Such thing is called a Visitor. Is traverses the tree and does whatever has to be done, for example optimize or generate code.

lewap
Could you please provide a link to some references (wikipedia, papers)?
Dervin Thunk
Visiotr Pattern. What kind of research are you doing?
lewap
+1  A: 

I rather like lewap's response:

find a mathematical way to express a visitor and you have a dual to the parser

But you asked for a sample, so try this on for size: Visual Studio contains a UML editor with excellent symmetry. The way both it and the editors are implemented, all constitute views of the model, and editing either modifies the model resulting in all remaining in synch.

Peter Wone
A: 

I don't know where to find much about the theory, but boost::spirit 2.0 has both qi (parser) and karma (generator), sharing the same underlying structure and grammar, so it's a practical implementation of the concept.

Documentation on the generator side is still pretty thin (spirit2 was new in Boost 1.38, and is still in beta), but there are a few bits of karma sample code around, and AFAIK the library's in a working state and there are at least some examples available.

puetzk
A: 

In addition to 'Visitor', 'unparser' is another good keyword to web-search for.

Brian
Another buzz word is "pretty printing" and "pretty printing combinators".
Martijn
A: 

That sounds a lot like the back end of a non-optimizing compiler that has it's target language the same as it's source language.

One question would be whether you require the "unparsed" code to be identical to the original, or just functionally equivalent.

For example, would it be OK for the output to use a different indentation style than the original? That information wouldn't normally be stored in the AST because it's not semantically important.

One thing to look at would be automatic code refactoring tools.

Clayton
A: 

I've been doing these forever, and calling them "DeParse".

It only gets tricky if you also want to recapture whitespace and comments. You have to tuck them into the parse tree so you can regenerate them on output.

Mike Dunlavey
Personally I actually like the removal of semantically null material like whitespace because it means presentation must be formalised (eg stylesheet) resulting in consistency.
Peter Wone
@Peter: Same here. At times (I can't remember when) it has been desirable to store source in the form of a parse tree so it could be manipulated somehow, without losing the comments.
Mike Dunlavey
While Peter makes a valid point, refactoring programs discarding layout (whitespace) is annoying and discarding comments is just plain unacceptable for users.
Martijn
+1  A: 

Actually, generating code from a parse tree is strictly easier than parsing code, at least in a mathematical sense. There are many grammars which are ambiguous, that is, there is no unique way to parse them, but a parse tree can always be converted to a string in a unique way, modulo whitespace.

The Dragon book gives a good description of the theory of parsers.

Jørgen Fogh
You're absolutely right, Jørgen.
Mike Dunlavey