Theory, examples of reversible parsers?

views:

196

answers:

Theory, examples of reversible parsers?

Does anyone out there know about examples and the theory behind parsers that will take (maybe) an abstract syntax tree and produce code, instead of vice-versa. Mathematically, at least intuitively, I believe the function of code->AST is reversible, but I'm trying to find work/examples of this... besides the usual resources like the Dragon book and such. Any ideas?

+1 A:

Such thing is called a Visitor. Is traverses the tree and does whatever has to be done, for example optimize or generate code.

lewap 2009-03-19 12:35:53

Could you please provide a link to some references (wikipedia, papers)?

Dervin Thunk 2009-03-19 14:08:53

Visiotr Pattern. What kind of research are you doing?

lewap 2009-03-19 16:22:11

+1 A:

I rather like lewap's response:

find a mathematical way to express a visitor and you have a dual to the parser

But you asked for a sample, so try this on for size: Visual Studio contains a UML editor with excellent symmetry. The way both it and the editors are implemented, all constitute views of the model, and editing either modifies the model resulting in all remaining in synch.

Peter Wone 2009-03-19 12:58:38

I don't know where to find much about the theory, but boost::spirit 2.0 has both qi (parser) and karma (generator), sharing the same underlying structure and grammar, so it's a practical implementation of the concept.

Documentation on the generator side is still pretty thin (spirit2 was new in Boost 1.38, and is still in beta), but there are a few bits of karma sample code around, and AFAIK the library's in a working state and there are at least some examples available.

puetzk 2009-03-19 13:01:18

In addition to 'Visitor', 'unparser' is another good keyword to web-search for.

Brian 2009-03-20 01:12:37

Another buzz word is "pretty printing" and "pretty printing combinators".

Martijn 2009-06-17 11:05:48

That sounds a lot like the back end of a non-optimizing compiler that has it's target language the same as it's source language.

One question would be whether you require the "unparsed" code to be identical to the original, or just functionally equivalent.

For example, would it be OK for the output to use a different indentation style than the original? That information wouldn't normally be stored in the AST because it's not semantically important.

One thing to look at would be automatic code refactoring tools.

Clayton 2009-03-20 01:42:36

I've been doing these forever, and calling them "DeParse".

It only gets tricky if you also want to recapture whitespace and comments. You have to tuck them into the parse tree so you can regenerate them on output.

Mike Dunlavey 2009-03-20 21:35:30

Personally I actually like the removal of semantically null material like whitespace because it means presentation must be formalised (eg stylesheet) resulting in consistency.

Peter Wone 2009-03-24 12:52:47

@Peter: Same here. At times (I can't remember when) it has been desirable to store source in the form of a parse tree so it could be manipulated somehow, without losing the comments.

Mike Dunlavey 2009-03-24 21:15:32

While Peter makes a valid point, refactoring programs discarding layout (whitespace) is annoying and discarding comments is just plain unacceptable for users.

Martijn 2009-06-17 11:05:09

+1 A:

Actually, generating code from a parse tree is strictly easier than parsing code, at least in a mathematical sense. There are many grammars which are ambiguous, that is, there is no unique way to parse them, but a parse tree can always be converted to a string in a unique way, modulo whitespace.

The Dragon book gives a good description of the theory of parsers.

Jørgen Fogh 2009-06-17 10:51:35

You're absolutely right, Jørgen.

Mike Dunlavey 2009-06-17 12:21:10

ansaurus

tags:

views:

answers:

Theory, examples of reversible parsers?

related questions