So many programming languages support the inclusion of mini-languages. PHP is embedded within HTML. XML can be embedded within JavaScript (via E4X). LINQ can be embedded within C#. Regular expressions can be embedded in Perl.

// JavaScript (E4X) example: an XML literal embedded directly in the code
var a = <node><child/></node>;

Come to think of it, most programming languages can be modeled as collections of different mini-languages. Java, for example, could be broken into at least four distinct mini-languages:

  • A type-declaration language (package directive, import directives, class declaration)
  • A member-declaration language (access modifiers, method declarations, member vars)
  • A statement language (control flow, sequential execution)
  • An expression language (literals, assignments, comparisons, arithmetic)

Being able to implement those four conceptual languages as four distinct grammars would certainly cut down on a lot of the spaghettiism that I usually see in complex parser and compiler implementations.

I've implemented parsers for various kinds of languages before (using ANTLR, JavaCC, and custom recursive-descent parsers), and when the language gets really big and complex, you usually end up with one huuuuuuge grammar, and the parser implementation gets really ugly really fast.

Ideally, when writing a parser for one of those languages, it'd be nice to implement it as a collection of composable parsers, passing control back and forth between them.
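
To make that concrete, here's a minimal sketch of the kind of hand-off I'm imagining (Java, all names mine; a toy, not a real implementation):

interface MiniParser {
    // Parse a fragment starting at pos; return the index just past it
    // so the caller can resume there.
    int parse(String input, int pos);
}

class ExpressionParser implements MiniParser {
    public int parse(String input, int pos) {
        // Toy expression language: just consume a run of digits.
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
        return pos;
    }
}

class StatementParser implements MiniParser {
    private final MiniParser expr = new ExpressionParser();
    public int parse(String input, int pos) {
        // A "statement" is EXPR ';' -- control passes down to the
        // expression parser and comes back here for the terminator.
        pos = expr.parse(input, pos);
        if (pos < input.length() && input.charAt(pos) == ';') pos++;
        return pos;
    }
}

Each parser only knows its own grammar; the composition point is just "parse from here, and tell me where you stopped."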

The tricky thing is that often the containing language (e.g., Perl) defines its own terminus sentinel for the contained language (e.g., regular expressions). Here's a good example:

my $result =~ m|abc.*xyz|i;

In this code, the main Perl code defines a nonstandard terminus "|" for the regular expression. Implementing the regex parser as completely distinct from the Perl parser would be really hard, because the regex parser wouldn't know how to find the expression terminus without consulting the parent parser.
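
One possible shape for a fix (just a sketch; RegexSubParser and PerlMatchParser are made-up names): the parent parser reads the user-chosen delimiter itself and passes it down to the regex sub-parser as an explicit stop sentinel.

class RegexSubParser {
    // Consume a regex body until the delimiter chosen by the parent.
    int parse(String input, int pos, char delimiter) {
        while (pos < input.length() && input.charAt(pos) != delimiter) {
            if (input.charAt(pos) == '\\') pos++; // skip an escaped character
            pos++;
        }
        return pos; // stop at the delimiter; the parent consumes it
    }
}

class PerlMatchParser {
    private final RegexSubParser regex = new RegexSubParser();
    int parseMatch(String input, int pos) {
        // At "m", the next character is the user-chosen delimiter, e.g. '|'.
        char delim = input.charAt(pos + 1);
        int end = regex.parse(input, pos + 2, delim);
        return end + 1; // past the closing delimiter; flags handled elsewhere
    }
}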

Or, let's say I had a language which allowed the inclusion of LINQ expressions, but instead of terminating with a semicolon (as C# does), I wanted to mandate that LINQ expressions appear within square brackets:

var linq_expression = [from n in numbers where n < 5 select n]

If I defined the LINQ grammar within the parent language grammar, I could easily write an unambiguous production for a "LinqExpression" using syntactic lookahead to find the bracket enclosures. But then my parent grammar would have to absorb the whole LINQ specification. And that's a drag. On the other hand, a separate child LINQ parser would have a very difficult time figuring out where to stop, because it would need to implement lookahead for foreign token types.

And that would pretty much rule out using separate lexing/parsing phases, since the LINQ parser would define a whole different set of tokenization rules than the parent parser. If you're scanning one token at a time, how do you know when to pass control back to the lexical analyzer of the parent language?
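
The best workaround I can think of (a sketch, under the assumption that the embedded language never contains an unbalanced ']'): don't tokenize the embedded region with the parent lexer at all. Find its extent by bracket counting, hand the raw slice to the child's own lexer/parser, and resume after the close bracket.

class EmbeddedRegion {
    // Return the index of the ']' that balances the '[' at pos.
    static int findBalancedClose(String input, int pos) {
        int depth = 0;
        for (int i = pos; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') depth++;
            else if (c == ']' && --depth == 0) return i;
        }
        throw new IllegalStateException("unbalanced brackets");
    }
}

The parent sees '[', slices out input.substring(open + 1, close), runs the LINQ lexer and parser on just that slice, and resumes past close.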

What do you guys think? What are the best techniques available today for implementing distinct, decoupled, and composable language grammars for the inclusion of mini-languages within larger parent languages?

+1  A: 

Parsing is one aspect of the problem, but I suspect that the interoperation between the various executable interpreters behind each mini-language is probably significantly harder to solve. In order to be useful, each independent syntax block has to work consistently with the overall context (or the final behavior will be unpredictable, and thus unusable).

Not that I understand what they are really doing, but a very interesting place to look for more inspiration is FoNC (the Fundamentals of New Computing project at Viewpoints Research). They seem (I am guessing) to be headed in a direction that allows all sorts of different computational engines to interact seamlessly.

Paul W Homer
+2  A: 

I am working on this exact problem. I'll share my thoughts:

Grammars are difficult to debug. I've debugged a few in Bison and ANTLR and it wasn't pretty. If you want the user to insert DSL grammars into your parser, then you've got to find some way to keep them from blowing it up. My approach is to not allow arbitrary DSLs, but to only allow those that follow two rules:

  • The token types (identifiers, strings, numbers) are the same between all DSLs in a file.
  • Unbalanced parentheses, braces, or brackets are not permitted.

The reason for the first restriction is because modern parsers break parsing into a lexical stage and then apply your traditional grammar rules. Fortunately, I believe that a single universal tokenizer is good enough for 90% of the DSLs you want to create, even if it doesn't accommodate the DSLs you already created that you want to embed.
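
Roughly what I mean by a universal tokenizer, as a sketch (the token classes here are illustrative, not a spec):

import java.util.*;
import java.util.regex.*;

class UniversalLexer {
    // One fixed set of token classes shared by every DSL in the file.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<ID>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<NUM>\\d+(\\.\\d+)?)"
        + "|(?<STR>\"(\\\\.|[^\"\\\\])*\")"
        + "|(?<PUNCT>[()\\[\\]{};,.<>=+\\-*/|])"
        + "|(?<WS>\\s+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length() && m.find(pos) && m.start() == pos) {
            if (m.group("WS") == null) tokens.add(m.group()); // drop whitespace
            pos = m.end();
        }
        if (pos < input.length())
            throw new IllegalArgumentException("unrecognized character at " + pos);
        return tokens;
    }
}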

The second restriction allows the grammars to be more separated from each other. You can parse in two stages by grouping parentheses (braces, brackets) and then recursively parsing each group. The grammar of your embedded DSL can't escape through the brackets it's contained in.
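
The grouping stage can be as simple as this sketch (real code would also check that a '[' is closed by ']' and not '}'):

import java.util.*;

class Group {
    // Each item is either a String token or a nested Group.
    final List<Object> items = new ArrayList<>();
}

class Grouper {
    static Group group(Iterator<String> tokens) {
        Group g = new Group();
        while (tokens.hasNext()) {
            String t = tokens.next();
            if (t.equals("(") || t.equals("[") || t.equals("{")) {
                g.items.add(group(tokens));  // recurse into the nested group
            } else if (t.equals(")") || t.equals("]") || t.equals("}")) {
                return g;                    // a closer ends this group
            } else {
                g.items.add(t);
            }
        }
        return g;
    }
}

Each grammar then parses its own Group, and physically cannot read tokens outside it.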

Another part of the solution is to allow macros. For example, regex("abc*/[^.]") looks fine to me. This way the macro "regex" can parse the regex instead of building a regex grammar into the main language. You can't use different delimiters for your regex, sure, but you do gain a measure of consistency in my mind.
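
Sketched out, the macro mechanism is just a dispatch table from macro names to handlers that own their own mini-grammars (names are mine):

import java.util.*;
import java.util.function.Function;

class MacroHost {
    // Each macro name maps to a handler that parses its own mini-language.
    private final Map<String, Function<String, Object>> macros = new HashMap<>();

    void register(String name, Function<String, Object> handler) {
        macros.put(name, handler);
    }

    Object expand(String name, String rawBody) {
        Function<String, Object> h = macros.get(name);
        if (h == null) throw new IllegalArgumentException("unknown macro: " + name);
        return h.apply(rawBody);
    }
}

// e.g. host.register("regex", java.util.regex.Pattern::compile);
//      Object compiled = host.expand("regex", "abc*/[^.]");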

Dietrich Epp
A: 

If you think about it, this is really how recursive descent parsing works. Each rule and all the rules it depends on form a mini grammar. Anything higher up does not matter. You could, for example, write a Java grammar with ANTLR and separate all the different "mini-languages" into different parts of the file.

This isn't very common simply because these "mini-languages" will often share many rules. However, it definitely would be nice if tools like ANTLR allowed you to include separate grammars from different files. This would allow you to logically separate them. It probably isn't implemented because it's a "cosmetic" issue: it's purely about the grammar files themselves, not about parsing itself. It also won't make your code any shorter (although it may be slightly easier to follow). The only technical issue this would solve is name collisions.

Zifre
The problem is the terminals/"tokens". Defining a non-left-recursive grammar for all of those regular expression "tokens" quickly becomes unmanageable.
Paul Hollingsworth
+2  A: 

You might want to listen to this podcast. Scannerless parsing was "invented" to help solve the problem of composing different grammars (the problem being that you quickly find that you can't write a "universal" tokenizer/scanner).

Paul Hollingsworth
+1  A: 

Have a look at SGLR, Scannerless Generalized LR parsing. Here are some references and URLs. This parsing technique makes composition of parse tables very simple, especially in combination with SDF.

Martin Bravenboer and Eelco Visser. Designing Syntax Embeddings and Assimilations for Language Libraries. In Models in Software Engineering: Workshops and Symposia at MoDELS 2007, volume 5002 of LNCS, 2008.

MetaBorg and MetaBorg in action

Zef Hemel