views:

273

answers:

3

I wrote an application that uses a meta-parser generated with CSharpCC (a port of JavaCC). It works very well.

Given the nature of the project, I would like more flexibility to extend the syntax of the meta-language used by the application. Do you know of any existing libraries (or articles describing how to implement one) for Java or C# that I could use to build my own parser programmatically, without being tied to a static syntax?

Thank you very much for the support.

+1  A: 

Would Scala's combinator parsers do the trick for you? Since Scala compiles to Java bytecode, anything you write could be called from your Java code however you please.
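To give a feel for the idea without leaving Java, here is a minimal sketch of parser combinators. It is not Scala's library or any existing API: `MiniCombinators`, `Parser`, `literal`, `sequence`, `Choice` and `matches` are all hypothetical names made up for illustration. The point is that the grammar is built from ordinary objects, so an alternative can be added while the program is running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** A minimal parser-combinator sketch; every name here is hypothetical. */
final class MiniCombinators {

    /** Takes the input and a start offset; returns the end offset on success. */
    interface Parser {
        Optional<Integer> parse(String input, int pos);
    }

    /** Matches a fixed string at the current offset. */
    static Parser literal(String text) {
        return (in, pos) -> {
            if (in.startsWith(text, pos)) {
                return Optional.of(pos + text.length());
            }
            return Optional.empty();
        };
    }

    /** Runs the given parsers one after another. */
    static Parser sequence(Parser... parts) {
        return (in, pos) -> {
            Optional<Integer> cur = Optional.of(pos);
            for (Parser p : parts) {
                if (!cur.isPresent()) {
                    break; // an earlier part failed
                }
                cur = p.parse(in, cur.get());
            }
            return cur;
        };
    }

    /** Ordered choice whose alternatives can be added at run time. */
    static final class Choice implements Parser {
        private final List<Parser> alternatives = new ArrayList<>();

        Choice or(Parser p) {
            alternatives.add(p);
            return this;
        }

        @Override
        public Optional<Integer> parse(String in, int pos) {
            for (Parser p : alternatives) {
                Optional<Integer> r = p.parse(in, pos);
                if (r.isPresent()) {
                    return r; // first alternative that matches wins
                }
            }
            return Optional.empty();
        }
    }

    /** True if the parser consumes the whole input. */
    static boolean matches(Parser p, String input) {
        Optional<Integer> r = p.parse(input, 0);
        return r.isPresent() && r.get() == input.length();
    }

    public static void main(String[] args) {
        Choice command = new Choice().or(literal("start")).or(literal("stop"));
        Parser statement = sequence(command, literal(";"));

        System.out.println(matches(statement, "start;")); // true
        System.out.println(matches(statement, "pause;")); // false

        command.or(literal("pause")); // extend the syntax at run time
        System.out.println(matches(statement, "pause;")); // true
    }
}
```

This sketch only reports whether the input matches. A real combinator library would also build a result value and report error positions.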

yonkeltron
+1 for parser combinators. A quick google for "C# parser combinators" yields http://blogs.msdn.com/lukeh/archive/2007/08/19/monadic-parser-combinators-using-c-3-0.aspx
dtb
Parser combinators seem to be the answer, but they go beyond my scope: I need to implement the parsing mechanism directly inside the library, which targets embedded environments and small-footprint applications.
Antonello
A: 

Take a look at the way that the JNode command-line interface handles parsing of command line arguments. Each command 'registers' descriptors for the arguments it is expecting. The command line syntax is specified separately in XML descriptors, allowing users to tailor a command's syntax to meet their needs.

This is underpinned by a framework of Argument classes that are basically context-sensitive token recognizers, and a two-level grammar / parser. The parser 'prepares' a user-friendly form of a command syntax by converting it into something like BNF, then does a naive backtracking parse, accepting the first complete parse that it finds.
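As a rough illustration of that last step, here is a minimal sketch of a naive backtracking parse over a BNF-like grammar held as plain data. It is not JNode's code: `BacktrackingParser`, `addRule` and `accepts` are hypothetical names, and terminals are matched by simple string equality rather than by Argument classes. It assumes the grammar has no left recursion, since that would make this search loop forever.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Naive backtracking recognizer over a grammar stored as data (hypothetical names). */
final class BacktrackingParser {
    // Rule name -> list of alternatives; each alternative is a list of symbols.
    // A symbol with no rule of its own is treated as a terminal token.
    private final Map<String, List<List<String>>> rules = new HashMap<>();

    /** Adds one alternative for a rule; no symbols means the empty alternative. */
    void addRule(String name, String... symbols) {
        rules.computeIfAbsent(name, k -> new ArrayList<>()).add(Arrays.asList(symbols));
    }

    /** True as soon as one complete parse of all tokens is found. */
    boolean accepts(String start, List<String> tokens) {
        List<String> pending = new ArrayList<>();
        pending.add(start);
        return parse(pending, tokens, 0);
    }

    // 'pending' holds the symbols that still have to be matched, in order.
    private boolean parse(List<String> pending, List<String> tokens, int pos) {
        if (pending.isEmpty()) {
            return pos == tokens.size(); // complete only if all input is consumed
        }
        String head = pending.get(0);
        List<String> rest = pending.subList(1, pending.size());
        List<List<String>> alternatives = rules.get(head);
        if (alternatives == null) { // terminal: must equal the current token
            return pos < tokens.size()
                    && head.equals(tokens.get(pos))
                    && parse(rest, tokens, pos + 1);
        }
        for (List<String> alt : alternatives) { // try each alternative in order
            List<String> next = new ArrayList<>(alt);
            next.addAll(rest);
            if (parse(next, tokens, pos)) {
                return true; // accept the first complete parse
            }
        }
        return false; // every alternative failed: backtrack
    }

    public static void main(String[] args) {
        BacktrackingParser g = new BacktrackingParser();
        g.addRule("cmd", "ls", "flags");
        g.addRule("flags", "-l", "flags");
        g.addRule("flags", "-a", "flags");
        g.addRule("flags"); // empty alternative

        System.out.println(g.accepts("cmd", Arrays.asList("ls", "-l", "-a"))); // true
        System.out.println(g.accepts("cmd", Arrays.asList("ls", "-x")));       // false

        g.addRule("flags", "-x", "flags"); // syntax extended at run time
        System.out.println(g.accepts("cmd", Arrays.asList("ls", "-x")));       // true
    }
}
```

Because the grammar is just a map, rules can be added or replaced at run time. The price is the same as described here: the search retries alternatives from scratch, so it can become very slow on long or ambiguous input.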

The downside of the current implementation is that the parser is inefficient, and probably impractical for parsing input that is more than 20 or so tokens, depending on the syntax. (We have ideas for improving performance, but a real fix is probably not possible without a major redesign ... and banning potentially ambiguous command syntaxes.)

(Aside: one motivation for this is to support intelligent command argument completion. To do this, the parser runs in a "completion" mode in which it explores all possible (partial) parses, noting its state when it encounters the token / position that the user is trying to complete. Where appropriate, the corresponding Argument classes are then asked to provide context-sensitive completions for the current "word".)
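A much simplified sketch of such a completion mode, again with hypothetical names (`CompletionSketch`, `addRule`, `complete`) and not taken from JNode: instead of stopping at the first complete parse, the search explores every alternative and records each terminal that the grammar would accept at the end of what has been typed. It works on whole tokens only and, like any naive search of this kind, assumes a grammar without left recursion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Collects the tokens that could follow the input typed so far (hypothetical names). */
final class CompletionSketch {
    // Rule name -> alternatives; symbols without a rule are terminal tokens.
    private final Map<String, List<List<String>>> rules = new HashMap<>();

    void addRule(String name, String... symbols) {
        rules.computeIfAbsent(name, k -> new ArrayList<>()).add(Arrays.asList(symbols));
    }

    /** Returns, in sorted order, every terminal that may come next after 'typed'. */
    Set<String> complete(String start, List<String> typed) {
        Set<String> candidates = new TreeSet<>();
        List<String> pending = new ArrayList<>();
        pending.add(start);
        explore(pending, typed, 0, candidates);
        return candidates;
    }

    // Explores all partial parses instead of stopping at the first success.
    private void explore(List<String> pending, List<String> typed, int pos, Set<String> out) {
        if (pending.isEmpty()) {
            return; // this parse expects nothing more
        }
        String head = pending.get(0);
        List<String> rest = pending.subList(1, pending.size());
        List<List<String>> alternatives = rules.get(head);
        if (alternatives == null) { // terminal
            if (pos == typed.size()) {
                out.add(head); // the input ends here, so this token is a candidate
            } else if (head.equals(typed.get(pos))) {
                explore(rest, typed, pos + 1, out);
            }
            return;
        }
        for (List<String> alt : alternatives) {
            List<String> next = new ArrayList<>(alt);
            next.addAll(rest);
            explore(next, typed, pos, out);
        }
    }

    public static void main(String[] args) {
        CompletionSketch g = new CompletionSketch();
        g.addRule("cmd", "ls", "flags");
        g.addRule("flags", "-l", "flags");
        g.addRule("flags", "-a", "flags");
        g.addRule("flags"); // empty alternative

        System.out.println(g.complete("cmd", Arrays.asList("ls"))); // [-a, -l]
    }
}
```

A fuller version would hand the partially typed word to the recognizer for that position, so that it could offer file names, host names and so on rather than fixed tokens.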

Stephen C
A: 

The parser (written in C#) used in the Heron language (a simple object-oriented language) is relatively simple and stable, and should be easy to modify for your needs. You can download the source here.

cdiggins