I'm planning to write a C# 3.0 compiler in C#. Where can I get the grammar for parser generation?

Preferably one that works with ANTLR v3 without modification.

+2  A: 

Are you looking for something like this or this?

Please also refer to the C# ANTLR grammar question.

Rubens Farias
The linked grammar is not C# 3.0. It doesn't support lambdas. That's specifically important to me.
Mehrdad Afshari
It would seem that adding support for lambdas in terms of existing constructs in the grammar is fairly trivial, since you only need to define an argument list. This will probably need LL(*), however, since you can parse something like `(a**` and not know whether it will end up being an expression like `(a**b)` (i.e. multiply `a` by the result of dereferencing `b`) or a lambda expression `(a** b) =>` until you hit the `=>`. Since there's no limit on the amount of indirection (pointer to pointer to ...), it looks like LL(*) to me. But since ANTLR3 supports opt-in LL(*), it's not a problem.
Pavel Minaev
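To illustrate the lookahead point in the comment above: a minimal sketch, in Python for brevity, of why no fixed number of lookahead tokens is enough. The toy tokenizer and the `starts_lambda` helper are hypothetical names invented for this example. This is not how ANTLR implements LL(*), only the same idea done by hand: scan to the matching `)` and check whether `=>` follows.

```python
import re

# Toy token pattern for this sketch only: "=>", identifiers, numbers,
# and any other single character as punctuation.
TOKEN_RE = re.compile(r"\s*(=>|[A-Za-z_]\w*|\d+|.)")


def tokenize(text):
    """Split text into a flat list of tokens, dropping whitespace."""
    return TOKEN_RE.findall(text.strip())


def starts_lambda(tokens, pos=0):
    """Return True if the parenthesized group at tokens[pos] is followed by '=>'.

    The scan length depends on the input (any number of '*' can appear
    before the closing parenthesis), so a fixed k-token lookahead cannot
    make this decision in general.
    """
    if pos >= len(tokens) or tokens[pos] != "(":
        return False
    depth = 0
    for i in range(pos, len(tokens)):
        if tokens[i] == "(":
            depth += 1
        elif tokens[i] == ")":
            depth -= 1
            if depth == 0:
                # Only the token after the matching ')' settles the question.
                return i + 1 < len(tokens) and tokens[i + 1] == "=>"
    return False  # unbalanced parentheses


print(starts_lambda(tokenize("(a** b) => x")))  # lambda parameter list
print(starts_lambda(tokenize("(a**b) + 1")))    # ordinary expression
```

In an ANTLR 3 grammar the same decision would normally be left to a syntactic predicate or backtracking rather than a hand-written scan.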
@Pavel: It's not just that. It doesn't support generics. I'll probably write my own parser or grammar from scratch if I can't find a reasonably good C# 3.0 grammar.
Mehrdad Afshari
A: 

There isn't one, apart from bits in the ECMA docs that have been modified all over the place, and even that is insufficient. A language that is supposed to be open has enough quirks in 2.0 parsing alone that it isn't worth the bother.

That's why you see a lot of handmade parsers, and semantics will show up sooner or later.

Lambdas can easily target 2.0 (i.e. the same CLR 2.0 as always, at least before .NET 4.0) and often do, and all you have to do is get to the IL and the decompiled anonymous method (via the various tools out there). That's probably not what you're looking for either, especially as it requires multiple round trips.

Parsing proprietary LINQ (and for what, really), automatic properties (what a farce of a facility) and similar is total suicide and a waste of brain cells. If you really need something badly, best to talk to the JetBrains guys or stick to Mono.

If MS wants to know why people want control and less stupidity in not giving choices of inlining, deciding on struct heuristics for us, poor generics and 'using'-s non-DRYs, and much more, it's very simple (plenty of valid engineering and well documented reasons out there):

Reversibility / Machine-readability / Independence from next (5.0, 6.0, etc) idiot-waves and maintaining an investment in existing sources. That reads tech-neutral..

rama-jka toti
As I mentioned in a comment, it's mostly an experiment and fun project. It's not expected to have business value.
Mehrdad Afshari
Sure Mehrdad, but for us and many people we work with, it is of critical importance to be able to switch an OS, vendor, db, compiler, language, stack, etc. and not be locked in. Even if you are building tools for your own use, eventually you'll hit proprietary bits and start the maintenance headache. The message MS never picks up is that "open" means goodness for them, but they always "wonder" why people run away to alternatives after years of headaches in lock-in land. Your best bet is Mono, Coco/R, RSharp and friends, but frankly I wouldn't bother with any of it; been there.
rama-jka toti
@Majkara - dude, Mehrdad already said that the project was for experimentation and fun. Why so serious?
Kev
Mehrdad is cool and his rep says it all. Only did it because big bad MSDN is reading this and always 'wondering' and 'preaching' what we will be NOT able to do in the future.. locking, bloating and dumbing down the world really.
rama-jka toti
I think they probably ignore such.
JoshJordan
+3  A: 

Take a look at Coco/R; it seems that they have a language specification for C# 3.0.

ErvinS
+6  A: 

Take a look at the C# Language Specification. You'll find the grammar in chapter B, "Grammar".

Michael Damatov
Yeah, of course the spec contains grammar. However, the grammar in that *Word document* is scattered through the whole doc and is unsuitable for parser generation.
Mehrdad Afshari
It's not *only* scattered throughout; we have an appendix at the end with the whole thing in one place. You are probably right that it would take some modification to make it work for a parser generator.
Eric Lippert
Eric: Oh, I didn't notice that section. Thanks for pointing it out.
Mehrdad Afshari
Michael: It's less than 40 pages long. When I think about it, it's possible to deal with it and start from scratch. +1
Mehrdad Afshari
+2  A: 

I ran into ANTLR C# Grammar on CodePlex. It's a relatively new project and uses ANTLR 3.2. It says it supports C# 4.0 and is licensed under the Eclipse Public License (EPL).

I played with it a little. It has a bunch of test files containing expressions. It supports lambdas, unsafe context, ... as you'd naturally expect. It parses a C# file and hands you an abstract syntax tree. You can do whatever you want with it.

Mehrdad Afshari