views:

1643

answers:

8

Is it currently possible to translate C# code into an Abstract Syntax Tree?

Edit: some clarification; I don't necessarily expect the compiler to generate the AST for me - a parser would be fine, although I'd like to use something "official." Lambda expressions are unfortunately not going to be sufficient given they don't allow me to use statement bodies, which is what I'm looking for.

+3  A: 

Check out the .NET CodeDom support. There is an old article on CodeProject for a C# CodeDOM parser, but it won't support the new language features.

There is also supposed to be support in #develop for generating a CodeDom tree from C# source code according to this posting.

Rob Walker
A: 

Please see the R# project (sorry the docs are in Russian, but there are some code examples). It allows AST manipulations on C# code.

http://www.rsdn.ru/projects/rsharp/article/rsharp_mag.xml

Project's SVN is here: (URL updated, thanks, derigel)

Also please see the Nemerle language. It is a .NET language with strong support for metaprogramming.

Alexander Gladysh
The repository is now at http://svn.rsdn.ru/svn/RSharp/
derigel
+7  A: 

Is it currently possible to translate C# code into an Abstract Syntax Tree?

Yes, trivially in special circumstances (= using the new Expressions framework):

// Requires 'using System.Linq.Expressions;'
Expression<Func<int, int>> f = x => x * 2;

This creates an expression tree for the lambda, i.e. a function taking an int and returning its double. You can modify the expression tree using the Expressions framework (= the classes in that namespace) and then compile it at run-time:

var newBody = Expression.Add(f.Body, Expression.Constant(1));
f = Expression.Lambda<Func<int, int>>(newBody, f.Parameters);
var compiled = f.Compile();
Console.WriteLine(compiled(5)); // Result: 11

Note that all expressions are immutable, so they have to be built anew by composition. In this case, I've wrapped the original body in an addition of 1.

Note that these expression trees only cover real expressions, i.e. the kind of content found inside a C# function body. You can't get syntax trees for higher-level constructs such as classes this way; use the CodeDom framework for those.
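As a hedged aside on the statement-body limitation: the C# compiler will not convert a statement-bodied lambda into an expression tree, but from .NET 4.0 onwards the System.Linq.Expressions API itself has statement-like nodes (blocks, variables, assignments) that can be composed by hand. The following is a minimal sketch under that assumption; the class and method names (StatementTreeSketch, BuildDoubleThenIncrement) are made up for illustration and do not come from the answer above.

```csharp
using System;
using System.Linq.Expressions;

static class StatementTreeSketch
{
    // Hypothetical helper: builds, by hand, the tree for
    //   x => { int tmp = x * 2; return tmp + 1; }
    // which the compiler would refuse to turn into an Expression<...> itself.
    public static Func<int, int> BuildDoubleThenIncrement()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression tmp = Expression.Variable(typeof(int), "tmp");

        // A block declares its locals first; its value is the value of
        // the last expression it contains.
        BlockExpression body = Expression.Block(
            new[] { tmp },
            Expression.Assign(tmp, Expression.Multiply(x, Expression.Constant(2))),
            Expression.Add(tmp, Expression.Constant(1)));

        return Expression.Lambda<Func<int, int>>(body, x).Compile();
    }

    static void Main()
    {
        Func<int, int> f = BuildDoubleThenIncrement();
        Console.WriteLine(f(5)); // prints 11
    }
}
```

This only shows that such trees can be constructed and compiled; it does not parse existing C# source, so it is no substitute for a real parser.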

Konrad Rudolph
Erik accepted this? It uses the very lambda forms he said he didn't want.
Ira Baxter
Ira: you should pay attention to the development of the discussion. This entry was posted *before* Erik’s edit/clarification. Apparently, none of the other answers were better *at the time* (notice: *one year ago!*) so he didn’t accept another answer. Your answer is probably what he would have wanted.
Konrad Rudolph
+2  A: 

It looks like this sort of functionality will be included with whatever comes after C# 4, according to Anders Hejlsberg's 'Future of C#' PDC video.

Erik Forbes
This is helpful for seeing that C# doesn't offer a library for manipulating C# code. That's because its compiler is a classical one, a black box!
yeeen
A: 

The ANTLR Parser Generator has a grammar for C# 3.0 which covers everything except for LINQ syntax.

Erik Forbes
I've used ANTLR in the past, and it's quite nice. I haven't used the C# grammar, but most of the contributors there are pretty cluey.
Travis
+1  A: 

The C# front end for DMS parses full C# 3.0, including LINQ, and produces ASTs. DMS is in fact an ecosystem for analyzing and transforming source code using ASTs, for the languages it provides front ends for.

EDIT 3/10/2004: ... Now handles full C# 4.0

Ira Baxter
+1  A: 

ANTLR is not very useful. LINQ is not what you want.

Try Mono.Cecil! http://www.mono-project.com/Cecil

It is used in many projects, including NDepend! http://www.ndepend.com/

yeeen
A: 

I've just posted an answer on another thread here at Stack Overflow describing a solution in which I implemented an API to create and manipulate ASTs from C# source code.

Dinis Cruz