I'm looking for a set of classes (preferably in the .NET Framework) that will parse C# code and return a list of functions with their parameters, classes with their methods and properties, etc. Ideally it would provide everything needed to build my own IntelliSense.

I have a feeling something like this should be in the .NET Framework, given all the reflection stuff it offers, but if not, an open source alternative is good enough.

What I'm trying to build is basically something like Snippet Compiler, but with a twist. I'm trying to figure out how to get the code dom first.

I tried googling for this, but I'm not sure what the correct term is, so I came up empty.

Edit: Since I'm looking to use this for IntelliSense-like processing, actually compiling the code won't work, since the code will most likely be incomplete. Sorry, I should have mentioned that first.

+2  A: 

While .NET's CodeDom namespace defines the basic API for language parsers, the parsers themselves are not implemented. Visual Studio does this through its own language services, which are not available in the redistributable framework.

You could either...

  1. Compile the code then use reflection on the resulting assembly
  2. Look at something like the Mono C# compiler which creates these syntax trees. It won't be a high-level API like CodeDom but maybe you can work with it.

There may be something on CodePlex or a similar site.

UPDATE
See this related post. http://stackoverflow.com/questions/81406/parser-for-c

Josh Einstein
+1 for update too - it contains workable solutions
John K
A: 

Have a look at CSharpCodeProvider in the Microsoft.CSharp namespace. You can compile with it and access the resulting assembly through CompilerResults.CompiledAssembly. From that assembly you can get the types, and from each type you can get all property and method information using reflection.

Performance will be mediocre, since you need to recompile all the source code whenever something changes. I am not aware of any method that lets you incrementally compile snippets of code.
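
A minimal sketch of the compile-then-reflect approach, assuming the full .NET Framework (the CodeDom compiler is not expected to work on newer runtimes). The names CompileAndReflect and Describe are made up for illustration; note that it simply throws when the source does not compile, which is exactly the limitation for incomplete code.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Microsoft.CSharp;

static class CompileAndReflect
{
    // Hypothetical helper: compiles the source in memory and returns one line
    // per declared type, method and property found in the resulting assembly.
    public static IList<string> Describe(string source)
    {
        var lines = new List<string>();
        using (var provider = new CSharpCodeProvider())
        {
            var options = new CompilerParameters
            {
                GenerateInMemory = true,
                GenerateExecutable = false
            };
            options.ReferencedAssemblies.Add("System.dll");

            CompilerResults results = provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
            {
                // Incomplete or invalid code ends up here; there is no assembly to inspect.
                throw new InvalidOperationException(results.Errors[0].ToString());
            }

            const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                       BindingFlags.Static | BindingFlags.DeclaredOnly;

            foreach (Type type in results.CompiledAssembly.GetTypes())
            {
                lines.Add(type.FullName);

                // IsSpecialName filters out property accessors such as get_X / set_X.
                foreach (MethodInfo method in type.GetMethods(flags).Where(m => !m.IsSpecialName))
                {
                    string parameters = string.Join(", ",
                        method.GetParameters().Select(p => p.ParameterType.Name + " " + p.Name));
                    lines.Add("  " + method.ReturnType.Name + " " + method.Name + "(" + parameters + ")");
                }

                foreach (PropertyInfo property in type.GetProperties(flags))
                {
                    lines.Add("  " + property.PropertyType.Name + " " + property.Name);
                }
            }
        }
        return lines;
    }
}
```

Every call recompiles the whole source, which is where the performance cost mentioned above comes from.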

Igor Zevaka
A: 
jrista
This is not implemented by any of the code dom providers and throws a NotImplementedException.
Josh Einstein
@Josh: Seems you are correct. I just tried, and it does indeed fail. Such a bummer.
jrista
+1  A: 

If you need it to work on incomplete code, or code with errors in it, then I believe you're pretty much on your own (that is, you won't be able to use the CSharpCodeProvider class or anything like that).

There are tools like ReSharper that do their own parsing, but they're proprietary. You might be able to start with the Mono compiler, but in my experience, writing a parser that works on incomplete code is a whole different ballgame from writing one that's just supposed to spit out errors on incomplete code.

If you just need the names of classes and methods (metadata, basically) then you might be able to do the parsing "by hand", but I guess it depends on how accurate you need the results to be.
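
A rough sketch of what scanning "by hand" could look like, using regular expressions and nothing else. Everything here (NaiveScanner, Scan, the patterns, the keyword list) is made up for illustration. It deliberately ignores comments, string literals, nested parentheses in default values, and most of the trouble generics cause, so treat it as a starting point for how accurate the results need to be, not as a parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class NaiveScanner
{
    // "class Foo", "struct Foo" or "interface Foo"; captures the bare name only.
    static readonly Regex TypePattern = new Regex(
        @"\b(?:class|struct|interface)\s+(?<name>[A-Za-z_]\w*)");

    // A type-like token, a name, then a parameter list that is either closed and
    // followed by "{" or "=>", or left unfinished at the very end of the input.
    static readonly Regex MethodPattern = new Regex(
        @"(?<prev>[\w>\]?]+)\s+(?<name>[A-Za-z_]\w*)\s*\((?<args>[^(){};]*)(?:\)\s*(?:\{|=>)|\)?\s*\z)");

    // Words that can appear in front of "(...)" without declaring a method.
    static readonly HashSet<string> Keywords = new HashSet<string>
    {
        "return", "new", "else", "if", "while", "for", "foreach", "switch",
        "using", "lock", "catch", "throw", "await"
    };

    // Returns "type X" entries followed by "method X(args)" entries.
    // Constructors are reported as methods, since they look the same to the pattern.
    public static List<string> Scan(string source)
    {
        var found = new List<string>();

        foreach (Match m in TypePattern.Matches(source))
            found.Add("type " + m.Groups["name"].Value);

        foreach (Match m in MethodPattern.Matches(source))
        {
            string prev = m.Groups["prev"].Value;
            string name = m.Groups["name"].Value;
            if (Keywords.Contains(prev) || Keywords.Contains(name))
                continue; // control flow or an expression, not a declaration

            found.Add("method " + name + "(" + m.Groups["args"].Value.Trim() + ")");
        }
        return found;
    }
}
```

Because nothing is compiled, this keeps working when the input stops mid-statement, at the cost of false positives and misses on anything the patterns did not anticipate.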

Dean Harding
Yeah, I'm beginning to consider parsing it by hand. Not sure how difficult this will be with generics, though.
Blindy
A: 

The Mono project's gmcs compiler contains a fairly reusable parser for C# 4.0. It is also relatively easy to write your own parser that suits your specific needs. For example, you can reuse this: http://antlrcsharp.codeplex.com/

SK-logic
The problem with these ready-made parsers is that they won't work for incomplete (and thus invalid) code. Their purpose is to create a syntax tree detailed enough to generate code, not to provide data for IntelliSense.
Blindy
Yep. But since they are reusable, one can easily tweak them. ANTLR can be used for partial parsing. Of course, the most generic option is PEG, so if you can get hold of a decent PEG implementation for .NET and can port an existing parser (say, an ANTLR one), you'll get a quick and easy generic solution. For example, a Packrat parser from http://www.meta-alternative.net/mbase.html is capable of generating syntax highlighting modes for a text editor out of any generic syntax, and it works well with incomplete or invalid input.
SK-logic