Hi Friends,

I want to write code that searches a C# file for method definitions and method calls.

So my pattern should match text like:

1. public void xyz(blahtype blahvalue);

2. string construct = SearchData(blahvalue);

Has anyone done something similar? Would a regex help in this case? If so, please give me the pattern. Are there any other approaches? I don't know reflection (would it help in my case?).


Thanks, guys, for giving it a try; I did not know this would be so complex. All I wanted to do was this: suppose I have methods like

    public void method1(int val)
    {
        method2(val);
        method3();
    }

    void method2(int val2)
    {
        method4();
    }

I wanted to construct strings such as method1:method2:method4 and method1:method3.

I guess it's really complex.

+2  A: 

With reflection you can load an assembly and find out which methods it contains, so that sounds suitable for what you need, unless I've misunderstood the question and you want to look in the source files.

First load the assembly, then get its types, and then get the methods of each type.

Type.GetMethods
http://msdn.microsoft.com/en-us/library/4d848zkb.aspx

Assembly.GetTypes
http://msdn.microsoft.com/en-us/library/system.reflection.assembly.gettypes.aspx
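
A minimal sketch of those steps, assuming the code is already compiled into an assembly. The class name `MethodLister` and the helper `ListMethods` are made up for illustration; only `Assembly.GetTypes` and `Type.GetMethods` come from the links above. Note that this lists method definitions only, not the calls made inside them.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MethodLister
{
    // Hypothetical helper: returns "TypeName.MethodName" for every method
    // declared directly on each type in the given assembly.
    public static string[] ListMethods(Assembly assembly)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.NonPublic |
            BindingFlags.Instance | BindingFlags.Static |
            BindingFlags.DeclaredOnly; // skip members inherited from base types

        return assembly.GetTypes()
            .SelectMany(type => type.GetMethods(flags)
                .Select(method => type.Name + "." + method.Name))
            .ToArray();
    }

    public static void Main()
    {
        // Inspects the assembly containing this class. To inspect another
        // assembly, Assembly.LoadFrom(path) could be used instead.
        foreach (string name in ListMethods(typeof(MethodLister).Assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

Compiler-generated types and methods may show up in the output as well, so some filtering is probably needed in practice.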

ho
Please give me a regex solution :(
sajad
You **cannot** search for a method in compiled code using a RegEx. So I do not think anyone will be able to provide a "RegEx solution". I am sorry.
scherand
Sorry for all this confusion. I am reading through a file, not an assembly.
sajad
still.. you cannot get method names!
Abdel Olakara
@Abdel Olakara: Why should this not be possible? Sajad is looking through source code, so the method names are in there. Even the compiled assembly contains all method names in plaintext...
Jens
@Jens: I'm sure it's possible to figure this out somehow, since the C# compiler does it, but I think it's quite complicated. Your answer below will pick up `if` and `for` statements, for example, and as you said, strings or comments might be picked up too.
ho1
@Jens: I thought he was parsing a compiled file, and I didn't know that even compiled files have method names in plain text. Thanks for the info.
Abdel Olakara
+1  A: 

If you want to search code files using regex, try this one:

(?<=^|;|\{|\})[^;{}]*\s([\w.]+)\s*\([^;{}]*

It should match every line with a method definition or call, and have the name of the method in its first capturing group. Try RegExr to look at it in action.

This regex relies on the assumption that method calls and definitions are the only things followed by an opening parenthesis. This is not true for string content, of course, so strings with parentheses in them, or comments, will cause this expression to report false positives that you will have to filter out manually.

Edit: As "ho" pointed out in the comment to another answer, this regex will of course pick up if and for. More filtering, I guess. =)
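
Here is a rough sketch of how the pattern and the extra filtering could be wired together in C#. The class name `MethodNameScanner`, the helper `FindNames`, and the keyword list are my own illustration and certainly not complete; strings and comments are still not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MethodNameScanner
{
    // The pattern from above, unchanged. Group 1 holds the candidate name.
    private static readonly Regex Candidate =
        new Regex(@"(?<=^|;|\{|\})[^;{}]*\s([\w.]+)\s*\([^;{}]*");

    // Illustrative (incomplete) list of keywords that can precede "(".
    private static readonly HashSet<string> Keywords = new HashSet<string>
    {
        "if", "for", "foreach", "while", "switch",
        "using", "lock", "catch", "return"
    };

    // Hypothetical helper: returns candidate method names in source order,
    // with the keywords above filtered out.
    public static List<string> FindNames(string source)
    {
        var names = new List<string>();
        foreach (Match match in Candidate.Matches(source))
        {
            string name = match.Groups[1].Value;
            if (!Keywords.Contains(name))
            {
                names.Add(name);
            }
        }
        return names;
    }

    public static void Main()
    {
        string source =
            "void method1(int val)\n" +
            "{\n" +
            "    method2(val);\n" +
            "    if (val > 0)\n" +
            "    {\n" +
            "        method3();\n" +
            "    }\n" +
            "}";

        // Prints method1, method2 and method3; "if" is matched but filtered.
        foreach (string name in FindNames(source))
        {
            Console.WriteLine(name);
        }
    }
}
```

Definitions and calls come out mixed together, so telling them apart would need yet another step.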

Jens
+1  A: 

If the code compiles, you could compile it at runtime and use reflection to obtain the method definitions. Obtaining the method calls will be a bit trickier because you have to analyze the IL code of all methods. As far as I know there is no good support for this type of task built into the framework, but you can use a library like Cecil to simplify the job.

As for regular expressions, I am not sure they are powerful enough. Matching method definitions seems to be the easy part, but even this is non-trivial. I tried to write an example expression but gave up.

There are many modifiers, and not every combination of them is valid (my attempt also allows invalid combinations). The modifiers are followed by the return type. This seems simple at first glance, but it is not: the return type may be a generic type with arbitrarily many and arbitrarily deeply nested type arguments. My attempt does not allow generics at all.

The method name should be quite easy, but my attempt is currently not correct: the name must not start with a digit, you can use @ in method names, and there are probably more points missing. Then comes the parameter list, where there may be generic types again as well as the modifiers ref and out. Finally there may be generic type constraints, and not to forget pointer types in unsafe contexts.

So I really doubt you should do it with regular expressions unless you are only interested in a rough estimate or very basic cases. Because languages with matched nested brackets are not regular languages, and generic type names may contain matched nested angle brackets, it is not possible to identify only correct method definitions without extensions to regular expressions. And this was only the comparatively simple method definition; method invocations will be a lot more complex.

 ((public|protected|internal|private|static|abstract|sealed|extern|override|new|virtual)\s+)*[a-zA-Z0-9_]+\s+[a-zA-Z0-9_]+\s*\(.*\)
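
To make the limitations concrete, here is a small sketch that runs the expression above against a few sample lines. The class name `DefinitionPatternDemo` and the helper `LooksLikeDefinition` are invented for this example; the pattern itself is the unmodified attempt.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DefinitionPatternDemo
{
    // The attempt from above, unchanged and unanchored.
    private static readonly Regex Definition = new Regex(
        @"((public|protected|internal|private|static|abstract|sealed|extern|override|new|virtual)\s+)*" +
        @"[a-zA-Z0-9_]+\s+[a-zA-Z0-9_]+\s*\(.*\)");

    // Hypothetical helper: true if the line contains something the
    // pattern considers a method definition.
    public static bool LooksLikeDefinition(string line)
    {
        return Definition.IsMatch(line);
    }

    public static void Main()
    {
        string[] samples =
        {
            "public static void Save(string path)", // plain definition: matched
            "public List<int> GetItems()",          // generic return type: missed
            "return Compute(x);"                    // a call, not a definition: matched anyway
        };

        foreach (string line in samples)
        {
            Console.WriteLine(LooksLikeDefinition(line) + "\t" + line);
        }
    }
}
```

The third sample shows that the expression is not only too strict (generics) but also too lenient, since any "word word(...)" sequence passes.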
Daniel Brückner