Dear all, I have a regular expression problem, and I suspect I'm missing something about how regexes actually work.

I have a set of strings that contain method definitions:

  • myMethod1()
  • myMethod2(argument1 arg1)
  • myMethod3(argument1 arg1, argument2 arg2)

but some of them also contain the output type:

  • myOtherMethod1() : type1
  • myOtherMethod2(argument1 arg1) : type2
  • myOtherMethod3(argument1 arg1, argument2 arg2) : whatever

I want all of them to look like the first group, i.e. with the output type stripped. I put on my regex hat and came up with a conditional regex:

(?([:]+)(.+(?=\s:))|(.+))

The idea: if the string contains a ":" character, take whatever comes before the " :"; if not, take everything. In theory this is correct, but it returns the whole line. If I change the regex to

(?([:]*)(.+(?=\s:))|(.+))

then the second kind of method is matched correctly, but not the first kind (strange...). Can you explain where my mistake is?

Thank you very much,

+1  A: 

There's no need to handle it like this. Just take everything up to the first right parenthesis you encounter:

/^[^)]*\)/

Unless I'm misunderstanding your problem...
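
As a minimal sketch of how this pattern behaves on the strings from the question, here it is in Python (the choice of Python and the names SIGNATURE, samples and stripped are assumptions for illustration; the original post does not name a language):

```python
import re

# Match from the start of the string up to and including the first ")".
SIGNATURE = re.compile(r"^[^)]*\)")

samples = [
    "myMethod1()",
    "myMethod2(argument1 arg1)",
    "myOtherMethod1() : type1",
    "myOtherMethod3(argument1 arg1, argument2 arg2) : whatever",
]

# Keep only the matched part of each line; the " : type" suffix is never consumed.
stripped = [SIGNATURE.match(s).group(0) for s in samples]
for line in stripped:
    print(line)
```

Lines without an output type come back unchanged, so both kinds of input can go through the same pattern.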

strager
Actually your solution works great; my approach to the problem was not a good one. Thanks! Still, I don't understand why my regex doesn't work. In theory it should, right?
Srodriguez
No. I think you may want `.*` before the conditional statement (i.e. `(?(.*[:]*)...`).
strager
surely you don't need `[:]`, as this is just the same as `:`
David Kemp
`(?(.*:+)(.+(?=\s:))|(.+))` (the modified version applying both comments) didn't work either. It produced two matches: e.g. "myOtherMethod1() : type1" gave "myOtherMethod1()" and " type1".
Srodriguez
@Srodriguez, Precisely as it should. Remove the last group (`|(.+)`) to not match the ` type1` part.
strager
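
One way to reproduce the behaviour discussed in these comments is sketched below. It assumes the asker's engine treats the condition in `(?(expr)yes|no)` as a zero-width lookahead at the current position, as .NET does. Python's `re` does not support that form, so the sketch emulates it as `(?:(?=expr)yes|(?!expr)no)`. The helper names emulate and matches are hypothetical.

```python
import re

def emulate(condition):
    # Emulates (?(condition)(.+(?=\s:))|(.+)) with explicit lookaheads,
    # assuming the condition is tested at the current position only.
    return re.compile(rf"(?:(?={condition}).+(?=\s:)|(?!{condition}).+)")

def matches(condition, text):
    return [m.group(0) for m in emulate(condition).finditer(text)]

plain = "myMethod1()"
typed = "myOtherMethod1() : type1"

# [:]+ is tested at position 0, where there is no ":", so the "no" branch takes the whole line.
print(matches("[:]+", typed))   # ['myOtherMethod1() : type1']

# [:]* can match the empty string, so the condition is always true and only the "yes" branch runs.
print(matches("[:]*", typed))   # ['myOtherMethod1()']
print(matches("[:]*", plain))   # [] because there is no " :" to stop at

# .*:+ looks ahead across the line, but after the first match the leftover " type1"
# has no ":" ahead of it, so the "no" branch matches it as a second result.
print(matches(".*:+", typed))   # ['myOtherMethod1()', ' type1']
```

Under that assumption, the condition never inspects the whole string unless it is written to scan ahead, which would account for each of the results reported above.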
A: 

I don't know what the problem is with your regex. I'd use a simpler regex and just match what you want:

^.*\)

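This matches the start of the line, followed by any number of characters, followed by a ). Because .* is greedy, the match runs to the last ) on the line, which for these inputs is the closing parenthesis of the argument list. The fact that there may be text after the ) is irrelevant.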
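
A short Python sketch of the difference between this greedy pattern and a negated character class follows. The string in tricky is a hypothetical input that does not appear in the question; it is there only to show where the two patterns diverge.

```python
import re

greedy = re.compile(r"^.*\)")      # runs to the LAST ")" on the line
negated = re.compile(r"^[^)]*\)")  # stops at the FIRST ")"

simple = "myOtherMethod2(argument1 arg1) : type2"
tricky = "myOtherMethod4() : list(int)"  # hypothetical: the output type contains parentheses

greedy_simple = greedy.match(simple).group(0)
greedy_tricky = greedy.match(tricky).group(0)
negated_tricky = negated.match(tricky).group(0)

print(greedy_simple)   # myOtherMethod2(argument1 arg1)
print(greedy_tricky)   # myOtherMethod4() : list(int)
print(negated_tricky)  # myOtherMethod4()
```

For the inputs listed in the question the two patterns give the same result; they differ only when a ) can appear after the argument list.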

Benedict Cohen
A: 

I came up with ^(.*?\)).*$

This matches the start of the string (^), followed by any characters up to and including the first close parenthesis: .*? is a lazy match of any character, so it stops at the first ), and the backslash in \) is required to escape the parenthesis. The rest of the line is consumed by .*$ so that a replacement discards it. Make sure you use the multi-line option, or else you'll only match the first part. The extra parentheses form the capturing group, so you can use $1 as the replacement.
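
The replacement described above might look like this in Python. This is a sketch under assumptions: Python writes the group reference as \1 rather than $1, re.MULTILINE stands in for the multi-line option, and the names text and cleaned are made up for the example.

```python
import re

text = (
    "myMethod1()\n"
    "myOtherMethod2(argument1 arg1) : type2\n"
    "myOtherMethod3(argument1 arg1, argument2 arg2) : whatever"
)

# Group 1 captures up to the first ")"; ".*$" consumes the rest of each line.
# With re.MULTILINE, ^ and $ anchor at every line instead of only at the ends of the string.
cleaned = re.sub(r"^(.*?\)).*$", r"\1", text, flags=re.MULTILINE)
print(cleaned)
```

Each line is replaced by its captured group, so lines that never had an output type are left as they were.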

David Kemp