Hello,

This is my first experience with C# and part of my limited experience with regular expressions and I'm having trouble capturing the first occurrence of a match in a particular expression. I believe the following example would make it more clear than words in describing what I want to do.

Match extractor = (new Regex(@".*\d(?<name>.*)\d.*")).Match("This hopefully will pick up 1Bob9error1 as a name");
Console.WriteLine(extractor.Groups["name"]);

I would like this expression to print "Bob" instead of "error".

I have a hunch it has something to do with the ? in front of the matching group, but I'm not exactly sure what operation the ? performs in this particular case. An explanation along with some help would be wonderful.

Thanks guys, you have no idea how much this site helps a beginning programmer like me.

+2  A: 

Matches the preceding element zero or one time. It is equivalent to {0,1}. ? is a greedy quantifier whose non-greedy equivalent is ??.

Taken from here. Site includes a cheat-sheet for regular expressions, and looking at your expression I can't seem to figure out what may be wrong with it.

My assumption is that it might be matching the last occurrence of your expression.

Anthony Forloney
I could be wrong, but I don't think the ? is a quantifier in this case. Don't quantifiers usually follow an expression?
Chad
There is nothing wrong with the expression, he's just looking at the second value that it is capturing instead of the first.
Rory
@Chad The ? in this context indicates the start of a named group (more specifically the one named "name"). In most other places it is used as a quantifier.
Rory
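To make the two roles of ? concrete, here is a minimal sketch. The class name, method names, and sample strings are made up for illustration and are not from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuestionMarkDemo
{
    // ? as a quantifier: the "u" before it may appear zero or one time.
    public static bool IsColorWord(string input)
    {
        return Regex.IsMatch(input, @"^colou?r$");
    }

    // (?<number>...) opens a named group; the ? here is part of the group
    // syntax and does not make anything optional.
    public static string ExtractNumber(string input)
    {
        return Regex.Match(input, @"id=(?<number>\d+)").Groups["number"].Value;
    }

    public static void Main()
    {
        Console.WriteLine(IsColorWord("color"));    // True
        Console.WriteLine(IsColorWord("colour"));   // True
        Console.WriteLine(ExtractNumber("id=42"));  // 42
    }
}
```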
+1  A: 

Each Group item has a Captures collection, you can access the first capture for a group using:

extractor.Groups["name"].Captures[0]
Rory
P.S. Each Capture item has a Value property which returns the actual string value of the capture, there are also some other useful properties like the index at which the capture starts in the original string and the length of the capture. When in doubt, hit F1.
Rory
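As a rough illustration of the Capture properties mentioned above, the sketch below runs the pattern from the question unchanged and inspects the group's only capture. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Runs the original pattern from the question and returns the "name" group.
    public static Group NameGroup(string input)
    {
        return Regex.Match(input, @".*\d(?<name>.*)\d.*").Groups["name"];
    }

    public static void Main()
    {
        Group g = NameGroup("This hopefully will pick up 1Bob9error1 as a name");

        // The group is not inside a quantifier, so it is captured exactly once.
        Console.WriteLine(g.Captures.Count);   // 1

        // Captures[0] throws if the group did not participate in the match.
        Capture c = g.Captures[0];
        Console.WriteLine(c.Value);    // error
        Console.WriteLine(c.Index);    // 33
        Console.WriteLine(c.Length);   // 5
    }
}
```

Because there is only one capture, Captures[0] returns the same text as the group itself; the Captures collection holds more than one entry only when the group sits inside a repeated part of the pattern.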
Hmmmm...helpful information, but Captures[0] still picks up "error". Is something wrong with my regular expression?
Chad
+6  A: 

Your problem is greed. Regex greediness, that is. Your .* at the start grabs all of "This hopefully will pick up 1Bob", so the first \d ends up matching the 9. Try this regex instead:

\d(?<name>[^\d]+)\d
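A small sketch that runs both patterns against the string from the question, so the effect of the leading .* is visible side by side. ExtractName and GreedyDemo are hypothetical names used only for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Returns the "name" group of the first match, or "" if nothing matched.
    public static string ExtractName(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups["name"].Value;
    }

    public static void Main()
    {
        string input = "This hopefully will pick up 1Bob9error1 as a name";

        // Original pattern: the leading greedy .* consumes as much as it can
        // while still leaving two digits for the rest of the pattern.
        Console.WriteLine(ExtractName(@".*\d(?<name>.*)\d.*", input));  // error

        // Suggested pattern: no leading .*, and the group cannot cross a digit.
        Console.WriteLine(ExtractName(@"\d(?<name>[^\d]+)\d", input));  // Bob
    }
}
```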
nickyt
Chad, I'd recommend you install RegExWorkbench, http://code.msdn.microsoft.com/RegexWorkbench . It's an old project, but an awesome one, made by Eric Gunnerson. If you do not have the .NET 1.x framework installed, he provides the source so that you can compile it using the framework you have installed.
nickyt
That actually picked up "Bob". Awesome. I think my problem was that I was treating this regular expression as if it had to match the whole string, when really I was only asking if there was a match in it. Thanks a bunch.
Chad
You have to remember that with regexes, . and * can be very evil. When you craft your regex, you really need to know what you want to find and only that. It takes time, but once you do that consistently, you'll end up with rock solid regexes. I'd also recommend this site, http://www.regular-expressions.info and that you read this book, Mastering Regular Expressions, http://regex.info .
nickyt
+1  A: 

The bracketing .* sequences around your expression are causing your trouble. Remember you don't need a regular expression that matches the entire string - you want it to match only a particular pattern when it appears. The following code works:

    Regex pattern = new Regex(@"\d(?<name>.*?)\d");
    MatchCollection matches = pattern.Matches("This hopefully will pick up 1Bob9error1 as a name");
    Console.WriteLine(matches[0].Groups["name"]);
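For comparison, here is a hedged sketch of what the lazy .*? changes once the bracketing .* sequences are gone. FirstName and LazyDemo are hypothetical names for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LazyDemo
{
    // Returns the "name" group of the first match for the given pattern.
    public static string FirstName(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups["name"].Value;
    }

    public static void Main()
    {
        string input = "This hopefully will pick up 1Bob9error1 as a name";

        // Greedy group: runs on to the last digit it can still end at.
        Console.WriteLine(FirstName(@"\d(?<name>.*)\d", input));   // Bob9error

        // Lazy group: stops at the first digit that lets the match succeed.
        Console.WriteLine(FirstName(@"\d(?<name>.*?)\d", input));  // Bob
    }
}
```

Dropping the bracketing .* is what moves the start of the match to the first digit; the lazy quantifier then keeps the group from running past the next one.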
John Christensen