Hi guys:

Could you explain why Console.WriteLine(m.Groups[1]); prints "Contoso, Inc."? And could you also walk through the matching steps of this example? Thanks.

PS: I'm not familiar with the concept of groups.

string input = "Company Name: Contoso, Inc."; 

Match m = Regex.Match(input, @"Company Name: (.*$)"); 

Console.WriteLine(m.Groups[1]);
+1  A: 

The docs cover groups in full, but in short: each pair of parentheses in the pattern creates a capturing group. Groups[0] is the text matched by the whole pattern, and Groups[i] (where i > 0) is the text captured by the i-th pair of parentheses. You can also give groups names.
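To make that concrete, here is a minimal sketch using the pattern from the question. The helper name Capture and the group name "company" are made up for this illustration; the (?<name>...) syntax is the standard .NET way to name a group.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupBasics
{
    // Hypothetical helper: returns the text of the requested group,
    // or an empty string if the pattern did not match at all.
    public static string Capture(string input, string pattern, int group)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[group].Value : "";
    }

    public static void Main()
    {
        string input = "Company Name: Contoso, Inc.";
        string pattern = @"Company Name: (.*$)";

        // Groups[0] is everything the whole pattern matched.
        Console.WriteLine(Capture(input, pattern, 0)); // Company Name: Contoso, Inc.

        // Groups[1] is what the first pair of parentheses captured.
        Console.WriteLine(Capture(input, pattern, 1)); // Contoso, Inc.

        // The same capture, but addressed by a name instead of a number.
        Match named = Regex.Match(input, @"Company Name: (?<company>.*$)");
        Console.WriteLine(named.Groups["company"].Value); // Contoso, Inc.
    }
}
```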

Timores
+2  A: 

I'm not familiar with the concept of groups

A group is a part of the regular expression whose matched text is saved during matching. To declare a group, you put that part of the expression between parentheses. Whatever it matches is then saved inside a group.

Groups are numbered (although they can also be given explicit names) from left to right by their opening parenthesis, so an outer group comes before the groups nested inside it. The “zero-th” group is the whole match.

In your case, you print the first group, which is just the last part of the string, i.e. everything after “Company Name: ” up until the end of the line.
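The numbering rule is easier to see with nested parentheses. The pattern below is not from the question; it is a variation invented for this sketch that wraps the label in an outer group with two inner groups. The helper name Captures is also hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupNumbering
{
    // Hypothetical helper: copies the text of every group of the first match into an array.
    public static string[] Captures(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        string[] result = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            result[i] = m.Groups[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // Opening parentheses, left to right:
        //   1: ((\w+) (\w+))   outer group
        //   2:  (\w+)          first inner group
        //   3:        (\w+)    second inner group
        //   4: (.*)            everything after ": "
        string[] groups = Captures("Company Name: Contoso, Inc.", @"((\w+) (\w+)): (.*)$");

        for (int i = 0; i < groups.Length; i++)
        {
            Console.WriteLine(i + ": " + groups[i]);
        }
        // 0: Company Name: Contoso, Inc.
        // 1: Company Name
        // 2: Company
        // 3: Name
        // 4: Contoso, Inc.
    }
}
```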

Konrad Rudolph
Can you elaborate more on the matching steps of this example? Thanks.
Ricky
The pattern first matches the literal text "Company Name: ", then captures all remaining characters into group number 1, because ".*$" matches everything up to the end of the string.
wRAR
+1  A: 

Groups are denoted by a pair of parentheses. Basically your regex is saying: in order to match, the input string has to contain Company Name: followed by any characters (the .*) up to the end of the string (the $). Since .*$ is in parentheses, you have said that you want to capture that part of the match. Remember that you could have more groups. The entire match is always Groups[0] (if the pattern matches, that is), which is why the text captured by (.*$) is in Groups[1].
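One detail worth checking for yourself is that Groups[0] holds the matched text, which is not necessarily the whole input. A small sketch, assuming an input with some extra text in front (the "Vendor record. " prefix and the helper name WholeMatch are made up for the example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeMatchDemo
{
    // Hypothetical helper: returns Groups[0] of the first match, or "" if there is none.
    public static string WholeMatch(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[0].Value : "";
    }

    public static void Main()
    {
        string input = "Vendor record. Company Name: Contoso, Inc.";
        Match m = Regex.Match(input, @"Company Name: (.*$)");

        // The match starts where "Company Name: " is found, not at index 0.
        Console.WriteLine(m.Index);           // 15

        // Groups[0] is the matched text only; the prefix is not part of it.
        Console.WriteLine(m.Groups[0].Value); // Company Name: Contoso, Inc.
        Console.WriteLine(m.Groups[1].Value); // Contoso, Inc.
    }
}
```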

klausbyskov
+1  A: 

I'm studying Regular Expressions right now, so this is some good practice for me :)

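The regular expression you use is "Company Name: (.*$)", and since you pass no RegexOptions, the default RegexOptions.None applies: "." matches any character except a newline, and "$" matches at the end of the string (or just before a final trailing newline).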

"Company Name: " will be matched starting from anywhere in the string. (If you had used "^Company Name: ", you would have said that "Company Name: " has to be the first part of the string; ^ = beginning of the string.)

(.*$) is an unnamed group.

Inside this group you are matching ".*$", which translates into: any character ".", taken 0 or more times "*", until the end of the string "$".
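If you want to see how the anchors and options behave, a rough experiment like the following may help. The two test inputs and the helper name Company are invented for this sketch; RegexOptions.None, Multiline and Singleline are the real .NET option names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorsAndOptions
{
    const string Pattern = @"Company Name: (.*$)";

    // Hypothetical helper: returns group 1, or "<no match>" when the pattern fails.
    public static string Company(string input, string pattern, RegexOptions options)
    {
        Match m = Regex.Match(input, pattern, options);
        return m.Success ? m.Groups[1].Value : "<no match>";
    }

    public static void Main()
    {
        string prefixed = "Vendor record. Company Name: Contoso, Inc.";
        string twoLines = "Company Name: Contoso, Inc.\nCity: Seattle";

        // Without ^ the literal text may start anywhere in the input.
        Console.WriteLine(Company(prefixed, Pattern, RegexOptions.None));       // Contoso, Inc.

        // With ^ it must start at the beginning of the string.
        Console.WriteLine(Company(prefixed, "^" + Pattern, RegexOptions.None)); // <no match>

        // No options: "." stops at the newline and "$" is not satisfied there.
        Console.WriteLine(Company(twoLines, Pattern, RegexOptions.None));       // <no match>

        // Multiline: "$" also matches at the end of each line.
        Console.WriteLine(Company(twoLines, Pattern, RegexOptions.Multiline));  // Contoso, Inc.

        // Singleline: "." also matches the newline, so the capture spans both lines.
        Console.WriteLine(Company(twoLines, Pattern, RegexOptions.Singleline));
    }
}
```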

Clear? :)

Ando
`(?.*$)` is not a valid expression.
Tim Pietzcker
You are right - I removed the remark regarding (?.*$)
Ando
+1  A: 

If you are interested in learning the ins and outs of Regular Expressions (along with how the different implementations vary based on language and platform), then I definitely recommend Mastering Regular Expressions from O'Reilly.

Nick