Hello

Can someone explain to me why the result of the following statement has a count of two and not just one?

MatchCollection matches = new Regex( ".*" ).Matches( "foo" ) ;
Assert.AreEqual( 1, matches.Count ) ; // will fail!

new Regex( ".+" ).Matches( "foo" ) ; // returns one match (as expected)
new Regex( ".*" ).Matches( "" ) ; // also returns one match 

(I'm using C# on .NET 3.5.)

A: 

Include '^' to anchor your matching expression at the start of the input string.

MatchCollection matches = new Regex( "^.*" ).Matches( "foo" ) ;
Steve Townsend
This is a workaround, but I still don't understand why the original .* regex leads to two matches.
miasbeck
I guess I should have clarified your intent. What are you trying to do?
Steve Townsend
Never mind. The ^ anchor is okay.
miasbeck
+4  A: 

The expression ".*" matches "foo" at the start of the string, and an empty string at the end (position 3). Remember, * means "zero or more", so it matches "nothing" at the end of the string.

This is consistent. Regex.Match(string.Empty, ".*"); returns one match: an empty string.
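
To see where each match starts and how long it is, a small sketch like the following may help. The class name EmptyMatchDemo and the helper Describe are hypothetical names made up for this illustration; only Regex.Matches and the Match properties Index, Length and Value come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class EmptyMatchDemo
{
    // Hypothetical helper: lists every match as "index:length:value".
    public static List<string> Describe(string input, string pattern)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            result.Add(m.Index + ":" + m.Length + ":" + m.Value);
        }
        return result;
    }

    static void Main()
    {
        // Compare the three patterns discussed in this thread on the same input.
        foreach (string pattern in new[] { ".*", ".+", "^.*" })
        {
            string[] matches = Describe("foo", pattern).ToArray();
            Console.WriteLine("{0} -> {1}", pattern, string.Join(" | ", matches));
        }
    }
}
```

For ".*" this should list a match of length 3 at index 0 followed by a match of length 0 at index 3, while ".+" and "^.*" should each list only the first one: ".+" needs at least one character, and "^" cannot match at position 3.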

Jim Mischel
But the * is greedy, so I assumed there was nothing left to match (not even an empty string) after the first match. I could also argue that there are an infinite number of empty strings at the end of the input, not just one. In either case, I'm puzzled, and I have to rewrite some of my unit tests to reflect this behavior.
miasbeck