In the simplified example below, there are two regular expressions, one case-sensitive and the other not. The goal is to efficiently build a single IEnumerable<Match> (see "combined" below) that combines the results of both.

string test = "abcABC";
string regex = "(?<grpa>a)|(?<grpb>b)|(?<grpc>c)";
Regex regNoCase = new Regex(regex, RegexOptions.IgnoreCase);
Regex regCase = new Regex(regex);

MatchCollection matchNoCase = regNoCase.Matches(test);
MatchCollection matchCase = regCase.Matches(test);

//Combine matchNoCase and matchCase into an IEnumerable
//(placeholder: this assignment is the part I need help with)
IEnumerable<Match> combined = null;
foreach (Match match in combined)
{
    //Use the Index and (successful) Groups properties 
    //of the match in another operation

}

In practice, the MatchCollections might contain thousands of results, and the code runs frequently against long, dynamically created regexes, so I'd like to avoid copying the results to arrays or similar. I am still learning LINQ and am unsure how to combine the collections, or how much the combination would slow down an already sluggish process.

A:

There are three steps here:

  1. Convert each MatchCollection to an IEnumerable<Match>
  2. Union the two sequences
  3. Keep only the matches whose Match.Success property is true

Code:

IEnumerable<Match> combined = matchNoCase.OfType<Match>().Union(matchCase.OfType<Match>()).Where(m => m.Success);

This builds a lazily evaluated query: nothing runs until you enumerate combined, and each step executes only as the next result is requested, so each collection is enumerated at most once. For example, Union() only starts enumerating the second sequence after the first is exhausted.
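As a minimal sketch of the same idea, the helper below wraps the conversion and combination in one method. The class name MatchCombiner and the method Combine are hypothetical names chosen for illustration, not part of the original code. It also swaps in Concat() for Union(), on the assumption that no de-duplication is wanted: Union() is lazy as well, but it has to remember every element it has yielded in order to skip duplicates, and two separate MatchCollections hold distinct Match objects anyway.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchCombiner
{
    // Hypothetical helper (not from the original post): lazily chains two
    // MatchCollections into one sequence without copying them to an array.
    public static IEnumerable<Match> Combine(MatchCollection first, MatchCollection second)
    {
        // Cast<Match>() adapts the collection to IEnumerable<Match>.
        // Concat() yields everything from 'first', then everything from
        // 'second'; nothing is enumerated until the caller iterates.
        return first.Cast<Match>().Concat(second.Cast<Match>());
    }
}
```

Calling MatchCombiner.Combine(matchNoCase, matchCase) with the input "abcABC" and the pattern (?<grpa>a)|(?<grpb>b)|(?<grpc>c) should yield nine matches: six from the case-insensitive regex followed by three from the case-sensitive one. The Index and Groups properties of each Match are available inside the foreach loop as before.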

Rex M