



I need a C# string search algorithm that can match multiple occurrences of a pattern. For example, if the pattern is 'AA' and the string is 'BAAABBB', Regex produces a match at Index = 1, but I need Index = 1,2. Can I force Regex to give such a result?


Regex.Matches returns every match of a regular expression as a MatchCollection.

It would be nice if you could paste some demo code for this.
This is why I added the link to MSDN...
+12  A: 

Use a lookahead pattern:

A(?=A)

This finds any A that is followed by another A without consuming the following A. Hence AAA will match this pattern twice.
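
As a rough illustration of the difference, the sketch below compares a plain 'AA' pattern with a lookahead pattern on the string from the question. It assumes the pattern described above is A(?=A); the class and method names (LookaheadDemo, MatchIndexes) are made up for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of any library.
public static class LookaheadDemo
{
    // Returns the start index of every match of pattern in input.
    public static int[] MatchIndexes(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()          // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Index)
                    .ToArray();
    }
}
```

Calling LookaheadDemo.MatchIndexes("AA", "BAAABBB") should give only index 1, because the match consumes both characters, while LookaheadDemo.MatchIndexes("A(?=A)", "BAAABBB") should give 1 and 2, because only the first A of each pair is consumed.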

+3  A: 

To summarize all previous comments:

Dim rx As Regex = New Regex("(?=AA)")
Dim mc As MatchCollection = rx.Matches("BAAABBB")

This will produce the result you are requesting.

Here is the C# version (working with VB.NET today so I accidentally continued with VB.NET).

Regex rx = new Regex("(?=AA)");
MatchCollection mc = rx.Matches("BAAABBB");
Sani Huttunen

Try this:

    System.Text.RegularExpressions.MatchCollection matchCol;
    System.Text.RegularExpressions.Regex regX = new System.Text.RegularExpressions.Regex("(?=AA)");

    string index = "", str = "BAAABBB";
    matchCol = regX.Matches(str);
    foreach (System.Text.RegularExpressions.Match mat in matchCol)
        index = index + mat.Index + ",";

After the loop, index contains "1,2,". Remove the trailing comma and you have the result you are looking for.
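
One possible variation, sketched here under the assumption that the search text is a literal rather than a regex: build the lookahead with Regex.Escape and let string.Join place the commas, so there is no trailing comma to strip. The names OverlapIndexes and FindOverlapping are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of any library.
public static class OverlapIndexes
{
    // Returns the comma-separated start indexes of every occurrence of
    // literal in input, including occurrences that overlap.
    public static string FindOverlapping(string input, string literal)
    {
        // Regex.Escape keeps characters such as '.' from being read as regex syntax.
        var regex = new Regex("(?=" + Regex.Escape(literal) + ")");
        return string.Join(",", regex.Matches(input).Cast<Match>().Select(m => m.Index));
    }
}
```

With the values from the question, OverlapIndexes.FindOverlapping("BAAABBB", "AA") should return "1,2".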


The pattern '(?=A)' gives good results but enormously extends the calculation time. I have a string with 20M characters, and calculation speed is very important. Does anyone have another solution? Thanks.


Are you really looking for substrings that are only two characters long? If so, searching a 20-million character string is going to be slow no matter what regex you use (or any non-regex technique, for that matter). If the search string is longer, the regex engine can employ a search algorithm like Boyer-Moore or Knuth-Morris-Pratt to speed up the search--the longer the better, in fact.
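
For comparison, here is a minimal sketch of one non-regex technique: calling String.IndexOf repeatedly and restarting one character after each hit so that overlapping matches are not skipped. The names OverlapScan and IndexesOf are hypothetical, and whether this beats the regex on a 20-million-character string is something you would need to measure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, not part of any library.
public static class OverlapScan
{
    // Returns the start index of every occurrence of pattern in text,
    // including occurrences that overlap.
    public static List<int> IndexesOf(string text, string pattern)
    {
        var hits = new List<int>();
        if (string.IsNullOrEmpty(pattern))
            return hits; // an empty pattern would match everywhere; treat it as no match

        // Ordinal comparison avoids culture-sensitive matching rules.
        int pos = text.IndexOf(pattern, 0, StringComparison.Ordinal);
        while (pos >= 0)
        {
            hits.Add(pos);
            // Resume one character after the hit, not after the whole match.
            pos = text.IndexOf(pattern, pos + 1, StringComparison.Ordinal);
        }
        return hits;
    }
}
```

For example, OverlapScan.IndexesOf("BAAABBB", "AA") should return the indexes 1 and 2.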

By the way, the kind of search you're talking about is called overlapping matches; I'll add that to the tags.

Alan Moore