Basically, I want to retrieve all possible substring matches of n characters from a string. Here's my initial code, but it only returns 2 matches.

string input = "abc12345abcd";
Regex regex = new Regex(@"[A-Za-z]{3}"); // this only returns 2 matches
MatchCollection matches = regex.Matches(input);

How should I get the following matches using regex?

abc
abc
bcd

Is this possible? If not, would LINQ help?

Thanks,

A: 

This sounds like a similar question to this.

GrayWizardx
The input string can contain any characters, not just hex characters.
jerjer
I understand, but the principle is the same. You are looking for the longest continuous substring of a given number of characters.
GrayWizardx
A bit, but I am looking for different substring combinations. Let's say the input is "passingx": I want to get all possible words (substrings), i.e. passing passin passi pass pas assingx assing assin assin assi ass ssingx ssing ssi singx sing sin ing ingx ing ngx
jerjer
+2  A: 
string input = "abc12345abcd";
Regex regex = new Regex(@"[A-Za-z]{3}");
int i = 0;
while (i < input.Length)
{
    // Search for the next match starting at position i
    Match m = regex.Match(input, i);
    if (!m.Success)
    {
        break;
    }
    Console.WriteLine(m.Value);
    i = m.Index + 1; // advance by one character, not by the length of the match
}

Results

abc
abc
bcd
S.Mark
This works, thanks!!!!!!!!!
jerjer
You're welcome!
S.Mark
A follow-up question: what if I have the regex [A-Za-z]{3,}? How can I get results like abc abcd bcd?
jerjer
Yes, `[A-Za-z]{3,}` will give you `abc abcd bcd`; it means that runs of 3 or more characters will match.
S.Mark
If I have the string "passingx", I want to get all possible words (substrings), i.e. passing passin passi pass pas assingx assing assin assin assi ass ssingx ssing ssi singx sing sin ing ingx ing ngx. Can this be done with a single regex expression? Thanks.
jerjer
[A-Za-z]{3,} will not return that result, because each search starts after the start of the previous match. If I instead increment i by 1 on every iteration, I get duplicate results when the loop encounters non-alphabetic characters.
jerjer
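
For the "passingx" case raised in the comments above, here is a minimal sketch of one way to do it, assuming the goal is every substring of at least a minimum length inside each run of letters. A Match call reports one match per search, so this sketch uses the regex only to find the letter runs and then enumerates the substrings with plain loops. The names SubstringDemo and AllLetterSubstrings are hypothetical and not from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SubstringDemo
{
    // Hypothetical helper: every substring of at least minLength letters.
    public static List<string> AllLetterSubstrings(string input, int minLength)
    {
        var results = new List<string>();
        // Step 1: the greedy regex finds each maximal run of letters.
        var runs = new Regex("[A-Za-z]{" + minLength + ",}");
        foreach (Match run in runs.Matches(input))
        {
            string s = run.Value;
            // Step 2: enumerate every start/length pair inside the run.
            for (int start = 0; start + minLength <= s.Length; start++)
            {
                for (int len = minLength; start + len <= s.Length; len++)
                {
                    results.Add(s.Substring(start, len));
                }
            }
        }
        return results;
    }

    public static void Main()
    {
        foreach (string s in AllLetterSubstrings("passingx", 3))
        {
            Console.WriteLine(s);
        }
    }
}
```

Because each letter run is processed once, non-alphabetic characters do not cause the duplicate results mentioned above. A substring that occurs at two different positions is still listed twice.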
+1  A: 

I believe that, while not clearly documented, Matches returns non-overlapping matches -- so the second match for abc means there's nothing returned for bcd, as it would be overlapping.

To get overlapping matches, you can write a loop that calls the Match (singular) method to get one match object at a time. Keep looping as long as the match object's Success property is true, and pass as the second argument to Match a value one more than the Index property of the previous match object (to get the next match whether overlapping or not).
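
As an alternative to the loop (a different technique, not the one described above), a zero-width lookahead with a capture group can make Matches itself report overlapping results. This is a minimal sketch; the names OverlapDemo and OverlappingMatches are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Hypothetical helper: every run of exactly n letters, overlaps included.
    public static List<string> OverlappingMatches(string input, int n)
    {
        // The lookahead (?=...) consumes no characters, so the engine moves
        // forward one position at a time; group 1 captures the text it saw.
        var regex = new Regex(@"(?=([A-Za-z]{" + n + "}))");
        var results = new List<string>();
        foreach (Match m in regex.Matches(input))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Prints: abc abc bcd
        Console.WriteLine(string.Join(" ", OverlappingMatches("abc12345abcd", 3)));
    }
}
```

Each match is itself empty, so the matched text has to be read from Groups[1] rather than from m.Value.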

Alex Martelli
Thanks, this works, as shown in S.Mark's code snippet.
jerjer