This is part of a string: "21xy5". I want to insert " * " (an asterisk surrounded by whitespace) between a digit and a letter, a letter and a digit, and a letter and a letter. I use the regex pattern "\d[a-z]|[a-z]\d|[a-z][a-z]" to find the indexes where I am going to insert the string " * ". The problem is that when the regex OR (|) tries to match 21-x|x-y|y-5 in the string 21xy5, the first condition 21-x succeeds, the second x-y is not checked, and the third succeeds. So I get 21 * xy * 5 instead of 21 * x * y * 5. If the input string is xy21, then x-y succeeds and I get x * y21. I think the problem is that logical OR is not greedy.

    Regex reg = new Regex(@"\d[a-z]|[a-z]\d|[a-z][a-z]" );
    MatchCollection matchC;
    matchC = reg.Matches(input);
    int ii = 1;
    foreach (Match element in matchC)
    {
        input = input.Insert(element.Index + ii, " * ");
        ii += 3;
    }
    return input;
+1  A: 

You want lookarounds.

    Regex reg = new Regex(@"(\d(?=[a-z])|[a-z](?=[a-z\d]))");

(Then replace every match of reg with "$1 * ", i.e. the captured character followed by space, asterisk, space.)
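As a minimal sketch of that replacement step, assuming the input only contains lowercase ASCII letters and digits as in the question (the class and method names StarInserter and InsertStars are made up for illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class StarInserter
{
    // Each alternative consumes exactly one character; the lookahead
    // only peeks at the following character without consuming it.
    private static readonly Regex Boundary =
        new Regex(@"(\d(?=[a-z])|[a-z](?=[a-z\d]))");

    // Hypothetical helper: appends " * " after every captured character.
    public static string InsertStars(string input)
    {
        return Boundary.Replace(input, "$1 * ");
    }

    public static void Main()
    {
        Console.WriteLine(InsertStars("21xy5")); // 21 * x * y * 5
        Console.WriteLine(InsertStars("xy21"));  // x * y * 21
    }
}
```

With Regex.Replace there is no need to track a running offset (the ii variable in the question), because the engine builds the result string itself.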

The problem with your original regex is not greediness: every alternative actually consumes 2 characters. That means that once 1x has been matched, only y5 is left available, so the regex engine never sees the xy pair. A look-ahead, OTOH, is just a zero-width assertion, so the next character is not consumed. E.g. while 1x as a whole satisfies \d(?=[a-z]), only the 1 is consumed, so xy5 is still available for the next match.
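To see the difference in what gets consumed, here is a small sketch that lists each match as index:value for both patterns on the question's input (MatchDemo and Describe are hypothetical names used only for this illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchDemo
{
    // Returns every match of pattern in input as "index:value", space-separated.
    public static string Describe(string pattern, string input)
    {
        var parts = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            parts.Add(m.Index + ":" + m.Value);
        }
        return string.Join(" ", parts);
    }

    public static void Main()
    {
        // Original pattern: each match swallows two characters, so "xy" is never tried.
        Console.WriteLine(Describe(@"\d[a-z]|[a-z]\d|[a-z][a-z]", "21xy5"));   // 1:1x 3:y5

        // Lookahead pattern: one character per match, so every boundary is found.
        Console.WriteLine(Describe(@"\d(?=[a-z])|[a-z](?=[a-z\d])", "21xy5")); // 1:1 2:x 3:y
    }
}
```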

KennyTM
Thx, I understand now.
dontoo