For the given text:

This text A,is separated,by a comma A,unpreceded by the uppercase A,letter A,ok?,

Expected matches:

This text A,is separated,

and

by a comma A,unpreceded by the uppercase A,letter A,ok?,

Please suggest a regex that works as described, preferably one that works in .NET.

+4  A: 

I think the regex feature you're after is negative lookbehind. Try splitting on this regex:

(?<!A),

For example:

new Regex(@"(?<!A),").Split("This text A,is separated,by a comma A,unpreceded by the uppercase A,letter A,ok?,");
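
To see what the split actually returns, here is a minimal sketch (the class and method names are made up for illustration). Two things to be aware of: Split removes the delimiter itself, and because the input ends with a matching comma, the result has an empty string as its last element.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitDemo
{
    // Hypothetical helper: split on commas not preceded by 'A'
    // and drop the empty piece produced by the trailing delimiter.
    public static string[] SplitOnUnprecededComma(string input) =>
        Regex.Split(input, @"(?<!A),")
             .Where(piece => piece.Length > 0)
             .ToArray();

    static void Main()
    {
        string input = "This text A,is separated,by a comma A,unpreceded by the uppercase A,letter A,ok?,";
        foreach (string piece in SplitOnUnprecededComma(input))
        {
            // Split strips the delimiter; re-append it to get the form shown in the question.
            Console.WriteLine(piece + ",");
        }
    }
}
```

Re-appending the comma is only right when every piece was followed by a delimiter, as in the sample text. If the input may end without a trailing comma, the last piece needs separate handling.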
Ben Lings
+1  A: 

The regular expression you want is:

MatchCollection matches = Regex.Matches(myInput, @"(.*?)(?<!A),");

(Keep the lazy (.*?) here rather than ([^,]+): each chunk has to be able to contain the "A," commas, and [^,]+ can never match across a comma.)
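
The choice between the two forms matters for correctness, not only speed. The sketch below (helper and class names are hypothetical) runs both on the question's text: the lazy `.*?` form can span the "A," commas inside a chunk, while `[^,]+` stops at every comma and so only finds the tail of each chunk.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class MatchDemo
{
    // Hypothetical helper: return the text of every match of pattern in input.
    public static string[] AllMatches(string input, string pattern) =>
        Regex.Matches(input, pattern)
             .Cast<Match>()
             .Select(m => m.Value)
             .ToArray();

    static void Main()
    {
        string input = "This text A,is separated,by a comma A,unpreceded by the uppercase A,letter A,ok?,";

        // Lazy form: yields the two chunks requested in the question.
        foreach (string m in AllMatches(input, @".*?(?<!A),"))
            Console.WriteLine(m);

        Console.WriteLine("---");

        // Negated-class form: yields only "is separated," and "ok?,".
        foreach (string m in AllMatches(input, @"[^,]+(?<!A),"))
            Console.WriteLine(m);
    }
}
```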

Take a look at this cheat sheet for more information.

Platinum Azure
+1  A: 

I have no experience with .NET, but assuming its regular expression support is not vastly different from Java's, both of these expressions will do what you want, albeit in different ways.

(?<!A),

This matches a comma that is not preceded by an 'A'. You can use it to find the index of each such comma in the string and take the substring up to that point.

Alternatively, use this expression to get exactly the two matches you requested:

.*?[^A],

It lazily matches any characters up to and including the first comma whose preceding character is not an 'A'.
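
The two expressions agree on the sample text but are not strictly equivalent: `[^A]` has to consume a character, whereas the lookbehind only inspects one. A small sketch of the difference, written in C# since the question targets .NET and using a made-up input that starts with a comma (class and method names are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

static class EdgeCaseDemo
{
    // Hypothetical helper: count how many matches a pattern finds in the input.
    public static int CountMatches(string input, string pattern) =>
        Regex.Matches(input, pattern).Count;

    static void Main()
    {
        string input = ",b,"; // made-up input that begins with a delimiter

        // [^A] must consume a character, so the leading comma cannot end a chunk on its own.
        Console.WriteLine(CountMatches(input, @".*?[^A],"));   // 1 match: ",b,"

        // The lookbehind consumes nothing, so the leading comma is a match by itself.
        Console.WriteLine(CountMatches(input, @".*?(?<!A),")); // 2 matches: "," and "b,"
    }
}
```

If empty fields at the start of the input are possible, the lookbehind form is the safer of the two.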

KidDaedalus
A: 
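using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string s = "ThisA,textA,isA,separated,byA,theA,absenceA,ofA,theA,letterA,\"a\",";
        // A comma preceded by any character other than 'A' ends a chunk.
        string pattern = "[^A],";
        int startIndex = 0;
        foreach (Match line in Regex.Matches(s, pattern))
        {
            // The match starts one character before the comma, so the chunk
            // runs through line.Index and leaves out the comma itself.
            string chunk = s.Substring(startIndex, (line.Index + 1) - startIndex);
            Console.WriteLine(chunk);
            startIndex = line.Index + 2; // continue just past the comma
        }
        Console.Read(); // keep the console window open
    }
}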
Tim Hoolihan