So I'm trying to get a regex to match and I'm fairly new at this. I used an online validator and the pattern works when I paste it there, but not when it's placed in the code-behind of a .NET 2.0 C# page.

The offending code is supposed to split on a single semicolon but not on a double semicolon. However, when I use the string

"entry;entry2;entry3;entry4;"

I get a nonsense array that contains empty values, the last letter of the previous entry, and the semicolons themselves. The online JavaScript validator splits it correctly. Please help!

My regex:

((;;|[^;])+)
+5  A: 

Split on the following regular expression:

(?<!;);(?!;)

It matches semicolons that are neither preceded nor followed by another semicolon.

For example, this code

var input = "entry;entry2;entry3;entry4;";
foreach (var s in Regex.Split(input, @"(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", s);

produces the following output:

[entry]
[entry2]
[entry3]
[entry4]
[]

The final empty field is a result of the semicolon on the end of the input.
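The sample input contains no double semicolons, so here is a minimal sketch of how the same pattern treats them. The class and method names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundSplitDemo
{
    // Hypothetical helper name, not part of the original answer.
    public static string[] SplitOnSingleSemicolons(string input)
    {
        // (?<!;)  the ';' must not be preceded by another ';'
        // (?!;)   the ';' must not be followed by another ';'
        return Regex.Split(input, @"(?<!;);(?!;)");
    }

    static void Main()
    {
        foreach (string s in SplitOnSingleSemicolons("a;b;;c;d"))
            Console.WriteLine("[{0}]", s);
        // prints [a], [b;;c] and [d], one per line
    }
}
```

The doubled semicolon stays inside its field because neither of its two characters satisfies both lookarounds.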

If the semicolon is a terminator at the end of each field rather than a separator between consecutive fields, then use Regex.Matches instead:

foreach (Match m in Regex.Matches(input, @"(.+?)(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", m.Groups[1].Value);

to get

[entry]
[entry2]
[entry3]
[entry4]
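As for why the pattern in the question produced such an odd array: it describes the fields rather than the separators, and Regex.Split also adds the text of any capturing groups to its result. Assuming the pattern was passed to Regex.Split (the question does not show the call), the following sketch uses the same pattern with Regex.Matches instead, with the outer parentheses dropped since the whole match is read. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class FieldMatchDemo
{
    // Hypothetical helper: collects each run of ";;" pairs and non-semicolon characters.
    public static List<string> Fields(string input)
    {
        var fields = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(;;|[^;])+"))
            fields.Add(m.Value); // whole match, not the inner capture group
        return fields;
    }

    static void Main()
    {
        foreach (string s in Fields("entry;entry2;entry3;entry4;"))
            Console.WriteLine("[{0}]", s);
        // prints [entry], [entry2], [entry3] and [entry4], one per line
    }
}
```

Because a match must contain at least one character, the trailing semicolon does not produce an empty entry here.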
Greg Bacon
Thanks! Too bad I was so far off with my original. This one leaves a trailing empty entry; any thoughts on how to get rid of it?
C Bauer
@C Bauer See updated answer.
Greg Bacon
+1  A: 

Why not use String.Split on the semicolon?

string sInput = "Entry1;entry2;entry3;entry4";
string[] sEntries = sInput.Split(';');
// Do what you have to do with the entries in the array...

Hope this helps, Best regards, Tom.

tommieb75
The "but not on double semicolons" requirement makes this *kind of* ugly.
Austin Salonen
The problem is that he doesn't want to split on a double semicolon (;;), hence String.Split() is inadequate for him.
DrJokepu
Sorry Tom, this would not work because it would split on ALL semicolons and I need it to skip over double semicolons, as stated in the original question.
C Bauer
@DrJokepu: If you look at his sample input, there is no double semicolon... and anyway, if there were, there would be an empty element at that position in the array.
tommieb75
You can tell Split not to return empty values; use the StringSplitOptions enumeration: http://msdn.microsoft.com/en-us/library/system.stringsplitoptions.aspx
thijs
+1  A: 

As tommieb75 wrote, you can use String.Split, here together with the StringSplitOptions enumeration to control which elements end up in the resulting array:

string input = "entry1;;entry2;;;entry3;entry4;;";
char[] charSeparators = new char[] {';'};
// Split a string delimited by characters and return all non-empty elements.
string[] result = input.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);

The result would contain only 4 elements like this:

<entry1><entry2><entry3><entry4>
nemke
Please read the original question to see why this does not work. I knew only regex would work before I asked the question.
C Bauer
So would you like a;b;c;;d split into [a][b][c;;d] or [a][b][c][d]? If it's the second, you can still use Split, but if it's the first I will delete my answer.
nemke