ansaurus

Question

Convert an Extended Regular expression to .NET compatible RegEx

Answer 1

A:

Are you sure you don't have a typo in that? RegexBuddy (when set to either POSIX ERE or GNU ERE) says that the "+" quantifier must be preceded by a token that can be repeated. Other than that, this appears to be a valid .NET Regex. You might want to check out one of the great O'Reilly books on regular expressions as well. If this doesn't help, please post some examples of text you're trying to match/not match.

TrueWill 2009-08-28 23:07:35

It's not a typo, the OP just didn't use code formatting, so the SO software ate some of the characters.

Alan Moore 2009-08-29 01:50:55

Answer 2

A:

Actually according to what I've read here:

Extended Regular Expressions

It would appear that C# is basically using ERE - just a slightly different syntax.

However, if that were true, then judging from your expression, it looks like you've made a group named "{7} .|={7}$|" that looks for anything that starts with a 7 followed by any character, and there is also an invalid + sign at the beginning of your statement. So I'm guessing the material I'm finding via Google searches is not the same ERE you are talking about :(

However! I have a site for you that should have just about everything you need to rewrite your expression as a .NET-compatible one:

Regular Expressions in .net

Hope that link helps!

DataDink 2009-08-28 23:43:24

Check the question again; I added code formatting, so the regex makes much more sense now.

Alan Moore 2009-08-29 01:53:57

This regex is straight from a Subversion 1.6 installation. It works fine on *NIX systems, including my Mac OS machine, but it does not work when used in a C# application. Its purpose is to identify whether a diff between two revisions contains any of the characters used in a merge conflict file (<, =, >) seven times in a row, with any two or more sets of them in a file. (In POSIX ERE, as best I can tell, the preceding '\+' causes the group to match only if more than one of the OR conditions in the group matches. Testing on my Mac seems to confirm this.)

2009-08-31 20:46:50

Answer 3

+1 A:

It looks to me like ERE syntax is mostly upward-compatible with .NET's regex flavor, as it is with most other "Perl-compatible" flavors (Perl, PHP, Python, JavaScript, Ruby, Java...). In other words, anything you can do in an ERE regex, you should be able to do in an identical .NET regex. Certainly your example:

^\+(<{7} \.|={7}$|>{7} \.)

means the same thing in .NET as it does in ERE. The only major exception I can see is in the area of POSIX bracket expressions; .NET follows the Unicode standard instead.

It's when you go to apply the regex that things really get different. In C# you might apply that regex like this:

string result = Regex.Match(targetString, @"^\+(<{7} \.|={7}$|>{7} \.)").Value;

C#'s verbatim strings save you having to escape backslashes like in some other languages' string literals; you only have to escape quotation marks, which you do by doubling them:

@"He said, ""Look out!""";
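Returning to the regex itself, one thing that can differ when the pattern is applied to a whole diff rather than to a single line: in .NET, ^ and $ match at line boundaries only when RegexOptions.Multiline is set, and $ matches before \n but not before \r\n. Here is a minimal sketch, assuming the diff is held in one string and may use Windows line endings. The class name ConflictMarkerCheck, the helper CountMarkers, the added \r?, and the sample diff are all illustrative and not taken from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConflictMarkerCheck
{
    // The pattern from the question, plus an optional \r before $.
    // The \r? is an assumption: it only matters if the text has CRLF line endings.
    private static readonly Regex Marker = new Regex(
        @"^\+(<{7} \.|={7}\r?$|>{7} \.)",
        RegexOptions.Multiline); // lets ^ and $ match at every line, not just the string ends

    // Hypothetical helper: counts added diff lines ("+...") that look like conflict markers.
    public static int CountMarkers(string diffText)
    {
        return Marker.Matches(diffText).Count;
    }

    public static void Main()
    {
        // Made-up sample: three marker lines and two ordinary added lines.
        string diff =
            "+<<<<<<< .mine\r\n" +
            "+local change\r\n" +
            "+=======\r\n" +
            "+incoming change\r\n" +
            "+>>>>>>> .r42\r\n";

        Console.WriteLine(CountMarkers(diff));
    }
}
```

With the sample input this should print 3, one for each marker line. Without RegexOptions.Multiline only a marker on the very first line could match, and without the \r? the "=======" line would be missed in CRLF text.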

Does that answer your question?

Alan Moore 2009-08-29 01:47:40

That didn't answer it, but I did learn something new - I hadn't realized C# allowed the double-quote method of escaping quotes. I was using the "@" to declare the string literal.

2009-08-31 20:41:17
