MSVS: Where's the regex "?" quantifier?

I have code that I'm trying to match with a regular expression in MSVS 2008, but I can't figure out the regex for it. Take the classic example:

colou?r

...which is a regular expression that matches color or colour. This matches neither in MSVS. Referring to the help file, I cannot find ?.

This wouldn't be a big deal - it can be emulated with alternation:

colo(u|)r

However, I get "Grouped expression is missing ')'."... which it's... not. Oddly, MSVS has these alternate groups (I'm not really sure why...) with curly braces:

colo{u|}r

Which gives me the altogether different error of "Syntax error in pattern."... which, I don't see one. Basically, how do I do a ?? My actual input is not as simple as colour/color, otherwise I'd just fake it with (color|colour). I suppose I could fake it, but it's an obtuse way to go about it.
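For reference, here is what the two forms do in a conventional engine. This is a minimal sketch using Python's re module, chosen only as a stand-in for "standard" regex behaviour; it is not Visual Studio's Find engine, and the word list is made up for illustration.

```python
import re

# Standard (Python) syntax, NOT Visual Studio 2008 Find syntax.
optional = re.compile(r"colou?r")          # "u" is optional
alternation = re.compile(r"colo(?:u|)r")   # "u" or the empty string

# Hypothetical test words.
words = ["color", "colour", "colouur", "colr"]

optional_hits = [w for w in words if optional.fullmatch(w)]
alternation_hits = [w for w in words if alternation.fullmatch(w)]

print(optional_hits)
print(alternation_hits)
```

Both patterns accept the same words here, which is the equivalence I'm trying to reproduce in MSVS.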


Let's try alternation then...

Ok, I still can't do it, even with alternation. I have the following two regexes:

^[A-Z]+\t[0-9]+\t[^\t]+

^[A-Z]+\t[0-9]+\t[^\t]+\t[^\t]+

Individually, each of those matches a set of lines in my text. (The first one also matches part of every line that the second one matches.)

My input is lines of currency information:

BZD 084 Belize dollar
CAD 124 Canadian dollar
CDF 976 Franc Congolais
CHE 947 WIR euro (complementary currency)
CHF 756 Swiss franc
CHW 948 WIR franc (complementary currency)
CLF 990 Unidad de Fomento (funds code)

(There are tabs, for example, between WIR euro and (complementary currency), but they're not always there.)

Logically, it should follow that to combine

^[A-Z]+\t[0-9]+\t[^\t]+

^[A-Z]+\t[0-9]+\t[^\t]+\t[^\t]+

..you get... ^[A-Z]+\t[0-9]+\t([^\t]+|[^\t]+\t[^\t]+) ...which somehow appears to be equivalent to the second expression in the first set.
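One thing that can make a combined pattern behave like only one of its branches is the order of the alternatives, if the engine tries them left to right and stops at the first that succeeds. Whether Visual Studio does this is an assumption here. The sketch below shows the effect in Python's re module (not Visual Studio syntax), using a sample line whose tab positions are assumed.

```python
import re

# Hypothetical sample line; the tab positions are an assumption.
line = "CHE\t947\tWIR euro\t(complementary currency)"

# Short alternative first: the engine is satisfied before the optional column.
short_first = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+|[^\t]+\t[^\t]+)")
# Long alternative first: the optional column is tried before falling back.
long_first = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+\t[^\t]+|[^\t]+)")
# Original order, but anchored at end of line to force the full match.
anchored = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+|[^\t]+\t[^\t]+)$")

short_match = short_first.match(line).group(0)
long_match = long_first.match(line).group(0)
anchored_match = anchored.match(line).group(0)

print(repr(short_match))
print(repr(long_match))
print(repr(anchored_match))
```

In Python, only the first pattern stops short of the last column; reordering the alternatives or adding the end-of-line anchor both yield the whole line.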

+3  A: 

Our very own Jeff Atwood wrote about this a while back. Basically, Visual Studio's regex implementation is pretty nonstandard and there's no straightforward way to do what's usually done with '?'. You'll have to use your {colour|color} expression.

Welbog
Thanks... tried it with alternation, see edits.
Thanatos
A: 

Maybe this msdn article can help you

Eric
That article appears to be the same as the help file, which as I mentioned, is no help at all.
Thanatos
A: 

The regular expressions in Visual Studio's Find don't support ?. See the reference at MSDN. Your best bet is probably the alternation character.

tvanfosson
That article appears to be the same as the help file, which as I mentioned, is no help at all.
Thanatos
A: 

This works:

colo(u)|()r

for your real example, this will match each line:

^[A-Z]+:b[0-9]+:b[^\t]+(\t[^\t]+)|()
Greg
I think you meant "colo(u|())r". Your original query reads "match 'colou' or 'r'".
Aaron
Both of our methods work exactly the same in Visual Studio. Try it yourself.
Greg
A: 

Did you try with the longest common path at the right?

For example, for colour and color it would be:

colo(ur|r)

and for your 2 regular expressions it would be:

^[A-Z]+\t[0-9]+\t{[^\t]+|[^\t]+\t[^\t]+}
eKek0
A: 

While not documented, I've found that the string "()" will match the empty string. For example:

colo(u|())r

Likewise, try out

^[A-Z]+\t[0-9]+\t{[^\t]+(\t+[^\t]+|())}

With that last case, make sure to put the empty-string alternate last to avoid getting a partial match, or make sure to tack on an end-of-line token '$' as well.
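The ordering point can be illustrated outside Visual Studio. The following is a rough sketch in Python's re module, used only as a stand-in: the patterns are translated to Python syntax, the sample line and its tabs are assumed, and Visual Studio's engine may not behave identically.

```python
import re

# Hypothetical sample line; the tab positions are an assumption.
line = "CHE\t947\tWIR euro\t(complementary currency)"

# Empty alternative last: the optional column is tried first.
empty_last = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+(?:\t+[^\t]+|()))")
# Empty alternative first: it succeeds immediately, giving a partial match.
empty_first = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+(?:()|\t+[^\t]+))")
# Empty alternative first, but "$" forces backtracking into the other branch.
empty_first_eol = re.compile(r"^[A-Z]+\t[0-9]+\t([^\t]+(?:()|\t+[^\t]+))$")

last_capture = empty_last.match(line).group(1)
first_capture = empty_first.match(line).group(1)
eol_capture = empty_first_eol.match(line).group(1)

print(repr(last_capture))
print(repr(first_capture))
print(repr(eol_capture))
```

In this sketch, the empty-first pattern captures only the name column, while the other two capture both the name and the trailing parenthetical.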

Aaron