ansaurus

Question

Regexp match any uppercase characters except a particular string

Answer 1

+1 A:

Try:

(?<!A_)[a-zA-Z]+

(?<!...) is called a negative lookbehind.
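A quick way to see what the lookbehind in the pattern above does is to run it over a couple of inputs. This is only a sketch: it assumes Python's re module (the thread does not name a regex flavor), and the helper name all_matches is made up for illustration.

```python
import re

# (?<!A_) is a negative lookbehind: a match may not begin right after "A_".
PATTERN = re.compile(r"(?<!A_)[a-zA-Z]+")

def all_matches(text):
    """Return every non-overlapping match of PATTERN in text."""
    return PATTERN.findall(text)

print(all_matches("fooBar"))    # ['fooBar']
print(all_matches("A_fooBar"))  # ['A', 'ooBar']
```

The lookbehind only blocks a match that starts directly after A_. The engine still finds the leading A and then picks up again one character past the blocked position, so on its own this pattern does not reject A_ names.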

As for your specific problem, it's kind of cheating but try:

^([#\.]|(?<!A_))[A-Za-z]{2,}

I get:

fooBar => fooBar
foo Bar foo => foo
A_fooBar (no match)
fooBar /* Comment */ => fooBar
A_foobar (no match)
foo A_bar => foo
foobar => foobar
foo bar foo bar => foo
foobar /* Comment */ => foobar

cletus 2009-11-03 13:10:23
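For anyone who wants to reproduce the results listed in this answer, here is a minimal sketch. It assumes Python's re module, and the function name leading_name is hypothetical.

```python
import re

# Anchored pattern from the answer: either a leading '#' or '.', or a
# position not preceded by "A_", followed by at least two letters.
PATTERN = re.compile(r"^([#\.]|(?<!A_))[A-Za-z]{2,}")

def leading_name(line):
    """Return the matched text at the start of line, or None if no match."""
    m = PATTERN.match(line)
    return m.group(0) if m else None

for line in ["fooBar", "foo Bar foo", "A_fooBar", "foo A_bar",
             "fooBar /* Comment */"]:
    print(repr(line), "=>", leading_name(line))
```

At the start of the line there is nothing behind the cursor, so the lookbehind always succeeds there. The A_ lines are rejected because {2,} needs two consecutive letters and the underscore interrupts them, which is presumably the "cheating" the answer mentions.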

Thanks, but I don't want to match [a-zA-Z]. This is what I have so far: ^([A-Z]|[#.])[^{]*?(?<=[A-Z]). Now I need to exclude any matches that have A_ as their only uppercase characters.

Alan 2009-11-03 13:19:30

It matches CSS, in case you are wondering.

Alan 2009-11-03 13:20:17

That expression doesn't make a lot of sense. I'm running a little test and it only matches the ones with A at the front.

cletus 2009-11-03 13:24:27

The expression above only matches the uppercase characters. It checks the start of the line and allows it to start with # or . (as it's CSS) via [A-Z]|[#.]. Then it consumes anything other than a {, stops, and looks behind for an uppercase character. This works fine for me. The only complication I have now is preventing it from matching if it sees A_ when it looks back.

Alan 2009-11-03 13:34:45
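The behaviour described in the comment above can be checked with a small sketch. It assumes Python's re module; the sample selectors and the name shortest_upper_prefix are invented for illustration and do not come from the thread.

```python
import re

# Start with an uppercase letter, '#' or '.', lazily consume anything that
# is not '{', and stop as soon as the previous character is uppercase.
PATTERN = re.compile(r"^([A-Z]|[#.])[^{]*?(?<=[A-Z])")

def shortest_upper_prefix(line):
    """Return the text matched from the start of line, or None."""
    m = PATTERN.match(line)
    return m.group(0) if m else None

print(shortest_upper_prefix("#fooBar {"))  # '#fooB'
print(shortest_upper_prefix("#foobar {"))  # None
print(shortest_upper_prefix(".A_foo {"))   # '.A'
```

The last line shows the remaining problem: the lookbehind accepts the A of A_ like any other uppercase letter.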

Answer 2

A:

This one does it, although the comment handling isn't extremely robust. (It assumes that a comment is always at the end of the line.)

.*((A(?!_)|([B-Z]))(?<!/\*.*)).*\r\n

Mike Hanson 2009-11-03 13:25:02

This looks pretty promising, Mike, thanks. I think it's falling down when there are multiple _ characters; still looking into it.

Alan 2009-11-03 13:46:56

Answer 3

+1 A:

Does it have to be a single regex? In Perl, you could do something like:

if ($string =~ /[A-Z]/ && $string !~ /A_/)

It's not as cool as a single expression with lookbehind, but it's probably easier to read and maintain.

SDGator 2009-11-03 13:30:55
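The same two-check idea can be sketched outside Perl. The version below assumes Python, and the function name has_upper_without_a_prefix is hypothetical.

```python
import re

def has_upper_without_a_prefix(text):
    """True if text contains an uppercase letter and no "A_" anywhere."""
    return re.search(r"[A-Z]", text) is not None and "A_" not in text

print(has_upper_without_a_prefix("fooBar"))    # True
print(has_upper_without_a_prefix("A_foobar"))  # False
print(has_upper_without_a_prefix("A_fooBar"))  # False
```

Note that this rejects any string containing A_, including A_fooBar, which Answer 4 treats as a match because of its other uppercase letter.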

Thanks, SDGator. I don't think I have the ability to do that.

Alan 2009-11-03 13:48:04

Answer 4

+1 A:

My answer:

/([B-Z]|A[^_]|A$)/

I would remove the comment at an earlier stage, if at all possible.

Test:

#!perl
use warnings;
use strict;

my @matches = (
"fooBar",
"foo Bar foo",
"A_fooBar",
"fooBar /* Comment */");

my @nomatches = (
"A_foobar",
"foo A_bar",
"foobar",
"foo bar foo bar",
"foobar /* Comment */");

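# Any uppercase letter other than A, or an A that is not followed by '_'.
my $regex = qr/([B-Z]|A[^_]|A$)/;

# Strip a trailing comment first, then check each string against the regex.
for my $m (@matches) {
    $m =~ s:/\*.*$::;
    die "FAIL $m" unless $m =~ $regex;
}
for my $m (@nomatches) {
    $m =~ s:/\*.*$::;
    die "FAIL $m" unless $m !~ $regex;
}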

Try it: http://codepad.org/EJhWtqkP

Kinopiko 2009-11-03 13:49:18

Thanks, Kinopiko, I love the simplicity of your solution. I am writing expressions for use in static code analysis, so I won't actually be removing anything. This is why I don't want to match inside a comment.

Alan 2009-11-03 14:07:53

Just copy the string and do a match on the copied one.

Kinopiko 2009-11-03 14:12:51
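The "match on a copy" suggestion might look like the following. This is a sketch assuming Python, where re.sub returns a new string and leaves the original line untouched; the names has_real_uppercase, UPPER_NOT_A_PREFIX and TRAILING_COMMENT are made up here.

```python
import re

# Same expression as in the answer: any of B-Z, or an A not followed by '_'.
UPPER_NOT_A_PREFIX = re.compile(r"[B-Z]|A[^_]|A$")
# Same comment stripping as the Perl test: drop everything from "/*" onward.
TRAILING_COMMENT = re.compile(r"/\*.*$")

def has_real_uppercase(line):
    """Check a comment-free copy of line; line itself is not modified."""
    copy = TRAILING_COMMENT.sub("", line)
    return UPPER_NOT_A_PREFIX.search(copy) is not None

print(has_real_uppercase("fooBar /* Comment */"))  # True
print(has_real_uppercase("foobar /* Comment */"))  # False
```

Like the Perl test, this assumes a comment always runs to the end of the line.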

Answer 5

+1 A:

This should (also?) do it:

(?!A_)[A-Z](?!((?!/\*).)*\*/)

A short explanation:

(?!A_)[A-Z]     # if no 'A_' can be seen, match any uppercase letter
(?!             # start negative look ahead
  ((?!/\*).)    #   if no '/*' can be seen, match any character (except line breaks)
  *             #   match zero or more of the previous match
  \*/           #   match '*/'
)               # end negative look ahead

So, in plain English:

Match any uppercase letter, except an 'A' that starts 'A_', and do not match an uppercase letter if '*/' can be seen ahead without first encountering '/*'.

Bart Kiers 2009-11-03 14:11:22
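To check this expression against the sample lines from Answer 4, a sketch along these lines may help. It assumes Python's re module, and uppercase_hits is a hypothetical helper name.

```python
import re

# Expression from the answer: an uppercase letter that does not start "A_"
# and is not followed by "*/" without a "/*" in between.
PATTERN = re.compile(r"(?!A_)[A-Z](?!((?!/\*).)*\*/)")

def uppercase_hits(line):
    """Return every uppercase letter in line that the pattern accepts."""
    return [m.group(0) for m in PATTERN.finditer(line)]

print(uppercase_hits("fooBar /* Comment */"))  # ['B']
print(uppercase_hits("foobar /* Comment */"))  # []
print(uppercase_hits("A_foobar"))              # []
```

finditer is used rather than findall because the pattern contains a capturing group, and findall would return that group's contents instead of the whole match.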

Answer 6

A:

Try this:

^(?:[^A-Z/]|A_|/(?!\*))*+[A-Z]

This will work in any flavor that supports possessive quantifiers, e.g. PowerGREP, Java and PHP. The .NET flavor doesn't, but it does support atomic groups:

^(?>(?:[^A-Z/]|A_|/(?!\*))*)[A-Z]

If neither of those features is available, you can use another lookahead to prevent it matching the A_ on the rebound:

^(?:[^A-Z/]|A_|/(?!\*))*(?!A_)[A-Z]

Alan Moore 2009-11-03 15:28:16
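The third variant needs neither possessive quantifiers nor atomic groups, so it is the easiest one to try in other flavors. The sketch below assumes Python's re module; prefix_through_first_upper is a made-up name.

```python
import re

# Skip over non-uppercase characters, whole "A_" pairs, and slashes that do
# not open a comment; then require an uppercase letter that is not "A_".
PATTERN = re.compile(r"^(?:[^A-Z/]|A_|/(?!\*))*(?!A_)[A-Z]")

def prefix_through_first_upper(line):
    """Return the text up to and including the first accepted uppercase
    letter, or None if the line has none before a comment starts."""
    m = PATTERN.match(line)
    return m.group(0) if m else None

print(prefix_through_first_upper("A_fooBar"))              # 'A_fooB'
print(prefix_through_first_upper("A_foobar"))              # None
print(prefix_through_first_upper("foobar /* Comment */"))  # None
```

The extra (?!A_) matters when the loop backtracks: without it, giving back a consumed A_ would let [A-Z] match that A.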
