I'm using RegexBuddy, but I'm still having trouble with this :\

I'm processing a file line by line. I built a "line model" to match what I want.

Now I'd like to do an inverse match, i.e. I want to match lines that contain a string of six letters, but only if those six letters are not "Andrea". How should I do that?


EDIT: I'll write the program that uses this regex myself; I don't know yet whether in Python or PHP. I'm doing this first to learn some regex :) There are different types of lines, and I wanted to use a regex to select the type I'm interested in. Once I have those lines, I have to apply another filter so as not to match one known value: I need all the others, not that one. The (?!not-wanted) is working pretty well, thank you :-)

I hope this clarifies the question :)

+11  A: 
(?!Andrea).{6}

Assuming your regexp engine supports negative lookaheads..

Edit: ..or maybe you'd prefer to use [A-Za-z]{6} in place of .{6}

Edit (again): Note that lookaheads and lookbehinds are generally not the right way to "inverse" a regular expression match. Regexps aren't really set up for doing negative matching, they leave that to whatever language you are using them with.

Dan
You need to add the ^ that @Vinko Vrsalovic uses so that it won't match on "ndrea\n"
bdukes
. doesn't match \n by default (some languages [eg Perl] allow you to switch on that behaviour, but by default . matches everything BUT \n).
Dan
(plus, the OP never mentioned the string had to occur at the start of the line)
Dan
What do you mean by OP?
Andrea Ambu
Andrea: OP means "original poster", so, I was referring to you :)
Dan
Dan: OK, I haven't learned the SO slang yet :P Thank you :) The same thing is commented on in Vinko Vrsalovic's answer.
Andrea Ambu
+2  A: 

Negative lookahead assertion

(?!Andrea)

This is not exactly an inverted match, but it's the best you can directly do with regex. Not all platforms support them though.

Vinko Vrsalovic
Until the questioner clarifies, I don't see that the match has to start at the start of the line. So why the ^ ?
Hamish Downer
mish is perfectly right
Andrea Ambu
Because I understood he wanted to check at the beginning of the line. Edited given the clarifications.
Vinko Vrsalovic
+2  A: 

What language are you using? The capabilities and syntax of the regex implementation matter for this.

You could use a look-ahead. Using Python as an example:

import re

not_andrea = re.compile(r'(?!Andrea)\w{6}', re.IGNORECASE)

To break that down:

(?!Andrea) means 'match if the next 6 characters are not "Andrea"'; if so then

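\w means a "word character": a letter, a digit or the underscore. For ASCII text this is equivalent to the class [a-zA-Z0-9_]; in Python 3, \w also matches Unicode word characters unless re.ASCII is set.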

\w{6} means exactly 6 word characters.

re.IGNORECASE means that you will exclude "Andrea", "andrea", "ANDREA" ...
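A quick way to see how this pattern behaves is to try it on a few sample strings. The sketch below is only illustrative: the sample strings, the helper first_match and the \b-anchored variant whole_word are assumptions added here, not part of the original answer. Note that an unanchored search can still find a match beginning one character into "Andrea1", because the look-ahead is only tested at the position where the match starts.

```python
import re

# Pattern from the answer, written as a raw string.
not_andrea = re.compile(r'(?!Andrea)\w{6}', re.IGNORECASE)

# Hypothetical variant: \b forces the match to start and end on a word
# boundary, so the engine cannot slide one character ahead to dodge the
# look-ahead.
whole_word = re.compile(r'\b(?!Andrea\b)\w{6}\b', re.IGNORECASE)

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(first_match(not_andrea, "Robert"))        # Robert
print(first_match(not_andrea, "Andrea"))        # None
print(first_match(not_andrea, "Andrea1"))       # ndrea1
print(first_match(whole_word, "Andrea1"))       # None
print(first_match(whole_word, "hello Robert"))  # Robert
```

Whether the word-boundary variant is wanted depends on whether the six characters must form a whole word, which the question leaves open.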

Another way is to use your program logic - use all lines not matching Andrea and put them through a second regex to check for 6 characters. Or first check for at least 6 word characters, and then check that it does not match Andrea.
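The two-step approach could look roughly like this in Python. It is a minimal sketch: the names six_word_chars, unwanted and wanted_lines, as well as the sample lines, are made up for illustration.

```python
import re

# Step 1: the line must contain at least six word characters in a row.
six_word_chars = re.compile(r'\w{6}')
# Step 2: the known value that should be excluded.
unwanted = re.compile(r'Andrea', re.IGNORECASE)

def wanted_lines(lines):
    """Keep lines that pass step 1 and do not contain the unwanted value."""
    return [
        line for line in lines
        if six_word_chars.search(line) and not unwanted.search(line)
    ]

sample = ["Robert was here", "Andrea was here", "too few", "abcdef"]
print(wanted_lines(sample))  # ['Robert was here', 'abcdef']
```

Here the negation lives in ordinary program logic (the `not`), so neither regex needs a look-ahead.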

Hamish Downer
A: 

In perl you can do

process($line) if ($line =~ !/Andrea/);

phreakre
That syntax is wrong. I think you mean process($line) if $line !~ /Andrea/
dland
+1  A: 

If you want to do this in RegexBuddy, there are two ways to get a list of all lines not matching a regex.

On the toolbar on the Test panel, set the test scope to "Line by line". When you do that, an item List All Lines without Matches will appear under the List All button on the same toolbar. (If you don't see the List All button, click the Match button in the main toolbar.)

On the GREP panel, you can turn on the "line-based" and the "invert results" checkboxes to get a list of non-matching lines in the files you're grepping through.

Jan Goyvaerts
+3  A: 

Python/Java

^(.(?!(some text)))*$

http://www.lisnichenko.com/articles/javapython-inverse-regex.html
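As a rough check of this idea in Python, the sketch below compares the pattern as written with a common variant that places the look-ahead before the dot, ^((?!some text).)*$. The variant, the helper matches and the sample lines are assumptions added for illustration. The difference shows up when the unwanted text sits at the very start of the line: in the pattern as written, the look-ahead is first tested only after one character has already been consumed.

```python
import re

# Pattern as given above: consume a character, then look ahead.
as_written = re.compile(r'^(.(?!(some text)))*$')
# Variant (assumption): look ahead first, then consume a character.
lookahead_first = re.compile(r'^((?!some text).)*$')

def matches(pattern, line):
    """True when the whole line is accepted by the pattern."""
    return pattern.match(line) is not None

print(matches(as_written, "nothing to see"))               # True
print(matches(as_written, "here is some text"))            # False
print(matches(as_written, "some text at the start"))       # True
print(matches(lookahead_first, "some text at the start"))  # False
print(matches(lookahead_first, "nothing to see"))          # True
```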

Dmytro