Ok, so this is something completely stupid, but it is something I simply never learned to do and it's a hassle.

How do I specify a string that does not contain a given sequence of characters? For example, I want to match all lines that do NOT end in '.config'.

I would think that I could just do

.*[^(\.config)]$

but this doesn't work (why not?)

I know I can do

.*[^\.][^c][^o][^n][^f][^i][^g]$

but please please please tell me that there is a better way

+3  A: 
(?<!\.config)$

:)

watain
+9  A: 

You can use negative lookbehind, e.g.:

.*(?<!\.config)$

This matches all strings except those that end with ".config"
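
As a quick check, here is a minimal Python sketch of that lookbehind pattern. The helper name `not_config` and the sample file names are made up for illustration; Python's `re` module accepts this pattern because the lookbehind has a fixed width.

```python
import re

# Pattern from the answer: the end of the line must not be preceded by ".config".
NOT_CONFIG = re.compile(r".*(?<!\.config)$")

def not_config(line: str) -> bool:
    """Return True when the line does not end in '.config' (hypothetical helper)."""
    return NOT_CONFIG.fullmatch(line) is not None

# Sample names are illustrative only.
names = ["app.config", "app.config.bak", "readme.txt", ".config", ""]
kept = [n for n in names if not_config(n)]
print(kept)  # ['app.config.bak', 'readme.txt', '']
```

Note that the empty string is kept: there is nothing before its end, so the negative lookbehind succeeds.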

Manu
This works but .*(?!=\.config)$ does not - I thought the two syntaxes were equivalent. Any clue?
George Mauer
They are NOT equivalent. (?<!...) checks the text before the current position (lookbehind), while (?!...) checks the text after it (lookahead).
Manu
No, they are not. Negative lookahead is `(?!matchthis)`, and your example can't work because you're looking ahead at a moment when you're already at the end of the string (`$`).
Tim Pietzcker
+1 thanks for the [link](http://www.regular-expressions.info/lookaround.html).
Lazer
Also, [why negating regex is "difficult"](http://www.perlmonks.org/?node_id=588315#588368).
Lazer
A: 

This will help you with building and testing regular expressions. :)

http://www.gskinner.com/RegExr/desktop/ (there's also a web version)

Alfabravo
+2  A: 

Unless you are "grepping" ... since you are not using the result of a match, why not search for the strings that do end in .config and skip them? In Python:

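import re
# Raw string so the backslash reaches the regex engine unchanged
isConfig = re.compile(r'\.config$')
# List lst is given
# search() scans the whole string; match() would only try at the start
filteredList = [f.strip() for f in lst if not isConfig.search(f.strip())]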

I suspect that this will run faster than a more complex re.
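
For a fixed suffix like this one, the same skip-the-matches idea can also be sketched without any regex, using the built-in `str.endswith`. The function name `drop_config` and the sample list below are assumptions for illustration, not part of the answer above.

```python
def drop_config(lines):
    """Return stripped lines that do not end in '.config' (plain string test, no regex)."""
    stripped = (line.strip() for line in lines)
    return [s for s in stripped if not s.endswith(".config")]

# Illustrative input; in the answer above this list is called lst.
lst = ["web.config\n", "main.py\n", "config.yaml\n"]
filtered = drop_config(lst)
print(filtered)  # ['main.py', 'config.yaml']
```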

Hamish Grubijan
Unless you are grepping, why use regex at all? Python has `in` and `str.endswith` for a reason. Other languages I'm sure have similar solutions.
Daniel Straight
Yeah this is what I do now, but it is best to know how to do it both ways. I've run into situations where this has forced some awkward syntax.
George Mauer
+1  A: 

By using the [^] construct, you have created a negated character class, which matches any single character except those you have named. The order of the characters inside the class does not matter, so [^(\.config)] means the same as [^)gi.fonc(], and your pattern fails on any string whose last character is one of ( ) . c o n f i g.
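
A small Python sketch can make this visible. The reordered class and the sample strings are chosen here for illustration; the point is only that both classes contain the same nine characters and therefore behave identically.

```python
import re

# Both classes hold the same nine characters: ( ) . c o n f i g
as_written = re.compile(r".*[^(\.config)]$")
reordered = re.compile(r".*[^)gi.onc(f]$")

# Illustrative inputs: only the last character decides the outcome.
samples = ["app.config", "test.hg", "notes.doc", "func()", "readme.txt"]
results = {
    s: (bool(as_written.fullmatch(s)), bool(reordered.fullmatch(s)))
    for s in samples
}
for name, pair in results.items():
    print(name, pair)
```

Only `readme.txt` matches, because `t` is the only final character here that is not in the class.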

Use negative lookahead (with Perl-style regexes), anchored at the start of the string, like so: ^(?!.*\.config$). This matches only strings that do not end in the literal ".config".
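
Where the lookahead sits matters. The following Python sketch contrasts a lookahead placed right before `$` (as tried in the comments earlier in the thread) with one anchored at the start that scans ahead to the end of the line. The variable names and sample strings are illustrative assumptions.

```python
import re

# Lookahead just before "$": nothing is left to look at, so it never rejects.
misplaced = re.compile(r".*(?!\.config)$")
# Lookahead at the start: scan ahead and reject if the line ends in ".config".
anchored = re.compile(r"^(?!.*\.config$).*$")

for name in ["app.config", "readme.txt"]:
    print(name, bool(misplaced.fullmatch(name)), bool(anchored.fullmatch(name)))
# app.config True False
# readme.txt True True
```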

Andrew
+3  A: 

Your question contains two questions, so here are a few answers.

Match lines that don't contain a certain string (say .config) at all:

^(?:(?!\.config).)*$\r?\n?

Match lines that don't end in a certain string:

^.*(?<!\.config)$\r?\n?

and, as a bonus: Match lines that don't start with a certain string:

^(?!\.config).*$\r?\n?

(each time including newline characters, if present).
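
To see how the three patterns differ, here is a Python sketch that applies each of them line by line. It assumes the text has already been split into lines, so the trailing `\r?\n?` part is left out; the variable names and sample lines are made up for illustration.

```python
import re

# The three patterns from above, minus the trailing newline part,
# because each line is tested on its own here.
no_config_anywhere = re.compile(r"^(?:(?!\.config).)*$")
not_ending_in_config = re.compile(r"^.*(?<!\.config)$")
not_starting_with_config = re.compile(r"^(?!\.config).*$")

# Illustrative input only.
lines = ["app.config", ".config_backup", "readme.txt", "settings.config.old"]

def keep(pattern, items):
    """Return the items that the pattern matches in full."""
    return [item for item in items if pattern.fullmatch(item)]

print(keep(no_config_anywhere, lines))        # ['readme.txt']
print(keep(not_ending_in_config, lines))      # ['.config_backup', 'readme.txt', 'settings.config.old']
print(keep(not_starting_with_config, lines))  # ['app.config', 'readme.txt', 'settings.config.old']
```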

Oh, and to answer why your version doesn't work: [^abc] means "any one (1) character except a, b, or c". Your other solution would also fail on test.hg (because it also ends in the letter g): your regex looks at each character individually instead of at the entire .config string. That's why you need lookaround to handle this.
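
The test.hg case can be reproduced with a short Python sketch that runs the per-character pattern from the question next to the lookbehind version. The sample names are illustrative; `a.txt` is included as an extra assumption to show that the per-character pattern also needs at least seven characters.

```python
import re

# The per-character pattern from the question versus the lookbehind version.
per_char = re.compile(r".*[^\.][^c][^o][^n][^f][^i][^g]$")
lookbehind = re.compile(r".*(?<!\.config)$")

# Illustrative inputs; only "app.config" should be rejected.
samples = ["test.hg", "a.txt", "readme.md", "app.config"]
table = {
    s: (bool(per_char.fullmatch(s)), bool(lookbehind.fullmatch(s)))
    for s in samples
}
for name, (old, new) in table.items():
    print(f"{name:12} per_char={old!s:5} lookbehind={new}")
```

The per-character pattern wrongly rejects `test.hg` (it ends in g) and `a.txt` (shorter than seven characters), while the lookbehind rejects only `app.config`.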

Tim Pietzcker
+2  A: 

As you have asked for a "better way": I would try a "filtering" approach. I think it is quite easy to read and to understand:

#!/usr/bin/perl

while(<>) {
    next if /\.config$/; # ignore the line if it ends with ".config"
    print;
}

As you can see, I have used Perl code as an example, but I think you get the idea.

Added: this approach can also be used to chain more filter patterns, and it still remains readable and easy to understand:

    next if /\.config$/; # ignore the line if it ends with ".config"
    next if /\.ini$/;    # ignore the line if it ends with ".ini"
    next if /\.reg$/;    # ignore the line if it ends with ".reg"

    # now we have filtered out all the lines we want to skip
    ... process only the lines we want to use ...
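
For readers who do not use Perl, the same chained filter might look roughly like this in Python. This is a sketch, not a translation of the author's script: the names `SKIP_PATTERNS` and `wanted_lines` and the sample input are hypothetical, and unlike the Perl loop it strips the trailing newline from each line.

```python
import re

# One compiled pattern per suffix to skip; extend the tuple to add more filters.
SKIP_PATTERNS = [re.compile(p) for p in (r"\.config$", r"\.ini$", r"\.reg$")]

def wanted_lines(lines):
    """Yield each line (trailing newline removed) unless a skip pattern matches it."""
    for line in lines:
        line = line.rstrip("\n")
        if any(p.search(line) for p in SKIP_PATTERNS):
            continue  # same role as Perl's `next if /.../`
        yield line

# Illustrative input; a real script could pass an open file or sys.stdin instead.
sample = ["web.config\n", "setup.ini\n", "keys.reg\n", "main.py\n", "notes.txt\n"]
kept = list(wanted_lines(sample))
print(kept)  # ['main.py', 'notes.txt']
```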