Recently I discovered two amazing regular expression features: ?: and ?!. I was curious about other neat regex features, so maybe you would like to share some tricky regular expressions.
I think the entire regular-expressions.info site is a good, if not so "secret", trick. :) It has those, on the advanced page.
Well, it's up to you to decide what's rare. Get a program like RegexBuddy, which has drop-down lists from which you can build expressions by specifying different criteria, and see if there's anything in those lists that you haven't heard of before =)
Did you know, for instance, that you can have named capturing groups? A group such as
(?<Awesome>.*?)
is fetched by the name 'Awesome' rather than by a numeric index.
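To make the "fetch by name" part concrete, here is a minimal sketch in Python, which is used here only as a convenient test bed and is not something the answer above mentions. One assumption to flag loudly: Python's re module spells a named group (?P<Awesome>...), while the (?<Awesome>...) form shown above is the spelling used by .NET and Perl 5.10+. The sample sentence is made up.

```python
import re

# Assumption: Python syntax. Python writes a named group as (?P<name>...);
# the (?<name>...) form in the answer is the .NET / Perl 5.10+ spelling.
pattern = re.compile(r"My (?P<Awesome>.*?) likes")

match = pattern.search("My cat likes green birds")

by_name = match.group("Awesome")  # fetch the capture by its name
by_number = match.group(1)        # the same group is still reachable by number

print(by_name, by_number)
```

A named group is still numbered like any other capturing group, so naming it costs nothing if older code reads it by index.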
Other than that, I'll add that your second example is a negative lookahead. It asserts that the text immediately following must not be 'dog', so "my dog likes green birds" would not match. But perhaps that's what you meant; it was a bit unclear from reading your post =)
Not a secret regex trick, but a good recommendation is the book Regular Expressions Cookbook, published by O'Reilly: http://www.amazon.com/dp/0596520689
In vim, this line will remove all XML comments, single or multi line:
:%s/<!--\_.\{-}-->//g
The \_. is like a dot that matches newlines too. The \{-} is the non-greedy star, like *? in Perl-style regexes (sed has no non-greedy quantifier).
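For comparison, the same idea outside vim can be sketched in Python (an assumption on my part; the answer above only covers vim). Here re.DOTALL makes the dot cross newlines, playing the role of \_., and .*? is the non-greedy star, playing the role of \{-}. The sample XML is invented for the demonstration.

```python
import re

# Invented sample input with one single-line and one multi-line comment.
xml = """<root>
  <!-- single-line comment -->
  <a>1</a>
  <!-- multi
       line comment -->
  <b>2</b>
</root>"""

# re.DOTALL lets "." match newlines; ".*?" stops at the first "-->" it can.
comment = re.compile(r"<!--.*?-->", re.DOTALL)
stripped = comment.sub("", xml)

# For contrast: a greedy ".*" runs from the first "<!--" to the LAST "-->",
# deleting the <a> element that sits between the two comments.
greedy = re.sub(r"<!--.*-->", "", xml, flags=re.DOTALL)

print(stripped)
```

The greedy variant shows why the non-greedy quantifier matters: it removes real content that happens to sit between two comments.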
Your discoveries are non-capturing groups (?:...) and negative look-ahead assertions (?!...). There aren't any "secret" regex tricks, but there are many features that you may not know about. I recommend a thorough reading of perlre.
It helps to test your code before you post it. I ran, in Perl:
# (?!dog) consumes no text, so this pattern needs "My  likes" (two spaces) to match
if ( "My cat likes green birds" =~ m/My (?!dog) likes .+ / ) {
    print( "Match => \"$1\"\n" );
} else {
    print( "No match\n" );
}
and it output No match. On the other hand:
# (.+?) consumes "cat" once the look-ahead has ruled out "dog"
if ( "My cat likes green birds" =~ m/My (?!dog)(.+?) likes .+ / ) {
    print( "Match => \"$1\"\n" );
} else {
    print( "No match\n" );
}
outputs Match => "cat".
Try your code sometimes. You'll be amazed at how much a test run clears up your understanding of a topic.
I wouldn't call them secret.
If you're serious about learning regex, the (already mentioned) online resource http://www.regular-expressions.info should be in your bookmarks, and Friedl's Mastering Regular Expressions (Third Edition) should be on your bookshelf.
Not so secret: you can test your regexes online at
- regex.larsolavtorvik.com
- for Java 1.4+ myregexp.com
- for Ruby 1.8 rubular.com
I guess it is all a secret if you never look at the docs installed on your computer along with Perl:
Start with
$ perldoc perlre
There is no need for the rest of us to post bits and pieces of the docs here as answers. Besides, your explanations of both patterns are wrong:
(?:pattern)
(?imsx-imsx:pattern)
This is for clustering, not capturing; it groups subexpressions like "()", but doesn't make backreferences as "()" does.
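A small sketch of the clustering-versus-capturing difference, written in Python as an assumption (the quoted documentation is Perl's, but (?:...) behaves the same way in Python's re module). The date string is invented.

```python
import re

text = "2024-01-15"  # invented sample input

# Plain parentheses: every (...) becomes a numbered group.
capturing = re.match(r"(\d{4})-(\d{2})-(\d{2})", text)

# (?:...) still groups, so {2} repeats the whole "-NN" unit,
# but it creates no group and no backreference.
clustering = re.match(r"(\d{4})(?:-\d{2}){2}", text)

print(capturing.groups())
print(clustering.groups())
```

Both patterns match the whole string; only the number of captured groups differs (three versus one).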
(?!pattern)
A zero-width negative look-ahead assertion. For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar". Note however that look-ahead and look-behind are NOT the same thing.
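The look-ahead versus look-behind distinction is easy to check with a quick run. The sketch below uses Python rather than Perl (an assumption; the syntax for these assertions is the same in both), and the test strings are made up.

```python
import re

# foo(?!bar): match "foo" only where "bar" does not follow.
# The assertion itself consumes nothing.
starts = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# (?!foo)bar does NOT mean "bar not preceded by foo". The look-ahead inspects
# the text at "bar" itself, which is never "foo", so it matches inside "foobar".
lookahead_hits = re.findall(r"(?!foo)bar", "foobar")

# A negative look-behind, (?<!foo), is what checks the text before the position.
lookbehind_hits = re.findall(r"(?<!foo)bar", "foobar")

print(starts, lookahead_hits, lookbehind_hits)
```

Only the "foo" in "foobaz" is found by the first pattern, and only the look-behind version rejects the "bar" in "foobar".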