The 'a little bit faster' comment is accurate in that there is a little less bookkeeping to be done, but the emphasis is on 'little bit' rather than 'faster'. Normally, the material matched by \(pattern\) has to be kept so that you can use \3 (or whichever number is appropriate) to refer to it in the replacement. The \%(pattern\) notation means that vim does not have to keep track of the match, so it does a little less work.
@SimpleQuestions asks:
What do you mean by "keep track of the match"? How does it affect speed?
You can use escaped parentheses to 'capture' parts of the matched pattern. For example, suppose we're playing with simple C function declarations - no pointers to functions or other sources of parentheses - then we might have a substitute command such as the following:
s@\<\([a-zA-Z_][a-zA-Z_0-9]*\)(\([^)]*\))@xyz_\1(int nargs) /* \2 */@
Given an input line such as:
int simple_function(int a, char *b, double c)
The output will be:
int xyz_simple_function(int nargs) /* int a, char *b, double c */
(Why might you want to do that? I'm imagining that I need to wrap the C function simple_function so that it can be called from a language compiled to C that uses a different interface convention - it is based on Informix 4GL, to be precise. I'm using it only as an example - not because you really need to know why it was a good change to make.)
Now, in the example, the \1 and \2 in the replacement text refer to the captured parts of the regular expression - the function name (a sequence of alphanumerics starting with an alphabetic character - counting underscore as 'alphabetic') and the function argument list (everything between the parentheses, but not including the parentheses).
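The two captures can also be inspected outside vim. What follows is only a rough Python translation, not vim's own engine: in Python's re syntax bare parentheses capture and \( matches a literal parenthesis (the reverse of vim's default), \b stands in for \<, and count=1 mimics a substitute without the g flag. The names PATTERN, REPLACEMENT and wrap_declaration are made up for this sketch.

```python
import re

# Python spelling of the vim pattern: bare parentheses capture,
# \( and \) match literal parentheses, \b approximates vim's \<.
PATTERN = re.compile(r"\b([a-zA-Z_][a-zA-Z_0-9]*)\(([^)]*)\)")
REPLACEMENT = r"xyz_\1(int nargs) /* \2 */"


def wrap_declaration(line: str) -> str:
    """Rewrite the first function declaration found on the line."""
    return PATTERN.sub(REPLACEMENT, line, count=1)


if __name__ == "__main__":
    line = "int simple_function(int a, char *b, double c)"
    match = PATTERN.search(line)
    print(match.group(1))  # the captured function name
    print(match.group(2))  # the captured argument list
    print(wrap_declaration(line))
```

Both groups have to be remembered by the engine until the replacement text is built, which is the bookkeeping being discussed.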
If I'd used the \%(...\) notation around the function identifier, then \1 would refer to the argument list and there would be no \2. Because vim would not have to keep track of one of the two captured parts of the regular expression, it has marginally less bookkeeping to do than if it had to keep track of both. But, as I said, the difference is tiny; you could probably never measure it in practice. That's why the manual says 'it allows more groups'; if you needed to group parts of your regular expression but didn't need to refer to them again, then you could work with longer regular expressions. However, by the time you have more than 9 remembered (captured) parts to the regular expression, your brain is usually doing gyrations and your fingers will make mistakes anyway - so the effort is not usually worth it. But that is, I think, the argument for using the \%(...\) notation. It matches the Perl (PCRE) notation '(?:...)' for a non-capturing group.
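The renumbering is easy to check with the (?:...) form. This is a small sketch in Python's re module, which uses that notation; it illustrates the idea rather than vim's implementation, and the names CAPTURING, NON_CAPTURING, comment_out_call and second_group_exists are hypothetical.

```python
import re

# Both groups captured: the name is group 1, the argument list is group 2.
CAPTURING = re.compile(r"\b([a-zA-Z_][a-zA-Z_0-9]*)\(([^)]*)\)")
# Name wrapped in a non-capturing group: the argument list becomes group 1.
NON_CAPTURING = re.compile(r"\b(?:[a-zA-Z_][a-zA-Z_0-9]*)\(([^)]*)\)")


def comment_out_call(line: str) -> str:
    """Replace the first name(args) with a comment holding the arguments."""
    return NON_CAPTURING.sub(r"/* \1 */", line, count=1)


def second_group_exists() -> bool:
    """Report whether a reference to group 2 is still accepted."""
    try:
        NON_CAPTURING.sub(r"\2", "f(x)")
    except (re.error, IndexError):
        return False
    return True


if __name__ == "__main__":
    print(CAPTURING.groups, NON_CAPTURING.groups)
    print(comment_out_call("int simple_function(int a, char *b, double c)"))
    print(second_group_exists())
```

With the non-capturing version only one group is counted, so a reference to a second group is rejected, just as there would be no \2 in the vim substitute.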