Based on the lack of a capturing group around the year, I assume you care only whether a date matches.
I benchmarked a few variations on the pattern from your question. The change that paid off, with a ten- to fifteen-percent improvement, was disabling capturing, i.e.,
/\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?/
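To make the difference concrete, here is a minimal sketch (not part of the benchmark) that runs both forms against one sample string and reports how many capture groups each pattern carries. The variable names and the sample input are my own illustration; `$#+` is the standard way to ask how many groups the last successfully matched pattern had.

```perl
use strict;
use warnings;

# The pattern from the question (capturing) and the variant above (non-capturing).
my $capturing     = qr/\d{4}(-\d{2}(-\d{2}( \d{2}(:\d{2}(:\d{2})?)?)?)?)?/;
my $non_capturing = qr/\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?/;

my $input = "2010-08-27 02:11";    # hypothetical sample input

# $#+ is the number of capture groups in the last successfully matched pattern.
my $matched_cap   = $input =~ $capturing ? 1 : 0;
my $groups_cap    = $#+;
my $matched_nocap = $input =~ $non_capturing ? 1 : 0;
my $groups_nocap  = $#+;

print "capturing:     matched=$matched_cap groups=$groups_cap\n";
print "non-capturing: matched=$matched_nocap groups=$groups_nocap\n";
```

Both forms accept the same strings; the non-capturing one simply has no groups for the engine to record, which is where the saving should come from.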
The perlre documentation covers (?:...):
(?:pattern)
(?imsx-imsx:pattern)
This is for clustering, not capturing; it groups subexpressions like (), but doesn't make backreferences as () does. So
@fields = split(/\b(?:a|b|c)\b/)
is like
@fields = split(/\b(a|b|c)\b/)
but doesn't spit out extra fields. It's also cheaper not to capture characters if you don't need to.
Any letters between ? and : act as flags modifiers as with (?imsx-imsx). For example,
/(?s-i:more.*than).*million/i
is equivalent to the more verbose
/(?:(?s-i)more.*than).*million/i
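The two points in that passage are easy to check by hand. The following is a small sketch with sample strings of my own choosing (they do not come from perlre): it counts the fields split returns with and without capturing, then confirms that the compact and verbose modifier spellings agree.

```perl
use strict;
use warnings;

my $text = "1 a 2 b 3";    # hypothetical sample input

# Capturing group: split also returns the separators it matched.
my @with_capture = split /\b(a|b|c)\b/, $text;

# Clustering group: split returns only the text between separators.
my @without_capture = split /\b(?:a|b|c)\b/, $text;

print scalar(@with_capture),    " fields with ()\n";
print scalar(@without_capture), " fields with (?:)\n";

# Inline modifiers: both spellings make "more.*than" case-sensitive with
# dot matching newline, while ".*million" stays case-insensitive.
my $compact = qr/(?s-i:more.*than).*million/i;
my $verbose = qr/(?:(?s-i)more.*than).*million/i;

for my $sample ("more\nthan a MILLION", "MORE than a million") {
    my $c = $sample =~ $compact ? 1 : 0;
    my $v = $sample =~ $verbose ? 1 : 0;
    print "compact=$c verbose=$v\n";
}
```

The first sample matches under both spellings and the second matches under neither, because the uppercase MORE falls inside the case-sensitive part.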
Benchmark output:
              Rate     U  U/NC CH/NC/A CH/NC/A/U    CH CH/NC  null
        U  31811/s    --  -32%    -58%      -59%  -61%  -66%  -93%
     U/NC  46849/s   47%    --    -38%      -39%  -42%  -50%  -90%
  CH/NC/A  76119/s  139%   62%      --       -1%   -6%  -18%  -84%
CH/NC/A/U  76663/s  141%   64%      1%        --   -6%  -17%  -84%
       CH  81147/s  155%   73%      7%        6%    --  -13%  -83%
    CH/NC  92789/s  192%   98%     22%       21%   14%    --  -81%
     null 481882/s 1415%  929%    533%      529%  494%  419%    --
Code:
#! /usr/bin/perl
use warnings;
use strict;
use Benchmark qw/ :all /;
sub option_chain {
    local($_) = @_;
    /\d{4}(-\d{2}(-\d{2}( \d{2}(:\d{2}(:\d{2})?)?)?)?)?/
}
sub option_chain_nocap {
    local($_) = @_;
    /\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?/
}
sub option_chain_nocap_anchored {
    local($_) = @_;
    /\A\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?\z/
}
sub option_chain_anchored_unrolled {
    local($_) = @_;
    /\A\d\d\d\d(-\d\d(-\d\d( \d\d(:\d\d(:\d\d)?)?)?)?)?\z/
}
sub simple_split {
    local($_) = @_;
    split /[ :-]/;
}
sub unrolled {
    local($_) = @_;
    # Under /x, unescaped whitespace in the pattern is ignored, so the
    # literal space between date and time is written as [ ].
    grep defined($_), /\A (\d\d\d\d)-(\d\d)-(\d\d) [ ] (\d\d):(\d\d):(\d\d) \z
                      |\A (\d\d\d\d)-(\d\d)-(\d\d) [ ] (\d\d):(\d\d) \z
                      |\A (\d\d\d\d)-(\d\d)-(\d\d) [ ] (\d\d) \z
                      |\A (\d\d\d\d)-(\d\d)-(\d\d) \z
                      |\A (\d\d\d\d)-(\d\d) \z
                      |\A (\d\d\d\d) \z
                      /x;
}
sub unrolled_nocap {
    local($_) = @_;
    # Same alternatives without capture groups; [ ] is the literal space.
    grep defined($_), /\A \d\d\d\d-\d\d-\d\d [ ] \d\d:\d\d:\d\d \z
                      |\A \d\d\d\d-\d\d-\d\d [ ] \d\d:\d\d \z
                      |\A \d\d\d\d-\d\d-\d\d [ ] \d\d \z
                      |\A \d\d\d\d-\d\d-\d\d \z
                      |\A \d\d\d\d-\d\d \z
                      |\A \d\d\d\d \z
                      /x;
}
sub id { $_[0] }
my @examples = (
    "xyz",
    "2010",
    "2010-08",
    "2010-08-27",
    "2010-08-27 02",
    "2010-08-27 02:11",
    "2010-08-27 02:11:36",
);
cmpthese -1 => {
    "CH"        => sub { option_chain $_ for @examples },
    "CH/NC"     => sub { option_chain_nocap $_ for @examples },
    "CH/NC/A"   => sub { option_chain_nocap_anchored $_ for @examples },
    "CH/NC/A/U" => sub { option_chain_anchored_unrolled $_ for @examples },
    "U"         => sub { unrolled $_ for @examples },
    "U/NC"      => sub { unrolled_nocap $_ for @examples },
    "null"      => sub { id $_ for @examples },
};
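One caveat when comparing rows: the variants anchored with \A and \z accept only strings that consist entirely of a date, whereas CH and CH/NC find a date anywhere inside the string. The benchmark inputs happen to give the same answer either way, but other inputs may not. A rough sketch of the difference, using sample strings I made up for illustration:

```perl
use strict;
use warnings;

my $unanchored = qr/\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?/;
my $anchored   = qr/\A\d{4}(?:-\d{2}(?:-\d{2}(?: \d{2}(?::\d{2}(?::\d{2})?)?)?)?)?\z/;

# Hypothetical inputs: a clean date, a date embedded in other text,
# and a date followed by characters the pattern does not allow.
my @inputs = ("2010-08", "id 2010-08 draft", "2010-08-27T02");

my %result;
for my $input (@inputs) {
    $result{$input} = [
        ($input =~ $unanchored ? 1 : 0),
        ($input =~ $anchored   ? 1 : 0),
    ];
    printf "%-18s unanchored=%d anchored=%d\n", $input, @{ $result{$input} };
}
```

Only the first input passes the anchored pattern; all three pass the unanchored one, so pick the form that matches what "is a date" should mean for your data before picking the faster one.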