I am fairly new to regular expressions. I want to write a regular expression that validates that a given string contains only certain characters. If the string contains any character other than these, it should not match.

The characters I want are:

 & ' : , / - ( ) . # " ; A-Z a-z 0-9
+1  A: 

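/\A[A-Za-z0-9&':,\/().#";-]+\z/

Those so-called special characters are not special inside a character class. Only / needs a backslash here, because it is also the pattern delimiter, and the hyphen is literal because it comes last.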
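
A minimal sketch of this check wrapped in a reusable function. The name `is_valid` and the sample strings are illustrative assumptions, not part of the answer above:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $str is non-empty and contains only
# letters, digits, and the characters & ' : , / - ( ) . # " ;
sub is_valid {
    my ($str) = @_;
    return $str =~ /\A[A-Za-z0-9&':,\/().#";-]+\z/ ? 1 : 0;
}

for my $s ('a(b)&c', 'Apt.#4;2nd-floor', 'abbb%c', '') {
    printf "%-20s %s\n", "'$s'", is_valid($s) ? 'ok' : 'rejected';
}
```

The first two sample strings should be accepted; `abbb%c` and the empty string should be rejected.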

Sinan Ünür
+1  A: 

Try this:

$val =~ m/^[&':,\/\-().#";A-Za-z0-9]+$/;

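$val will match if it has at least one character and consists entirely of characters in that character set. An empty string will not be matched (if you want an empty string to match, change the last + to a *). One caveat: $ also matches just before a trailing newline, so "abc\n" is accepted; use \z instead of $ if that must be rejected.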
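
The effect of the quantifier and of the end anchor can be checked directly. This is an illustrative sketch, not part of the original answer; the variable names `$plus`, `$star`, and `$strict` are made up for the comparison:

```perl
use strict;
use warnings;

# Hypothetical names for the three variants being compared.
my $plus   = qr/^[&':,\/\-().#";A-Za-z0-9]+$/;    # at least one character
my $star   = qr/^[&':,\/\-().#";A-Za-z0-9]*$/;    # empty string allowed
my $strict = qr/\A[&':,\/\-().#";A-Za-z0-9]+\z/;  # \z: no trailing newline allowed

for my $s ('', 'abc', "abc\n") {
    (my $shown = $s) =~ s/\n/\\n/g;   # make the newline visible when printing
    printf "%-8s plus=%d star=%d strict=%d\n", "'$shown'",
        ($s =~ $plus ? 1 : 0), ($s =~ $star ? 1 : 0), ($s =~ $strict ? 1 : 0);
}
```

Only `$star` accepts the empty string, and only `$strict` rejects the string that ends in a newline.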

You can test it out yourself:

# Here's the file contents. $ARGV[0] is the first command-line parameter.
# We print out the matched text if we have a match, or nothing if we don't.
[/tmp]> cat regex.pl
$val = $ARGV[0];
print ($val =~ m/^[&':,\/\-().#";A-Za-z0-9]+$/g);
print "\n";

Some examples:

# Have to escape ( and & in the shell, since they have meaning.
[/tmp]> perl regex.pl a\(bc\&
a(bc&

[/tmp]> perl regex.pl abbb%c


[/tmp]> perl regex.pl abcx
abcx

[/tmp]> perl regex.pl 52
52

[/tmp]> perl regex.pl 5%2
John Feminella
Whoops -- thanks, @FM. Fixed.
John Feminella
Thanks, dude, that worked perfectly. Thanks a lot.
Teja Kantamneni
A: 

/^[&':,/-().#";A-Za-z0-9]*$/

DNNX
@DNNX Did you test this?
Sinan Ünür
+1  A: 

There are two main ways to construct a regular expression for this purpose. The first checks that every character in the string is allowed. The second checks that the string contains no character that is disallowed. You can also use the transliteration operator instead. Here's a benchmark:

use Benchmark 'cmpthese';

my @chars = ('0' .. '9', 'A' .. 'Z', 'a' .. 'z');
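# join is needed: map in scalar context would return only the element count (16).
my $randstr = join '', map $chars[rand @chars], 1 .. 16;
# Note: ++ is a magical string increment only when the string matches
# /^[a-zA-Z]*[0-9]*\z/; otherwise the value is converted to a number first.
sub nextstr() { return $randstr++ }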

cmpthese 1000000, {
    regex1 => sub { nextstr =~ /\A["#&'(),\-.\/0-9:;A-Za-z]*\z/ },
    regex2 => sub { nextstr !~ /[^"#&'(),\-.\/0-9:;A-Za-z]/ },
    tr     => sub { (my $dummy = nextstr) !~ y/"#&'(),\-.\/0-9:;A-Za-z/"#&'(),\-.\/0-9:;A-Za-z/c },
};

Results:

           Rate regex1 regex2     tr
regex1 137552/s     --   -41%   -60%
regex2 231481/s    68%     --   -32%
tr     341297/s   148%    47%     --
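
As a sanity check that the three benchmarked approaches agree, here is a small sketch. The helper names `all_allowed`, `none_rejected`, and `tr_count_zero` are hypothetical and do not appear in the benchmark above:

```perl
use strict;
use warnings;

# Hypothetical helpers, one per approach; each returns 1 if $s contains
# only allowed characters (the empty string counts as valid in all three).
sub all_allowed   { my ($s) = @_; return $s =~ /\A["#&'(),\-.\/0-9:;A-Za-z]*\z/ ? 1 : 0 }
sub none_rejected { my ($s) = @_; return $s !~ /[^"#&'(),\-.\/0-9:;A-Za-z]/ ? 1 : 0 }
sub tr_count_zero {
    my ($s) = @_;
    # With /c and an empty replacement list, tr counts the characters
    # outside the set and leaves $s unchanged.
    my $bad = ($s =~ tr/"#&'(),\-.\/0-9:;A-Za-z//c);
    return $bad == 0 ? 1 : 0;
}

for my $s ('a(bc&', 'abbb%c', '52', '5%2', '') {
    printf "%-8s %d %d %d\n", "'$s'",
        all_allowed($s), none_rejected($s), tr_count_zero($s);
}
```

Each row should show the same value in all three columns.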
codeholic
+1, this is an example of a very good answer! It was a pleasure to read, even though it wasn't my question and I knew the answer myself :)
Igor Korkhov
Remember to post the perl and machine you use with any benchmark. Those results aren't portable. :)
brian d foy
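
One way to capture that information alongside a benchmark, as a minimal sketch (the `$report` variable name is illustrative):

```perl
use strict;
use warnings;
use Config;

# $^V is the version of the running perl, $^O the operating system name,
# and $Config{archname} the architecture perl was built for.
my $report = sprintf "perl %vd on %s (%s)", $^V, $^O, $Config{archname};
print "$report\n";
```

The output depends on the installation, so it is worth pasting verbatim next to any posted timings.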