I have an array whose values are user input, like:

aa df rrr5 4323 54 hjy 10 gj @fgf %d

Now I want to check each value in array to see whether it's numeric, alphabetic (a-zA-Z), or alphanumeric and save them in other respective arrays.

I have done:

my @num;
my @char;
my @alphanum;

my $str = <>;
my @temp = split(" ", $str);
foreach (@temp) {
    print "input : $_ \n";
    if ($_ =~ /^(\d+\.?\d*|\.\d+)$/) {
        push(@num, $_);
    }
}

This works. Similarly, I want to check for alphabetic and alphanumeric values.

Examples of alphanumeric values: fr43 6t$ $eed5 *jh

A: 

Alphabet:

 /^[a-z]+$/i

What most people mean by alphanumeric:

 /^[a-z0-9]+$/i

BUT WAIT:

note: alphanumeric ex. fr43 6t$ $eed5 *jh

I didn't understand this, but judging from your comment below and the quote above, what you mean by alphanumeric might be achieved by

 /^[[:graph:]]+$/

That matches one or more printable characters, excluding whitespace. For ASCII input this means letters, digits, and punctuation.
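To see how the three patterns above relate, here is a minimal sketch that reports which of them each token matches. The helper name matching_patterns and the sample tokens are illustrative assumptions, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: list the patterns from this answer that a token matches.
sub matching_patterns {
    my ($token) = @_;
    my @hits;
    push @hits, 'alpha' if $token =~ /^[a-z]+$/i;
    push @hits, 'alnum' if $token =~ /^[a-z0-9]+$/i;
    push @hits, 'graph' if $token =~ /^[[:graph:]]+$/;
    return @hits;
}

# Single-quoted so that $ and * are taken literally.
for my $token ('aa', 'rrr5', '4323', '6t$', '*jh') {
    printf "%-5s => %s\n", $token, join(',', matching_patterns($token));
}
```

Every purely alphabetic token also matches the other two patterns, so the order of the tests matters if each token should end up in only one array.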

Hope this has solved your problem.

Kinopiko
@Kinopiko: the note was to indicate that special characters are allowed in alphanumeric values but not in alphabetic ones
dexter
I took that as anything that isn't numeric or alphabetic gets classified as alphanumeric, regardless of what characters there are.
ysth
+6  A: 

Perl supports POSIX character classes, so you can actually do this:

$string =~ /^[[:alpha:]]+$/;
$string =~ /^[[:alnum:]]+$/;

Numbers are less well defined, but Scalar::Util's looks_like_number function may do what you want it to do.
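As a rough illustration of what looks_like_number accepts, the sketch below runs it over a few sample tokens. The token list is an assumption chosen for the example; only looks_like_number itself comes from Scalar::Util.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Sample tokens chosen for illustration only.
for my $token ('54', '-3.14', '.5', '1e5', '0x10', '12abc', 'abc') {
    printf "%-6s %s\n", $token,
        looks_like_number($token) ? 'number' : 'not a number';
}
```

Decimal, signed, and exponent forms are accepted, while hexadecimal literals and strings with trailing letters are not. Inf and Infinity are also treated as numbers, so test for the alphabetic class first if such words should count as text.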

Leon Timmermans
? that checks whether string is a single alpha or alnum character (possibly with a following \n)
ysth
Sorry about that, I screwed up my edit. It's fixed now.
Leon Timmermans
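The remark about a following \n applies to the $ anchor in general: it also matches just before a final newline. A small sketch of the difference between $ and \z, with hypothetical helper names, in case the input is not split on whitespace or chomped first:

```perl
use strict;
use warnings;

# Hypothetical helpers comparing the two end-of-string anchors.
sub alpha_with_dollar { return $_[0] =~ /^[[:alpha:]]+$/  ? 1 : 0 }
sub alpha_with_z      { return $_[0] =~ /^[[:alpha:]]+\z/ ? 1 : 0 }

my $line = "abc\n";    # e.g. a line read from STDIN without chomp

print "with \$:  ", alpha_with_dollar($line), "\n";   # $ tolerates one trailing newline
print "with \\z: ", alpha_with_z($line), "\n";        # \z matches only at the very end
```

In the question the tokens come from split(" ", ...), which discards the trailing newline, so either anchor behaves the same there.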
+2  A: 

The answer you accepted doesn't produce the results you say you want in your question. Specifically, the POSIX character class [:alnum:] does not match punctuation characters, so 6t$ $eed5 *jh will not be matched. To match punctuation characters, you need to add [:punct:] to the character class. See the Regex cheat sheet.

So for example if you have the file tokens.txt which contains:

aa df rrr5 4323 54 hjy 10 gj @fgf %d fr43 6t$ $eed5 *jh

And you run this perl script:

#!/usr/bin/perl -w
use warnings;
use diagnostics;
use strict;
use Scalar::Util qw( looks_like_number );


my $str =<>;
my @temp = split(" ",$str);

my @num = grep { looks_like_number($_) } @temp;
my @char = grep /^[[:alpha:]]+$/, @temp;
my @alphanum = grep /^[[:alnum:][:punct:]]+$/, @temp;

print "Numbers: " . join(' ', @num) . "\n";
print "Alpha: " . join(' ', @char) . "\n";
print "Alphanum: " . join(' ', @alphanum) . "\n";

like this:

cat tokens.txt | ./tokenize.pl

You get the output:

Numbers: 4323 54 10
Alpha: aa df hjy gj
Alphanum: aa df rrr5 4323 54 hjy 10 gj @fgf %d fr43 6t$ $eed5 *jh

However, it seems from your question that you don't want to match all punctuation characters, such as @ and %, but only certain ones, such as $ and *.

If that's the case then you just change the Alphanum match to:

my @alphanum = grep /^[[:alnum:]\$\*]+$/, @temp;

Which will then give you the desired output of

Numbers: 4323 54 10
Alpha: aa df hjy gj
Alphanum: aa df rrr5 4323 54 hjy 10 gj fr43 6t$ $eed5 *jh
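Note that the three greps above are independent, so aa or 54 lands in more than one array. If each token should go into exactly one array, with everything that is neither alphabetic nor numeric counted as alphanumeric (the reading suggested in a comment on another answer), a sketch could look like this. The classify helper and its bucket names are assumptions made for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: put each token into exactly one bucket.
sub classify {
    my (@tokens) = @_;
    my %bucket = (num => [], alpha => [], alphanum => []);
    for my $token (@tokens) {
        # Alphabetic is tested first so that words are never taken for numbers.
        my $kind = $token =~ /^[[:alpha:]]+$/ ? 'alpha'
                 : looks_like_number($token)  ? 'num'
                 :                              'alphanum';
        push @{ $bucket{$kind} }, $token;
    }
    return \%bucket;
}

# Single-quoted so that @fgf and $eed5 are not interpolated.
my $result = classify(split ' ', 'aa df rrr5 4323 54 hjy 10 gj @fgf %d fr43 6t$ $eed5 *jh');
print "Numbers: @{ $result->{num} }\n";
print "Alpha: @{ $result->{alpha} }\n";
print "Alphanum: @{ $result->{alphanum} }\n";
```

To restrict the last bucket to particular punctuation, add a fourth branch with a narrower character class before the fallback.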

Robert S. Barnes
+1  A: 

For separating the input into arrays, something like this would work, and it allows easy additions or changes to your matches.

my $input = 'aa df rrr5 4323 54 hjy 10 gj @fgf %d';
my %tests = ( 
    num   => '\d+',
    alpha => '[[:alpha:]]+', 
    alnum => '[[:alnum:]]+' 
);

my %res;
for my $t (keys %tests) {
    for (split(' ', $input)) {
        push(@{ $res{$t} }, $_) if (/^$tests{$t}$/i);
    }
}
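A variant of the same table-driven idea, sketched with precompiled qr// patterns and a printout of the result. The bucket_tokens name is hypothetical, and the anchors are moved into the patterns themselves.

```perl
use strict;
use warnings;

# One entry per category; add or change patterns here.
my %tests = (
    num   => qr/^\d+$/,
    alpha => qr/^[[:alpha:]]+$/,
    alnum => qr/^[[:alnum:]]+$/,
);

# Hypothetical helper: returns a hash of category name => array of tokens.
sub bucket_tokens {
    my ($input) = @_;
    my %res = map { $_ => [] } keys %tests;
    for my $token (split ' ', $input) {
        for my $name (keys %tests) {
            push @{ $res{$name} }, $token if $token =~ $tests{$name};
        }
    }
    return %res;
}

my %res = bucket_tokens('aa df rrr5 4323 54 hjy 10 gj @fgf %d');
printf "%s: %s\n", $_, join(' ', @{ $res{$_} }) for sort keys %res;
```

As with the original loop, a token can appear under several keys, and tokens such as @fgf and %d match none of the three patterns.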
Ryan Zachry
A: 

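If you want Perl itself to decide whether a value is numeric, you can convert it to a number and check that it converts back to the same string:

sub test_num {
    no warnings "all";
    my $str = "$_[0]";    # copy the argument as a string
    my $num = $str + 0;   # force numeric context on the copy
    return ($num eq $str);
}
push(@num, $tmp) if (test_num($tmp));

(The reason for copying with "$_[0]" is that otherwise the original variable - $tmp - is brought into numeric context inside the test_num function, which is an undesired side effect. Lexical variables are used instead of $a and $b because those are the globals that sort uses.)

Note that this only accepts strings that are already in the form Perl itself would print: 54 and 3.14 pass, but 1.0 and 1e3 do not, because they convert back to 1 and 1000.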
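To check how this round-trip test behaves, here is a small self-contained sketch that applies the same idea to a few sample strings. The name round_trips and the sample values are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string survives a trip through numeric context.
sub round_trips {
    my ($value) = @_;
    no warnings 'numeric';          # silence "isn't numeric" for non-numbers
    my $as_string = "$value";
    my $as_number = $as_string + 0;
    return $as_number eq $as_string ? 1 : 0;
}

for my $sample ('54', '3.14', '-5', '1.0', '1e3', 'abc') {
    printf "%-5s %s\n", $sample, round_trips($sample) ? 'accepted' : 'rejected';
}
```

Only strings that Perl would print back unchanged are accepted, so 1.0 and 1e3 are rejected along with abc. If those forms should count as numbers, Scalar::Util's looks_like_number is the more forgiving choice.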

Anatoli