Hi all,

The following syntax is part of a Perl script that uses an Irregular Expression. As we can see, its goal is to match a VALID IP address such as 123.33.44.5 or 255.255.0.0.

But how do I change it if I also want to validate IPs that contain a star?

For example:

  124.33.*.* 

(I want to valid also the * character as valid number 1-255)

Examples of valid IPs:

*.1.23.4

123.2.*.*

*.*.*.*

*.*.198.20

Examples of invalid IPs:

 123.**.23.3

 289.2.2.2

 21.*.*.1000

" *.*.*.**"
#

My original code:

my $octet = qr/[01]?\d\d?|2[0-4]\d|25[0-5]/; 

my $ip = qr/ \b 
         (?!0+\.0+\.0+\.0+\b) 
         $octet(?:\.$octet){3} 
         \b 
       /x;
A: 

They are called regular expressions (not Irregular Expressions).

If you want a specific prefix, you could just say

my $octet = qr/[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]/;
my $ip = qr/\b124[.]33[.]$octet[.]$octet\b/;

A warning about your regex though: as of Perl 5.8, \d no longer matches just [0-9]. Instead it matches any Unicode decimal digit character, so the string "᠐.᠐.᠐.᠐" will match (which is especially bad, since that is 0.0.0.0 in Mongolian digits). Always use [0-9] instead of \d unless you want such matches.
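
A minimal sketch of that difference, assuming a Perl new enough to support the /a modifier (5.14 or later). The variable names are only for illustration:

```perl
use strict;
use warnings;

my $mongolian_zero = "\x{1810}";    # U+1810 MONGOLIAN DIGIT ZERO

# \d follows Unicode rules on this string; [0-9] and \d under /a stay ASCII-only.
my $matches_d       = $mongolian_zero =~ /\A\d\z/    ? 1 : 0;
my $matches_class   = $mongolian_zero =~ /\A[0-9]\z/ ? 1 : 0;
my $matches_d_ascii = $mongolian_zero =~ /\A\d\z/a   ? 1 : 0;

print "\\d:         $matches_d\n";          # 1
print "[0-9]:      $matches_class\n";      # 0
print "\\d with /a: $matches_d_ascii\n";    # 0
```

So if you prefer to keep \d in the pattern, adding /a is another way to get ASCII-only digits.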

Is this what you are looking for?

#!/usr/bin/perl

use strict;
use warnings;

sub wildcard_to_regex {
    my $wildcard = shift;
    my @octets   = split /[.]/, $wildcard;

    for my $octet (@octets) {
        next unless $octet eq "*";
        $octet = qr/[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]/;
    }

    my $regex = '\b' . join("[.]", @octets) . '\b';
    return qr/$regex/;
}

for my $wildcard (qw/8.8.8.8 *.*.*.0 1.1.1.* 1.1.*.1/) {
    my $regex = wildcard_to_regex($wildcard);

    print "$wildcard\n";
    for my $test (qw/1.1.1.0 1.1.1.1 1.1.2.1/) {
        print "\t$test ",
            $test =~ $regex ? "matched" : "didn't match", "\n";
    }
}

It prints

8.8.8.8
        1.1.1.0 didn't match
        1.1.1.1 didn't match
        1.1.2.1 didn't match
*.*.*.0
        1.1.1.0 matched
        1.1.1.1 didn't match
        1.1.2.1 didn't match
1.1.1.*
        1.1.1.0 didn't match
        1.1.1.1 matched
        1.1.2.1 didn't match
1.1.*.1
        1.1.1.0 didn't match
        1.1.1.1 matched
        1.1.2.1 matched
Chas. Owens
@Chas - I think that's not what the user wanted. They want the string `124.33.*.*` to match the regex.
DVK
@DVK I read "I want to valid also the * character as valid number 1-255" as meaning he or she wants the place held by * to be a valid number between 1 and 255. Of course, that means my answer is still incomplete because the original `$octet` regex matches `0`.
Chas. Owens
So I will wait for you. I want to see the completed answer. Thanks for your help.
jon
@jon I updated it right after I made that comment (I added `$octet` instead of having you use the version in your original code).
Chas. Owens
Sorry, but this is not the point. The goal is to search files on the system for valid IPs, and they can also contain a star "*", as in "*.123.3.3".
jon
+3  A: 

You have 2 problems:

  1. Need to add "*" to octet definition.

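  2. Much worse: "*" is not a word character (\w), so the word boundary \b does not match next to it (for example between a space and the "*" in "*.10.10.10"). So you should instead use an explicit character class for the IP boundary: [^\d*]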

    my $octet = qr/[01]?\d\d?|2[0-4]\d|25[0-5]|[*]/; 
    my $ip = qr/\b0+\.0+\.0+\.0+\b|(?:[^\d*]|^)$octet(?:[.]$octet){3}([^\d*]|$)/x;
    
    
    foreach my $str (@ip_list) {    # @ip_list holds the strings to test
        print "$str - ";
        print "NO " if $str !~ $ip;
        print "match\n";
    }
    

OUTPUT:

1.1.1.1 - match
123.1.*.* - match
1.*.3.4 - match
*.192.2.2 - match
23.*.3.3 - match
*.1.23.4 - match
123.2.*.* - match
*.*.*.* - match
*.*.198.20 - match

123.**.23.3 - NO match
289.2.2.2 - NO match
21.*.*.1000 - NO match
*.*.*.** - NO match

11.12.13.14 - match
1.*.3.4 - match
1.*.3.* - match
0.00.0.0 - match
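
A small sketch of why \b gets in the way, using a made-up sample line (the string and variable names are assumptions for illustration, not taken from the real files):

```perl
use strict;
use warnings;

my $line = "allow *.10.10.10";

# \b only matches between a word character and a non-word character.
# A space followed by "*" is two non-word characters, so \b fails there.
my $with_b = $line =~ /\b[*][.]10[.]10[.]10\b/
    ? "match" : "NO match";

# Testing for "not a digit and not a star, or start/end of string" works.
my $with_class = $line =~ /(?:[^\d*]|^)[*][.]10[.]10[.]10(?:[^\d*]|\z)/
    ? "match" : "NO match";

print "\\b version:    $with_b\n";        # NO match
print "class version: $with_class\n";    # match
```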
DVK
@DVK: 1.*.3.4 is also a valid IP. The "*" char can be located as "*.xxx.xxx.xxx" or as "xxx.*.xxx.xxx", etc., and each of these is a valid IP.
jon
For example, all of the following are valid: "*.12.3.4" or "87.*.3.3".
jon
The "*" char can sit between the dots (. * .) and this is valid.
jon
@DVK Can you make that small change in your code to support my needs?
jon
Yes, but I also need to support, for example, "*.192.2.2" or "23.*.3.3", etc.
jon
See my updated question.
jon
For some reason it does not match "*.10.10.10". Why?
jon
Note: the "*.10.10.10" IP address does exist in the file!
jon
I found the problem - `\b` in the original regex was the culprit. Fixed.
DVK
What can I say, excellent work. I don't have the words; it feels like touching the sky. Very, very good.
jon
How many years have you been writing Perl?
jon
your solution = 1000$
jon
You are welcome. Too many years to count :)
DVK
+1  A: 

Why do you need to do this? You should instead use proper CIDR notation, e.g. 124.33/16, and then you can use standard Net::IP::* modules to handle the IP ranges.

Ether
Because my system files also contain IPs such as 128.2.*.*
jon
The goal of my code is to match all valid IPs, including IPs with *.
jon
Please let me know if you understand my answer.
jon
@jon: my point is that your system files should not specify IP address ranges in this manner - it is non-standard.
Ether
I see, but our file configuration fits our needs, so it doesn't matter whether it is standard or not, and we can't change the file rules.
jon
So subclass `Net::IP` to accept this format. It isn't hard. `my $count = ( $ip =~ s/(?:\.\*)//g ); $ip .= '/' . (8 * (4 - $count)) if $count`
Evan Carroll
A: 

This should be obvious, but subclass Net::IP. Here I subclass it to SillyIP, and wrap the set function. In real life I'd probably subclass this to Net::IP::SillyIP though.

package SillyIP;
use Moose;

extends 'Net::IP';

around 'set' => sub {
  my ( $orig , $self, $ip, @args ) = @_;

  die "invalid IP" if $ip =~ /\*{2}|\s/;

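  if ( $ip =~ /\.\*/ ) {
    my $count = () = $ip =~ /\.\*/g;     # number of wildcard octets
    $ip =~ s/(?:\.\*)+$//;               # strip the trailing wildcards
    $ip .= '/' . ( (4 - $count) * 8 );   # e.g. 10.*.*.* becomes 10/8
  }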

  $self->$orig( $ip, @args );

};

1;

package main;

use Test::More tests => 5;

eval { SillyIP->new('10.**.1.1') };
ok ( $@, 'Fail to accept **' );

eval { SillyIP->new(' 10.0.1.1 ') };
ok ( $@, 'Fail to accept spaces in ip' );

is ( SillyIP->new('10.*.*.*')->ip, SillyIP->new('10/8')->ip, '8 bit network' );
is ( SillyIP->new('192.168.*.*')->ip, SillyIP->new('192.168/16')->ip, '16 bit network' );
is ( SillyIP->new('192.168.1.*')->ip, SillyIP->new('192.168.1/24')->ip, '24 bit network' );

This will provide 90% of what you're asking. It doesn't, however, accept * as a range for an arbitrary octet. This is because IPv4 addresses aren't decimal. They're actually just 32-bit data structures, which can be displayed as a.b.c.d so long as each of {a,b,c,d} is in the range 0-255 (8 bits). This means that asking for *.2.3.4 to represent 1.2.3.4 and 2.2.3.4, but not 1.2.3.5, has no technical merit whatsoever. There is no reason to ever need this. But you could accomplish it with a quick binary algorithm.
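
One possible reading of that "quick binary algorithm", sketched under assumptions: each pattern becomes a 32-bit value plus a mask, and an address matches when its masked bits equal the value. The function names (pattern_to_mask, ip_to_int, pattern_matches) are hypothetical, the inputs are assumed to be already validated dotted quads, and * is treated here as any octet 0-255.

```perl
use strict;
use warnings;

# Turn a pattern such as "10.*.3.*" into a (value, mask) pair of integers.
sub pattern_to_mask {
    my ($pattern) = @_;
    my ( $value, $mask ) = ( 0, 0 );
    for my $octet ( split /[.]/, $pattern ) {
        $value <<= 8;
        $mask  <<= 8;
        next if $octet eq '*';    # wildcard: leave these 8 bits unmasked
        $value |= $octet;
        $mask  |= 0xFF;
    }
    return ( $value, $mask );
}

# Pack a dotted quad into a single 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    return unpack 'N', pack 'C4', split /[.]/, $ip;
}

# True when every non-wildcard octet of the pattern equals the address octet.
sub pattern_matches {
    my ( $pattern, $ip ) = @_;
    my ( $value, $mask ) = pattern_to_mask($pattern);
    return ( ip_to_int($ip) & $mask ) == $value;
}

print pattern_matches( '*.2.3.4', '1.2.3.4' ) ? "match\n" : "NO match\n";    # match
print pattern_matches( '*.2.3.4', '1.2.3.5' ) ? "match\n" : "NO match\n";    # NO match
```

When the wildcards are all trailing, the mask is exactly a CIDR netmask, which is why that case maps onto Net::IP so easily.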

Evan Carroll
@Evan - TIMTOWTDI and all, but this introduces a new class, imports Moose, AND Net::IP; and has 11 new lines of code. Versus 2 lines of code when doing it via RegEx (4 lines if you want to wrap the regex into a sub)... Wee bit of an overkill, IMHO, while not having ANY of the benefits of pre-canned CPAN library (e.g. widely tested/reliable) due to having roll-your-own code anyway.
DVK
Well, you're categorically wrong here. Subclassing claims **ALL** of the benefits of the module you're sub-classing, less any problems you introduce -- to claim otherwise just shows a profound misunderstanding of OO. Your solution can lead one to believe that it is OK to treat non-strings as strings, which just isn't so. It isn't even OK to treat HTML -- which is kind of a string -- as a string. Regexes don't belong in some places -- this is a great example of where you should stay away from them. And SLOC has nothing to do with *overkill*: you're the one reimplementing IP validation.
Evan Carroll
+1  A: 
#!/usr/bin/perl

use strict;
use warnings;

sub is_ip_with_wildcards {
    my ($ip) = @_;
    my @octets = split / [.] /xms, $ip;    
    return 4 == grep { $_ eq q{*} || m/ \A [0-9]+ \z /xms && $_ < 256 } @octets;
}

while( defined( my $line = <> ) ) {
    if( my @ips = grep { is_ip_with_wildcards( $_ ) } split q{ }, $line ) {
        print 'found IPs: ', join(q{, }, @ips), "\n";
    }
}
Daniel Holz
@Daniel - +1,... I LIKE the idea of not being stuck in RegEx land :)
DVK