ansaurus

Question

perl + Irregular Expression (VALID IP + ADD Valid Rule)

Answer 1

A:

They are called regular expressions (not Irregular Expressions).

If you want a specific prefix, you could just say

my $octet = qr/[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]/;
my $ip = qr/\b124[.]33[.]$octet[.]$octet\b/;

A warning about your regex though: as of Perl 5.8, \d no longer matches just [0-9]. Instead it matches any Unicode decimal digit character, so the string "᠐.᠐.᠐.᠐" will match (that is 0.0.0.0 written with Mongolian digits). Always use [0-9] instead of \d unless you want such matches.
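A minimal sketch of that difference, assuming Perl 5.8 or later. The variable names and the simplified dotted-quad patterns are made up for this illustration and are not taken from the question:

```perl
use strict;
use warnings;

# Four Mongolian zero digits (U+1810) separated by dots.
my $mongolian = join '.', ("\x{1810}") x 4;

# \d accepts any Unicode decimal digit, so this "address" passes.
my $loose = $mongolian =~ /\A\d+(?:\.\d+){3}\z/ ? 1 : 0;

# [0-9] accepts ASCII digits only, so the same string is rejected.
my $strict = $mongolian =~ /\A[0-9]+(?:\.[0-9]+){3}\z/ ? 1 : 0;

print "\\d matched: $loose, [0-9] matched: $strict\n";
```

The loose pattern reports a match and the strict one does not, which is why the ASCII class is the safer default when validating addresses.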

Is this what you are looking for?

#!/usr/bin/perl

use strict;
use warnings;

sub wildcard_to_regex {
    my $wildcard = shift;
    my @octets   = split /[.]/, $wildcard;

    for my $octet (@octets) {
        next unless $octet eq "*";
        $octet = qr/[1-9][0-9]?|1[0-9][0-9]|2[0-4][0-9]|25[0-5]/;
    }

    my $regex = '\b' . join("[.]", @octets) . '\b';
    return qr/$regex/;
}

for my $wildcard (qw/8.8.8.8 *.*.*.0 1.1.1.* 1.1.*.1/) {
    my $regex = wildcard_to_regex($wildcard);

    print "$wildcard\n";
    for my $test (qw/1.1.1.0 1.1.1.1 1.1.2.1/) {
        print "\t$test ",
            $test =~ $regex ? "matched" : "didn't match", "\n";
    }
}

It prints

8.8.8.8
        1.1.1.0 didn't match
        1.1.1.1 didn't match
        1.1.2.1 didn't match
*.*.*.0
        1.1.1.0 matched
        1.1.1.1 didn't match
        1.1.2.1 didn't match
1.1.1.*
        1.1.1.0 didn't match
        1.1.1.1 matched
        1.1.2.1 didn't match
1.1.*.1
        1.1.1.0 didn't match
        1.1.1.1 matched
        1.1.2.1 matched

Chas. Owens 2010-09-07 19:02:00

@Chas - I think that's not what the user wanted. They want the string `124.33.*.*` to match the regex.

DVK 2010-09-07 19:03:30

@DVK I read "I want to valid also the * character as valid number 1-255" as meaning he or she wants the place held by * to be a valid number between 1 and 255. Of course, that means my answer is still incomplete because the original `$octet` regex matches `0`.

Chas. Owens 2010-09-07 19:10:00

So I will wait for you. I want to see the completed answer. Thanks for your help.

jon 2010-09-07 19:26:24

@jon I updated it right after I made that comment (I added `$octet` instead of having you use the version in your original code).

Chas. Owens 2010-09-07 19:47:38

Sorry, but this is not the point. The goal is to search files on the system for valid IPs, and they can also contain a star "*", as in "*.123.3.3".

jon 2010-09-07 20:19:16

Answer 2

+3 A:

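You have 2 problems:

1. You need to add "*" to the octet definition.

2. Much worse: "*" is not a word character (\w), so the word boundary \b does not match between "*" and a neighbouring non-word character such as a space or the start of the line. You should instead use an explicit character class for the IP boundary: [^\d*]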

my $octet = qr/[01]?\d\d?|2[0-4]\d|25[0-5]|[*]/; 
my $ip = qr/\b0+\.0+\.0+\.0+\b|(?:[^\d*]|^)$octet(?:[.]$octet){3}([^\d*]|$)/x;


# @ip_list holds the candidate strings to test
foreach my $str (@ip_list) {
    print "$str - ";
    print "NO " if $str !~ $ip;
    print "match\n";
}

OUTPUT:

1.1.1.1 - match
123.1.*.* - match
1.*.3.4 - match
*.192.2.2 - match
23.*.3.3 - match
*.1.23.4 - match
123.2.*.* - match
*.*.*.* - match
*.*.198.20 - match

123.**.23.3 - NO match
289.2.2.2 - NO match
21.*.*.1000 - NO match
*.*.*.** - NO match

11.12.13.14 - match
1.*.3.4 - match
1.*.3.* - match
0.00.0.0 - match
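The word-boundary problem described in this answer can be reproduced in isolation. This is only a sketch: the sample line is invented, and the lookbehind/lookahead form of the boundary is an assumption of the sketch, whereas the answer itself uses consuming groups such as (?:[^\d*]|^).

```perl
use strict;
use warnings;

my $line = "allow *.10.10.10 today";

# \b needs a word character on exactly one side. Between the space and
# the "*" there is none, so a pattern anchored with \b cannot start there.
my $with_b = $line =~ /\b[*]\.10\.10\.10\b/ ? 1 : 0;

# "Not preceded or followed by a digit or a star" works for both
# numeric and wildcard octets.
my $explicit = $line =~ /(?<![0-9*])[*]\.10\.10\.10(?![0-9*])/ ? 1 : 0;

print "with \\b: $with_b, explicit boundary: $explicit\n";
```

The \b version fails on the leading "*" while the explicit boundary succeeds, which matches the "*.10.10.10" symptom reported in the comments.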

DVK 2010-09-07 19:02:42

@DVK: 1.*.3.4 is also a valid IP. The "*" char can appear as "*.xxx.xxx.xxx" or as "xxx.*.xxx.xxx", etc.

jon 2010-09-07 20:13:25

For example, both of the following are valid: "*.12.3.4" and "87.*.3.3".

jon 2010-09-07 20:20:59

The "*" char can sit between two dots (. * .) and this is valid.

jon 2010-09-07 20:22:02

@DVK can you make the little change in your code in order to support my needs?

jon 2010-09-07 20:29:42

Yes, but I also need to support, for example, "*.192.2.2" or "23.*.3.3", etc.

jon 2010-09-07 20:37:27

See my updated question.

jon 2010-09-07 20:41:26

For some reason it does not match "*.10.10.10". Why?

jon 2010-09-07 20:49:38

Note: the "*.10.10.10" IP address exists in the file!

jon 2010-09-07 21:01:34

I found the problem - `\b` in the original regex was the culprit. Fixed.

DVK 2010-09-07 21:04:26

What can I say, excellent work. I don't have the words, it is like touching the sky... very, very good.

jon 2010-09-07 21:12:25

How many years have you been writing Perl?

jon 2010-09-07 21:13:17

your solution = 1000$

jon 2010-09-07 21:16:25

You are welcome. Too many years to count :)

DVK 2010-09-08 01:34:26

Answer 3

+1 A:

Why do you need to do this? You should instead use proper CIDR notation, e.g. 124.33/16, and then you can use the standard Net::IP::* modules to handle the IP ranges.
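If the files cannot be changed, a trailing-wildcard pattern can still be translated to CIDR before it is handed to a module. The following is a rough sketch in plain Perl (5.10 or later for the // operator). The helper name wildcard_to_cidr is hypothetical and not part of Net::IP or any other module:

```perl
use strict;
use warnings;

# Hypothetical helper: turn "124.33.*.*" into "124.33.0.0/16".
# Returns undef when the wildcards are not all trailing, because such a
# pattern has no single CIDR equivalent.
sub wildcard_to_cidr {
    my ($pattern) = @_;
    my @octets = split /[.]/, $pattern, -1;
    return unless @octets == 4;

    # Count the leading octets that are plain numbers in 0..255.
    my $fixed = 0;
    $fixed++
        while $fixed < 4
        && $octets[$fixed] =~ /\A[0-9]{1,3}\z/
        && $octets[$fixed] <= 255;

    # Everything after the fixed part must be "*".
    return if grep { $_ ne '*' } @octets[ $fixed .. 3 ];

    my @network = ( @octets[ 0 .. $fixed - 1 ], (0) x ( 4 - $fixed ) );
    return join( '.', @network ) . '/' . ( 8 * $fixed );
}

print wildcard_to_cidr($_) // 'no CIDR equivalent', "\n"
    for qw(124.33.*.* 10.*.*.* 192.168.1.* 1.*.3.4);
```

A pattern such as 1.*.3.4 comes back as undef, which is the case where CIDR cannot express what the file format allows.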

Ether 2010-09-07 19:19:24

Because my system files also contain IPs such as 128.2.*.*

jon 2010-09-07 19:22:28

The goal of my code is to match all valid IPs, including IPs with *.

jon 2010-09-07 19:23:11

please give me feedback if you understand my answer

jon 2010-09-07 19:24:30

@jon: my point is that your system files should not specify IP address ranges in this manner - it is non-standard.

Ether 2010-09-07 19:27:06

I see, but our file configuration fits our needs, so it doesn't matter whether it is standard or not, and we can't change the file rules.

jon 2010-09-07 19:30:38

So subclass `Net::IP` to accept this format. It isn't hard. `my $count = ( $ip =~ s/(?:\.\*)//g ); $ip .= '/' . (8*(4-$count)) if $count`

Evan Carroll 2010-09-07 20:21:17

Answer 4

A:

This should be obvious, but subclass Net::IP. Here I subclass it to SillyIP, and wrap the set function. In real life I'd probably subclass this to Net::IP::SillyIP though.

package SillyIP;
use Moose;

extends 'Net::IP';

around 'set' => sub {
  my ( $orig , $self, $ip, @args ) = @_;

  die "invalid IP" if $ip =~ /\*{2}|\s/;

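  # Strip the trailing ".*" octets and derive the prefix length from how
  # many there were; each ".*" is two characters long.
  if ( $ip =~ s/((?:\.\*)+)$// ) {
    my $count = length($1) / 2;
    $ip .= '/' . ( ( 4 - $count ) * 8 );
  }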

  $self->$orig( $ip, @args );

};

1;

package main;

use Test::More tests => 5;

eval { SillyIP->new('10.**.1.1') };
ok ( $@, 'Fail to accept **' );

eval { SillyIP->new(' 10.0.1.1 ') };
ok ( $@, 'Fail to accept spaces in ip' );

is ( SillyIP->new('10.*.*.*')->ip, SillyIP->new('10/8')->ip, '8 bit network' );
is ( SillyIP->new('192.168.*.*')->ip, SillyIP->new('192.168/16')->ip, '16 bit network' );
is ( SillyIP->new('192.168.1.*')->ip, SillyIP->new('192.168.1/24')->ip, '24 bit network' );

This will provide 90% of what you're asking. It doesn't, however, accept * as a range for an arbitrary octet. This is because IPv4 addresses aren't decimal. They're actually just 32-bit data structures, which can be displayed as a.b.c.d so long as each of {a,b,c,d} is in the range 0-255 (8 bits). This means that asking for *.2.3.4 to represent 1.2.3.4 and 2.2.3.4, but not 1.2.3.5, has no technical merit whatsoever. There is no reason to ever need this. But you could accomplish it with a quick binary algorithm.
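One way such a binary algorithm might look is sketched below: the pattern is compiled into a value and a mask, and an address matches when its masked bits equal the value. The helper names compile_pattern and ip_matches are hypothetical, and both inputs are assumed to have been validated as four dot-separated fields already.

```perl
use strict;
use warnings;

# Hypothetical helper: compile "1.*.3.4" into a (value, mask) pair of
# 32-bit integers. Wildcard octets contribute zero bits to the mask.
sub compile_pattern {
    my ($pattern) = @_;
    my ( $value, $mask ) = ( 0, 0 );
    for my $octet ( split /[.]/, $pattern, -1 ) {
        my $is_star = $octet eq '*';
        $value = ( $value << 8 ) | ( $is_star ? 0 : 0 + $octet );
        $mask  = ( $mask << 8 )  | ( $is_star ? 0 : 0xFF );
    }
    return ( $value, $mask );
}

# Pack the dotted quad into one integer, then compare only masked bits.
sub ip_matches {
    my ( $ip, $pattern ) = @_;
    my $addr = unpack 'N', pack 'C4', split /[.]/, $ip;
    my ( $value, $mask ) = compile_pattern($pattern);
    return ( $addr & $mask ) == $value ? 1 : 0;
}

print "$_: ", ip_matches( $_, '*.2.3.4' ), "\n"
    for qw(1.2.3.4 2.2.3.4 1.2.3.5);
```

With a pattern such as *.2.3.4 the first two addresses match and the third does not, since only the lower 24 bits are compared.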

Evan Carroll 2010-09-07 21:35:51

@Evan - TIMTOWTDI and all, but this introduces a new class, imports Moose AND Net::IP, and has 11 new lines of code. Versus 2 lines of code when doing it via RegEx (4 lines if you want to wrap the regex into a sub)... Wee bit of an overkill, IMHO, while not having ANY of the benefits of a pre-canned CPAN library (e.g. widely tested/reliable) due to having roll-your-own code anyway.

DVK 2010-09-08 04:26:47

Well, you're categorically wrong here. Subclassing claims **ALL** of the benefits of the module you're sub-classing, less any problems you introduce -- to claim otherwise just shows a profound misunderstanding of OO. Your solution can lead one to believe that it is OK to treat non-strings as strings, which just isn't so. It isn't even OK to treat HTML -- which is kind of a string -- as a string. Regexes don't belong in some places -- this is a great example of where you should stay away from them. And SLOC has nothing to do with *overkill*: you're the one reimplementing IP validation.

Evan Carroll 2010-09-08 15:14:00

Answer 5

+1 A:

#!/usr/bin/perl

use strict;
use warnings;

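sub is_ip_with_wildcards {
    my ($ip) = @_;
    # A limit of -1 keeps trailing empty fields, so "1.2.3.4." yields five.
    my @octets = split / [.] /xms, $ip, -1;
    # Require exactly four fields, each either "*" or a number below 256.
    return @octets == 4
        && 4 == grep { $_ eq q{*} || m/ \A [0-9]+ \z /xms && $_ < 256 } @octets;
}

while( defined( my $line = <> ) ) {
    if( my @ips = grep { is_ip_with_wildcards( $_ ) } split q{ }, $line ) {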
        print 'found IPs: ', join(q{, }, @ips), "\n";
    }
}

Daniel Holz 2010-09-07 22:38:36

@Daniel - +1,... I LIKE the idea of not being stuck in RegEx land :)

DVK 2010-09-08 04:28:30
