I want to match 110110 but not 10110. That is, the string must contain a pair of identical consecutive digits at least twice. Is there a regex for that?

Should match: 110110, 123445446, 12344544644

Should not match: 10110, 123445

+3  A: 

If you're talking about all digits, this will do it:

00.*00|11.*11|22.*22|33.*33|44.*44|55.*55|66.*66|77.*77|88.*88|99.*99

It's just 10 different patterns OR'ed together, each of which checks for at least two occurrences of the desired 2-digit pattern.
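The alternation does not have to be typed by hand. A minimal sketch, assuming plain Perl 5, that builds the same pattern from the digits 0 through 9 and tries it on the strings from the question. The variable names `$alternation` and `$re` are chosen for this sketch only.

```perl
use strict;
use warnings;

# Build "00.*00|11.*11|...|99.*99" from the digits 0..9.
# $alternation and $re are names made up for this sketch.
my $alternation = join '|', map { my $pair = $_ x 2; "$pair.*$pair" } 0 .. 9;
my $re = qr/$alternation/;

for my $s (qw(110110 123445446 12344544644 10110 123445)) {
    printf "%-11s : %s\n", $s, $s =~ $re ? "matches" : "does not match";
}
```

Generating the pattern keeps the ten branches consistent, but the backreference versions are shorter when the regex engine supports them.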

Using Perl's more advanced REs, you can use the following to find the same double digit twice:

(\d)\1.*\1\1

or, as one of your comments states, two identical consecutive digits followed somewhere by another pair of identical consecutive digits, where the two pairs need not use the same digit:

(\d)\1.*(\d)\2
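
The difference between the two backreference patterns shows up on strings whose pairs differ. A small sketch, assuming Perl 5. The names `$same_pair` and `$any_pairs` are made up for this illustration, and 11233 and 12344033 are the extra strings mentioned in the comments on this page.

```perl
use strict;
use warnings;

my $same_pair = qr/(\d)\1.*\1\1/;      # the same double digit, twice
my $any_pairs = qr/(\d)\1.*(\d)\2/;    # any two double digits

for my $s (qw(110110 11233 12344033 10110 123445)) {
    printf "%-9s same pair: %-3s any pairs: %s\n", $s,
        ($s =~ $same_pair ? "yes" : "no"),
        ($s =~ $any_pairs ? "yes" : "no");
}
```

Only the second pattern accepts 11233 and 12344033, because the second pair is captured separately instead of being tied to \1.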
paxdiablo
A: 

If I've understood your question correctly, then this, according to RegexBuddy (set to use Perl syntax), will match 110110 but not 10110:

(1{2})0\g{1}0

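The following is more general and will match any string in which a two-digit sequence is repeated later in the string. Note that \d{2} accepts any two digits, not only two equal ones, and \d+ requires at least one digit between the two occurrences.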

(\d{2})\d+\1\d*

The above will match the following examples:

110110 110011 112345611 2200022345
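
To see exactly what this more general pattern accepts, here is a small sketch, assuming Perl 5. The inputs 12012 and 1111 are additions for illustration and do not come from the answer. Since \d{2} captures any two digits, 12012 is accepted through the repeated sequence 12, while 1111 is rejected because \d+ needs at least one digit between the two occurrences.

```perl
use strict;
use warnings;

my $re = qr/(\d{2})\d+\1\d*/;

# 12012 and 1111 are extra inputs added for this sketch.
for my $s (qw(110110 110011 112345611 2200022345 12012 1111)) {
    if ($s =~ $re) {
        print "$s : matches, repeated sequence is $1\n";
    }
    else {
        print "$s : does not match\n";
    }
}
```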

Finally, to find two sets of double digits in a string when you don't care where they are, try this:

\d*?(\d{2})\d+?\1\d*

This will match the examples above plus this one:

12345501355789

It's the two sets of double 5 in the above example that are matched.

[Update] Having just seen your extra requirement of matching a string with two different double digits, try this:

\d*?(\d)\1\d*?(\d)\2\d*

This will match strings like the following:

12342202345567
12342202342267

Note that the 22 and 55 cause the first string to match, and the two 22 pairs cause the second string to match.
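
A short sketch, assuming Perl 5, that prints which pairs the last pattern picks up and how much of the string the overall match covers. The strings 12344033 and 123445 are taken from elsewhere on this page, and `$re` is a name chosen for the sketch.

```perl
use strict;
use warnings;

my $re = qr/\d*?(\d)\1\d*?(\d)\2\d*/;

for my $s (qw(12342202345567 12342202342267 12344033 123445)) {
    if ($s =~ $re) {
        # $1 and $2 hold one digit each; $& is the text of the whole match.
        print "$s : pairs $1$1 and $2$2, whole match $&\n";
    }
    else {
        print "$s : no match\n";
    }
}
```

The leading \d*? and trailing \d* are what make the overall match span the surrounding digits; the pairs themselves are found the same way without them.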

Barry Carr
Thanks for the response, but none of those match 12344033, and I want that to be matched too. Is there any way to do this?
I have just edited my response to accommodate this. Good luck.
Barry Carr
That last regex is way more complicated than it needs to be. I would use `(\d)\1.*?(\d)\2`
Alan Moore
My regex will capture the whole string, regardless of the digits before or after the two pairs, and by my reading of the question this was a requirement. In addition, your regex matches anything between the two number pairs. For instance, 1233 1234 22321 would match using your proposed solution. That is clearly incorrect.
Barry Carr
+9  A: 
/(\d)\1.*\1\1/

This matches a string containing two instances of the same doubled digit, e.g. 11011 but not 10011.

\d matches any digit. \1 matches whatever the first group captured, effectively doubling that digit.

This will also match 1111. If there must be other characters between the two pairs, change .* to .+
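
The effect of swapping .* for .+ can be checked directly. A minimal sketch, assuming Perl 5, with `$star` and `$plus` as names made up for the two variants:

```perl
use strict;
use warnings;

my $star = qr/(\d)\1.*\1\1/;    # pairs may be adjacent
my $plus = qr/(\d)\1.+\1\1/;    # at least one character between the pairs

for my $s (qw(1111 11011 10011)) {
    printf "%-5s  .* : %-3s  .+ : %s\n", $s,
        ($s =~ $star ? "yes" : "no"),
        ($s =~ $plus ? "yes" : "no");
}
```

Only 1111 changes: it matches with .* and stops matching with .+, while 11011 matches both and 10011 matches neither.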

ooh, this looks neater

((\d)\2).*\1

If the two pairs may have different values, but there must still be two sets of doubles, simply add the first part again, as in

((\d)\2).*((\d)\4)

The bracketing means that $1 and $3 contain the double digits and $2 and $4 contain the single digits (which are then doubled).

11233

$1=11
$2=1
$3=33
$4=3
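
The group numbering can be confirmed with a few lines of Perl. This is only a sketch; 110110 is added from the question to show that both pairs may also be the same.

```perl
use strict;
use warnings;

for my $s (qw(11233 110110)) {
    if ($s =~ /((\d)\2).*((\d)\4)/) {
        # Groups are numbered by their opening parenthesis, left to right.
        print "$s: \$1=$1 \$2=$2 \$3=$3 \$4=$4\n";
    }
}
```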
Xetius
A: 

Thanks for the responses. What if I also want to match 11233, i.e. the pairs of identical consecutive digits can be different?

Add comments to the question: Don't post non-answers as answers.
Sinan Ünür
Or edit your question.
Clement Herreman
It seems a shame that you can't work it out from what you already have. Surely this site isn't here to do your work for you.
Mark Dickinson
@Mark The OP should be ashamed ... Did not vote on any answers, did not accept any answers. Acted like a complete freeloader. Oh well, in most cases, thinking not about the current poster, but future posters who might be grappling with the same issue tends to help.
Sinan Ünür
@Sinan, the question was only asked a couple of hours ago. It's not necessary to accept an answer straight away; a better answer may come along.
paxdiablo
A: 

There is no reason to do everything in one regex... You can use the rest of Perl as well:

#!/usr/bin/perl -l

use strict;
use warnings;

my @strings = qw( 11233 110110 10110 123445 123445446 12344544644 );

# Perl allows only one statement modifier per statement, so filter with grep first.
print for grep { is_wanted($_) } @strings;

sub is_wanted {
    my ($s) = @_;
    my @matches = $s =~ /(?<group>(?<first>[0-9])\k<first>)/g;
    return 1 < @matches / 2;
}

__END__
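
The same counting idea can be written with a single capture group and Perl's list-assignment-in-scalar-context idiom. This is a sketch of a variant, not the code from the answer above, and `count_pairs` is a hypothetical helper name.

```perl
use strict;
use warnings;

# count_pairs is a hypothetical helper, named for this sketch only.
sub count_pairs {
    my ($s) = @_;
    # Assigning the list of captures to () and then to a scalar
    # yields the number of non-overlapping double digits found.
    my $count = () = $s =~ /(\d)\1/g;
    return $count;
}

for my $s (qw(11233 110110 10110 123445 123445446 12344544644)) {
    my $n = count_pairs($s);
    printf "%-11s : %d pair(s), %s\n", $s, $n, $n >= 2 ? "wanted" : "not wanted";
}
```

Counting matches keeps the regex trivial and makes the threshold easy to change if three or more pairs are ever required.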
Sinan Ünür
A: 

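Depending on what your data looks like, here's a way that needs only a minimal regex.

while (<DATA>) {
    chomp;
    my @s = split /\s+/;
    foreach my $i (@s) {
        # Keep lines with a token that contains 123445 but is longer than those 6 digits.
        if ($i =~ /123445/ && length($i) != 6) {
            print "$_\n";
        }
    }
}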

__DATA__
  This is a line
  blah 123445446 blah
  blah blah 12344544644 blah
  .... 123445 ....
  this is last line
ghostdog74
+7  A: 

If I understand correctly, your regexp will be:

m{
  (\d)\1            # First repeated pair
  .*                # Anything in between
  (\d)\2            # Second repeated pair
}x

For example:

for my $x (qw(110110 123445446 12344544644 10110 123445)) {
    my $m = $x =~ m{(\d)\1.*(\d)\2} ? "matches" : "does not match";
    printf "%-11s : %s\n", $x, $m;
}
110110      : matches
123445446   : matches
12344544644 : matches
10110       : does not match
123445      : does not match
depesz