You can use backreferences to find pairs of things that appear in a row:
(\d+)\1
This matches one or more digit characters followed immediately by the same sequence again. \1 is a backreference that refers to the contents of the first capturing group.
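As a quick check of the pattern above, here is a minimal Python sketch using the standard `re` module. The sample string and the variable names are made up for illustration; only the pattern comes from the answer.

```python
import re

# Hypothetical sample input, chosen only for illustration.
text = "1212 77 345 9090"

# (\d+) captures a run of digits; \1 requires the same run to follow immediately.
pattern = re.compile(r"(\d+)\1")

# For each match, record the whole match and the captured (repeated) part.
doubled = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
print(doubled)  # [('1212', '12'), ('77', '7'), ('9090', '90')]
```

Note that 345 produces no match, since no run of digits in it is immediately followed by a copy of itself.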
If you want to match digits that appear more than once within the same number, you could use a pattern like:
(\d)(?=\d*\1)
Again we're using a backreference, but this time we also use a lookahead. A lookahead is a zero-width assertion: it specifies something that must match (or must not match, for a negative lookahead) after the current position in the string, but it doesn't consume any characters or move the regex engine's position. In this case, we assert that the contents of the first capture group must be found again, though not necessarily directly beside the first occurrence. Because the lookahead uses \d*, two digits only count as a pair if they are within the same number (so if there's a space between numbers, the pair won't be matched). If this is undesired, the \d can be changed to . (a dot), which matches any character except a line break by default.
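To make the difference between \d* and .* inside the lookahead concrete, here is a small Python sketch. The input string and the names `same_number` and `any_gap` are assumptions for illustration only.

```python
import re

# Lookahead restricted to digits: the repeat must be in the same run of digits.
same_number = re.compile(r"(\d)(?=\d*\1)")
# Lookahead allowing any characters (except line breaks) before the repeat.
any_gap = re.compile(r"(\d)(?=.*\1)")

# Hypothetical input: the repeated digits are separated by a space.
text = "12 21"

print(same_number.findall(text))  # []
print(any_gap.findall(text))      # ['1', '2']

# The lookahead is zero-width: the match itself covers only the captured digit.
first = any_gap.search(text)
print(first.span())  # (0, 1)
```

The span of the first match is a single character even though the lookahead had to scan to the end of the string, which is what "doesn't consume" means in practice.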
It'll match the first 3 and the first 4 in 34342, and the first 1, 2, 3, and 4 in 12332144. Note however that if a digit occurs more than twice, you will get extra matches (e.g. 1112 will match the first two 1s): every occurrence except the last one matches, because lookaheads do not consume.
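The examples above can be checked with a short Python sketch. The helper name `repeated_digits` is hypothetical, and collapsing the matches into a set is just one possible way to report each repeated digit once, not something the pattern does by itself.

```python
import re

PAIR = re.compile(r"(\d)(?=\d*\1)")

def repeated_digits(s):
    """Return (index, digit) for each digit that occurs again later in the same run of digits."""
    return [(m.start(), m.group(1)) for m in PAIR.finditer(s)]

print(repeated_digits("34342"))     # [(0, '3'), (1, '4')]
print(repeated_digits("12332144"))  # [(0, '1'), (1, '2'), (2, '3'), (6, '4')]
print(repeated_digits("1112"))      # [(0, '1'), (1, '1')]
print(repeated_digits("1111"))      # [(0, '1'), (1, '1'), (2, '1')]

# If each repeated digit should be reported only once, collapse the matches into a set.
unique = sorted(set(PAIR.findall("1111")))
print(unique)  # ['1']
```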