I need a way to find words that contain any combination of letters and digits, but with exactly 4 digits.

EXAMPLE:

a1a1a1a1   // OK
1234       // Not OK
a1a1a1a1a1 // Not OK
A: 

With grep:

grep -iE '^([a-z]*[0-9]){4}[a-z]*$' | grep -vE '^[0-9]{4}$'

Do it in one pattern with Perl:

perl -ne 'print if /^(?!\d{4}$)([^\W\d_]*\d){4}[^\W\d_]*$/'

The funky [^\W\d_] character class is a cosmopolitan way to spell [A-Za-z]: it catches all letters rather than only the English ones.
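To check the two-stage grep filter against the examples from the question, it can be wrapped in a small function. This is only a sketch: the name four_digit_words is made up for illustration, and it assumes one word per input line.

```shell
# Hypothetical helper (name is illustrative): reads one word per line and
# prints the words made of letters and digits with exactly four digits,
# excluding words that consist of digits only.
four_digit_words() {
    grep -iE '^([a-z]*[0-9]){4}[a-z]*$' | grep -vE '^[0-9]{4}$'
}

# The three examples from the question; only a1a1a1a1 passes both filters.
printf '%s\n' a1a1a1a1 1234 a1a1a1a1a1 | four_digit_words
```

The first grep keeps lines with exactly four digits and only ASCII letters around them; the second drops the all-digit case.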

Greg Bacon
A: 

Assuming you only need ASCII, and you can only access the (fairly primitive) regexp constructs of grep, the following should be pretty close:

grep '^[a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*$' | grep '[a-zA-Z]'
Hank Gay
A: 

You might try

[^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*

But this will also match 1234. Why doesn't that match your criteria?
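One thing to watch with the pattern above: without anchors, grep accepts any line that merely contains a match, so a word with five digits slips through. A rough sketch of the difference, assuming one word per line (the function name exactly_four_digits is hypothetical):

```shell
pattern='[^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*'

# Unanchored: every line with at least four digits matches, so all three
# example words are printed.
printf '%s\n' a1a1a1a1 1234 a1a1a1a1a1 | grep "$pattern"

# Hypothetical helper: -x makes the whole line match (exactly four digits),
# and the second grep requires at least one non-digit, which drops 1234.
exactly_four_digits() {
    grep -x "$pattern" | grep '[^0-9]'
}

printf '%s\n' a1a1a1a1 1234 a1a1a1a1a1 | exactly_four_digits
```

Note that [^0-9] accepts any non-digit, including punctuation, so this is looser than a letters-only pattern.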

Richard Pennington
+2  A: 

To match a digit in grep you can use [0-9]. To match anything but a digit, you can use [^0-9]. Since there can be any number of non-digit characters, or none at all, you add a "*" (zero or more of the preceding item). So what you want is, logically:

(anything not a digit or nothing)* (any single digit) (anything not a digit or nothing)* ....

until you have 4 "any single digit" groups, i.e. [^0-9]*[0-9]...

I find that with long grep patterns, especially ones with long strings of special characters that need to be escaped, it's best to build up slowly so you're sure you understand what's going on. For example,

#this will highlight your matches, and make it easier to understand
alias grep='grep --color=auto'
echo 'a1b2' | grep '[0-9]'

will show you how it's matching. You can then extend the pattern once you understand each part.
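As a sketch of that build-up, here are three steps toward the full pattern. The variable names step1 to step3 are only for illustration, and grep -o (print only the matched part) is assumed to be available, as it is in GNU and BSD grep.

```shell
step1='[0-9]'                          # a single digit
step2='[^0-9]*[0-9]'                   # optional non-digits, then one digit
step3='\([^0-9]*[0-9]\)\{4\}[^0-9]*'   # four such groups, then optional trailing non-digits

# Print each digit found, one per line.
echo 'a1b2' | grep -o "$step1"

# Print each "non-digits followed by a digit" group, one per line.
echo 'a1b2' | grep -o "$step2"

# -x requires the whole line to match, so five digits would be rejected.
echo 'a1a1a1a1' | grep -x "$step3"
```

The last step uses basic-regex grouping and repetition (the backslashed parentheses and braces) instead of writing the group out four times.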

Steve B.
A: 

The regex for that is:

([A-Za-z]\d){4}
  • [A-Za-z] - a character class matching one letter
  • \d - a digit
  • you wrap them in () to group them, indicating the format: a letter followed by a digit
  • {4} - indicating that the group must repeat exactly 4 times
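A quick way to see what this pattern accepts is to run it through grep -E. In the sketch below \d is swapped for [0-9], since POSIX extended regular expressions do not define \d, and ^ and $ are added so the whole line must match. The function name strict_alternation is made up for illustration.

```shell
# Hypothetical helper: keeps only lines made of exactly four
# letter-then-digit pairs.
strict_alternation() {
    grep -E '^([A-Za-z][0-9]){4}$'
}

# a1a1a1a1 is printed; ab12cd34 also has exactly four digits but is rejected
# because its letters and digits do not strictly alternate.
printf '%s\n' a1a1a1a1 ab12cd34 1234 a1a1a1a1a1 | strict_alternation
```

So this form only fits if the input is always one letter followed by one digit, which may be narrower than "any combination".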
DJ
+1  A: 

I'm not sure about all the other input you might take (i.e. is ax12ax12ax12ax12 valid?), but this will work based on what you posted:

%> grep -P "^(?:\w\d){4}$" fileWithInput
RC
You might want to use the `\b` word boundary instead of BOL (^) and EOL ($) in some circumstances.
Dennis Williamson
@Dennis. Good point. I was writing it to match the input he gave, but if there are multiple words per line then yes I should use the \b in place of ^ and $.
RC
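For the several-words-per-line case, here is a rough sketch using grep's -w (whole-word match) and -o (print only the match) options in place of explicit \b anchors. It assumes GNU grep and reuses the letters-and-digits pattern from the first answer; the function name words_with_four_digits is hypothetical.

```shell
# Hypothetical helper: prints every whole word on the input lines that has
# exactly four digits and at least one letter.
words_with_four_digits() {
    # -o prints each match on its own line, -w rejects matches that are
    # only part of a longer word; the second grep drops all-digit words.
    grep -owE '([A-Za-z]*[0-9]){4}[A-Za-z]*' | grep -vxE '[0-9]{4}'
}

echo 'foo a1a1a1a1 1234 a1a1a1a1a1 ab12cd34' | words_with_four_digits
```

Keep in mind that -w treats underscores as word characters, so a token like a1_b2c3d4 is not matched by this pattern.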
A: 

If you don't mind using a little shell as well, you could do something like this:

echo "a1a1a1a1" |grep -o '[0-9]'|wc -l

which would display the number of digits found in the string. If you like, you could then test for a given number of matches:

max_match=4
[ "$(echo "a1da4a3aaa4a4" | grep -o '[0-9]'|wc -l)" -le $max_match ] || echo "too many digits."
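The check above only complains when there are more than 4 digits. To test for exactly 4 digits plus at least one other character, the same counting idea could be wrapped in a function. This is a sketch: the name has_four_digits is made up, and it assumes a grep that supports -o.

```shell
# Hypothetical helper: succeeds (exit status 0) when the word contains
# exactly four digits and at least one non-digit character.
has_four_digits() {
    word=$1
    # grep -o prints one digit per line, wc -l counts those lines.
    digits=$(printf '%s\n' "$word" | grep -o '[0-9]' | wc -l)
    # $((digits)) strips any padding that some wc versions add.
    [ $((digits)) -eq 4 ] && printf '%s\n' "$word" | grep -q '[^0-9]'
}

for w in a1a1a1a1 1234 a1a1a1a1a1; do
    if has_four_digits "$w"; then
        echo "ok: $w"
    else
        echo "invalid: $w"
    fi
done
```

With the three example words, only a1a1a1a1 is reported as ok.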
vezult
A: 

You can use a plain shell script; there's no need for a complicated regex.

var=a1a1a1a1
alldigits=${var//[^0-9]/}
allletters=${var//[0-9]/}
case "${#alldigits}" in
   4)
    if [ "${#allletters}" -gt 0 ];then
        echo "ok: 4 digits and letters: $var"
    else
        echo "Invalid: all numbers and exactly 4: $var"
    fi
    ;;
   *) echo "Invalid: $var";;
esac
A: 

Hello all, thanks for your answers. I finally wrote a script and it works perfectly:

./P ab2b2 cd12 z9989 1ab26a9 1ab1c1 1234 24 a2b2c2d2

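#!/bin/bash

# Put each word on its own line and feed the words to the loop.
# (Piping directly avoids the "sorting" temp file, which kept growing
# across runs because of ">>".)
echo "$@" | tr -s " " "\n" | while read -r tostr
do
    # l: total number of characters in the word
    l=$(printf '%s' "$tostr" | wc -c)
    # temp: characters left after deleting lowercase letters
    # (assumed to be the digits)
    temp=$(printf '%s' "$tostr" | tr -d a-z | wc -c)
    # Print words with exactly 4 such characters and at least one letter
    if [ $temp -eq 4 ]; then
        if [ $l -gt 4 ]; then
            printf "%s " "$tostr"
        fi
    fi
done
echo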

Leo
It may work perfectly, but you have to admit that it looks pretty shitty.
innaM