I need a way to find words that contain any combination of characters and digits, but with exactly 4 digits.
EXAMPLE:
a1a1a1a1 // OK
1234 // Not OK
a1a1a1a1a1 // Not OK
With grep:
grep -iE '^([a-z]*[0-9]){4}[a-z]*$' | grep -vE '^[0-9]{4}$'
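As a quick check, here is the same pipeline run against the sample words from the question. This is only a sketch: the function name `four_digit_words` is made up for illustration, and it assumes ASCII letters and one word per line on stdin.

```shell
# Hypothetical helper (name is not from the original answer):
# reads words on stdin, one per line, and prints those that have
# exactly four digits and at least one letter.
four_digit_words() {
    # First grep: exactly four digits, letters allowed anywhere around them.
    # Second grep: drop lines that are nothing but four digits.
    grep -iE '^([a-z]*[0-9]){4}[a-z]*$' | grep -vE '^[0-9]{4}$'
}

printf '%s\n' a1a1a1a1 1234 a1a1a1a1a1 | four_digit_words
```

Of the three sample words, only a1a1a1a1 should come out the other end.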
Do it in one pattern with Perl:
perl -ne 'print if /^(?!\d{4}$)([^\W\d_]*\d){4}[^\W\d_]*$/'
The funky [^\W\d_] character class is a cosmopolitan way to spell [A-Za-z]: it catches all letters rather than only the English ones.
Assuming you only need ASCII, and you can only use the (fairly primitive) regexp constructs of grep, the following should be pretty close. The patterns are quoted so the shell does not try to expand the brackets and asterisks:
grep '^[a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*[0-9][a-zA-Z]*$' | grep '[a-zA-Z]'
You might try
[^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*[0-9][^0-9]*
But this will match 1234. Why doesn't that match your criteria? (Without ^ and $ anchors it will also match lines that contain more than four digits.)
To match a digit in grep you can use [0-9]. To match anything but a digit, you can use [^0-9]. Since there can be any number of those characters, or none at all, you add a "*" (any number of the preceding). So what you'll want is, logically:
(anything not a digit, or nothing)* (any single digit) (anything not a digit, or nothing)*
...
until you have 4 "any single digit" groups, i.e. [^0-9]*[0-9]...
I find that with long grep patterns, especially ones with long strings of special chars that need to be escaped, it's best to build up slowly so you're sure you understand what's going on. For example,
#this will highlight your matches, and make it easier to understand
alias grep='grep --color=auto'
echo 'a1b2' | grep '[0-9]'
will show you how it's matching. You can then extend the pattern once you understand each part.
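Following that advice, the full pattern can be assembled from one repeated unit instead of typing it out by hand. This is a rough sketch, not something from the answer above: the variable names `unit` and `pattern` and the helper `has_four_digits` are invented for illustration, and the ^ and $ anchors are my addition so that lines with more than four digits are rejected.

```shell
# One "any non-digits, then a single digit" unit.
unit='[^0-9]*[0-9]'

# Four units, then optional trailing non-digits.
# The ^ and $ anchors are an assumption added here, not part of the answer.
pattern="^${unit}${unit}${unit}${unit}[^0-9]*\$"

# Hypothetical helper: succeeds if $1 contains exactly four digits.
has_four_digits() {
    printf '%s\n' "$1" | grep -q "$pattern"
}

printf '%s\n' a1b2 a1b2c3d4 1234 a1a1a1a1a1 | grep "$pattern"
```

This prints a1b2c3d4 and 1234. As noted above, an all-digit word like 1234 still passes, so you would need an extra check if at least one letter is required.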
The regex for that is:
([A-Za-z]\d){4}
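I'm not sure about all the other input you might take (i.e. is ax12ax12ax12ax12 valid?), but this will work based on what you posted. It uses [A-Za-z] rather than \w, because \w also matches digits and would let an eight-digit word through:
%> grep -P '^(?:[A-Za-z]\d){4}$' fileWithInput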
If you don't mind using a little shell as well, you could do something like this:
echo "a1a1a1a1" |grep -o '[0-9]'|wc -l
which would display the number of digits found in the string. If you like, you could then test for a given number of matches:
max_match=4
[ "$(echo "a1da4a3aaa4a4" | grep -o '[0-9]'|wc -l)" -le $max_match ] || echo "too many digits."
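The test above only complains when there are more than four digits. If you want exactly four, the same counting trick can be wrapped in a small function and compared with -eq. This is a sketch under a couple of assumptions: the name `count_digits` is made up, and it relies on `grep -o`, which GNU and BSD grep provide but POSIX does not require.

```shell
# Hypothetical helper: print how many digits the string $1 contains.
count_digits() {
    # grep -o prints each digit on its own line; wc -l counts those lines.
    # tr strips the padding some wc implementations add.
    printf '%s\n' "$1" | grep -o '[0-9]' | wc -l | tr -d ' '
}

word=a1a1a1a1
if [ "$(count_digits "$word")" -eq 4 ]; then
    echo "$word has exactly 4 digits"
fi
```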
You can use a plain shell script; there's no need for a complicated regex.
var=a1a1a1a1
alldigits=${var//[^0-9]/}
allletters=${var//[0-9]/}
case "${#alldigits}" in
    4)
        if [ "${#allletters}" -gt 0 ]; then
            echo "ok: 4 digits and letters: $var"
        else
            echo "Invalid: all numbers and exactly 4: $var"
        fi
        ;;
    *) echo "Invalid: $var";;
esac
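The `${var//pattern/}` expansion is a bash/ksh/zsh feature and is not available in a plain POSIX sh. In case that matters, here is a sketch of the same idea using `tr` instead. The function name `is_valid_word` is invented for illustration, and like the script above it treats any non-digit character as a "letter".

```shell
# Hypothetical helper: succeeds when $1 has exactly four digits
# and at least one non-digit character.
is_valid_word() {
    digits=$(printf '%s' "$1" | tr -cd '0-9')   # keep only the digits
    others=$(printf '%s' "$1" | tr -d '0-9')    # keep everything else
    [ "${#digits}" -eq 4 ] && [ "${#others}" -gt 0 ]
}

for w in ab2b2 cd12 z9989 1ab26a9 1ab1c1 1234 24 a2b2c2d2; do
    if is_valid_word "$w"; then
        printf '%s\n' "$w"
    fi
done
```

With that word list, the loop prints z9989, 1ab26a9 and a2b2c2d2.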
Hello all, thanks for your answers. In the end I wrote a script and it works perfectly:
./P ab2b2 cd12 z9989 1ab26a9 1ab1c1 1234 24 a2b2c2d2
echo "$@" | tr -s " " "\n" >> sorting
cat sorting | while read tostr
do
    l=$(echo $tostr | tr -d "\n" | wc -c)
    temp=$(echo $tostr | tr -d a-z | tr -d "\n" | wc -c)
    if [ $temp -eq 4 ]; then
        if [ $l -gt 4 ]; then
            printf "%s " "$tostr"
        fi
    fi
done
echo