Hi there, I'm trying to write a bash script that looks at a directory full of files and categorises each one as either plaintext or binary. A file is plaintext if it contains ONLY plaintext characters; otherwise it is binary. So far I have tried several variations of the following grep test:
#!/bin/bash
FILES=`ls`
for i in $FILES
do
    ########GREP SYNTAX###########
    if grep -qv -e[:cntrl:] $i
    ########/GREP SYNTAX##########
    then
        mv $i $i-plaintext.txt
    else
        mv $i $i-binary.txt
    fi
done
In the grep syntax line, I have also tried the same without the -v flag and with the branches of the if statement swapped, as well as both combinations of the same with [:alnum:] and [:print:]. All six of these variations produce some files labelled binary which consist solely of plaintext, and some files labelled plaintext which contain at least one non-printable character.
I need to find a way to identify files that contain only printable characters, i.e. A-Z, a-z, 0-9, punctuation, spaces and new lines. Any file containing a character that is not in this set should be classified as binary.
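For illustration, here is a minimal sketch of one way the rule above could be tested. It is not a definitive solution. It replaces grep with tr: delete every allowed byte and count what is left over. The function names is_plaintext and classify_dir are hypothetical, and the sketch assumes "printable" means printable ASCII (bytes 0x20-0x7E) plus newline, so tabs, carriage returns and any non-ASCII bytes count as binary. Two things probably contribute to the inconsistent results from the grep attempts: [:cntrl:] is only a character class when it sits inside a bracket expression, as in [[:cntrl:]], and grep -qv succeeds as soon as any single line fails to match, which is not the same as no line matching.

```shell
#!/bin/bash
# Sketch only: is_plaintext and classify_dir are hypothetical helper names.

# Succeeds (exit 0) when the file holds nothing but printable ASCII and newlines.
is_plaintext() {
    local leftover
    # Delete every allowed byte, then count the bytes that remain.
    # LC_ALL=C makes [:print:] mean exactly the bytes 0x20-0x7E.
    leftover=$(LC_ALL=C tr -d '[:print:]\n' < "$1" | wc -c)
    [ "$leftover" -eq 0 ]
}

# Renames each regular file in the current directory according to the check.
# The glob replaces `ls`, so file names containing spaces are handled.
classify_dir() {
    local f
    for f in ./*; do
        [ -f "$f" ] || continue
        if is_plaintext "$f"; then
            mv -- "$f" "$f-plaintext.txt"
        else
            mv -- "$f" "$f-binary.txt"
        fi
    done
}
```

To use it, cd into the target directory and call classify_dir. If tabs should also count as plaintext, the set passed to tr could be widened to '[:print:]\n\t'.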
I've been bashing my head against a wall trying to sort this for half a day. Help! Thanks in advance, Rik