How to remove files inside a directory that have more or less lines than specified (all files have ".txt" suffix)?
+1
A:
Try this bash script:
LINES=10
for f in *.txt; do
    if [ `cat "$f" | wc -l` -ne $LINES ]; then
        rm -f "$f"
    fi
done
(Not tested)
EDIT: Feed wc through a pipe, since wc prints the filename as well when it is given a filename argument.
0x6adb015
2009-06-01 15:15:44
Doesn't work here: "line 3: [: too many arguments"
schnaader
2009-06-01 15:24:18
I also tried this: a=`wc -l "$f"`; if [ "$a" -ne $LINES ]; this would work, but wc -l outputs both the count and the filename...
schnaader
2009-06-01 15:28:58
+1, as this was the prototype to my answer :)
schnaader
2009-06-01 15:35:44
Arg! do [ `cat "$f" | wc -l` -ne $LINES ];
0x6adb015
2009-06-01 15:41:11
+3
A:
Played a bit with the answer from 0x6adb015. This works for me:
LINES=10
for f in *.txt; do
    a=`cat "$f" | wc -l`
    if [ "$a" -ne "$LINES" ]; then
        rm -f "$f"
    fi
done
schnaader
2009-06-01 15:33:55
+6
A:
This shell script should do the trick. Save it as "rmlc.sh".
Sample usage:
rmlc.sh -more 20 *.txt # Remove all .txt files with more than 20 lines
rmlc.sh -less 15 * # Remove all files with fewer than 15 lines
Note that if the rmlc.sh script is in the current directory, it is protected against deletion.
#!/bin/sh
# rmlc.sh - Remove by line count
SCRIPTNAME="rmlc.sh"
IFS=""

# Parse arguments
if [ $# -lt 3 ]; then
    echo "Usage:"
    echo "$SCRIPTNAME [-more|-less] [numlines] file1 file2..."
    exit
fi

if [ "$1" = "-more" ]; then
    COMPARE="-gt"
elif [ "$1" = "-less" ]; then
    COMPARE="-lt"
else
    echo "First argument must be -more or -less"
    exit
fi

LINECOUNT=$2

# Discard non-filename arguments
shift 2

for filename in $*; do
    # Make sure we're dealing with a regular file first
    if [ ! -f "$filename" ]; then
        echo "Ignoring $filename"
        continue
    fi

    # We probably don't want to delete ourselves if script is in current dir
    if [ "$filename" = "$SCRIPTNAME" ]; then
        continue
    fi

    # Feed wc from stdin so that the output doesn't include the filename
    lines=`cat "$filename" | wc -l`

    # Check criteria and delete
    if [ $lines $COMPARE $LINECOUNT ]; then
        echo "Deleting $filename"
        rm "$filename"
    fi
done
Kevin Ivarsen
2009-06-01 15:40:20
My only issue with this is the "gratuitous use of cat". wc -l can operate on a file all by itself: wc -l "$filename" is all you need.
Harper Shelby
2009-06-01 15:48:33
Harper: I originally tried "wc -l" by itself. The problem is that the output includes the filename rather than just the line number. For example, "wc -l rmlc.sh" outputs "48 rmlc.sh", while "echo rmlc.sh | wc -l" simply outputs "48".
Kevin Ivarsen
2009-06-01 15:52:23
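A side note not from the original thread: besides piping through cat, a plain input redirection also makes wc omit the filename, and it avoids the extra process. A minimal sketch (the /tmp/demo_lines.txt path is just an example):

```shell
# Create a three-line sample file.
printf 'a\nb\nc\n' > /tmp/demo_lines.txt

wc -l /tmp/demo_lines.txt        # count followed by the filename
cat /tmp/demo_lines.txt | wc -l  # bare count, but spawns an extra cat
wc -l < /tmp/demo_lines.txt      # bare count, no extra process
                                 # (may be blank-padded on BSD/macOS)

# The redirect form drops straight into a numeric comparison;
# [ ... -ne ... ] tolerates the leading blanks some wc builds emit.
lines=$(wc -l < /tmp/demo_lines.txt)
if [ "$lines" -ne 10 ]; then
    echo "/tmp/demo_lines.txt does not have 10 lines"
fi
rm -f /tmp/demo_lines.txt
```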
this will fail on filenames containing spaces, and iirc on large directories. See my "find" based comment for one way around that.
simon
2009-06-01 15:59:22
Kevin's script works great, and so does Simon's solution. No flaws, even though I deal with more than 4 000 files. If I could, I would accept both :) Thank you all for your answers, I greatly appreciate your help!
Daniel
2009-06-01 16:44:15
A:
My command-line mashing is pretty rusty, but I think something like this will work safely (change the "10" in the grep to whatever number of lines you need), even if your filenames have spaces in them. Adjust as needed. You'd need to tweak it if newlines in filenames are possible.
find . -name \*.txt -type f -exec wc -l {} \; | grep -v "^10 .*$" | cut --complement -f 1 -d " " | tr '\012' '\000' | xargs -0 rm -f
simon
2009-06-01 15:55:12
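A possible variation, not from the thread: instead of parsing wc's output with grep/cut, find can hand each batch of files to a small inline shell that counts lines itself. This sketch assumes the same "keep exactly 10 lines" criterion and should tolerate spaces and even newlines in filenames, since no filename is ever parsed from text:

```shell
# For each regular *.txt file, count its lines via redirection
# (so wc emits no filename) and delete it unless the count is 10.
find . -name '*.txt' -type f -exec sh -c '
    for f in "$@"; do
        if [ "$(wc -l < "$f")" -ne 10 ]; then
            rm -f -- "$f"
        fi
    done
' sh {} +
```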
Thank you Simon, both your command line and Kevin's script work perfectly, even though I have more than 4 000 files :)
Daniel
2009-06-01 16:38:06
+1
A:
This one-liner should also do the trick:
find -name '*.txt' | xargs wc -l | awk '{if($1 > 1000 && index($2, "txt")>0 ) print $2}' | xargs rm
In the example above, files with more than 1000 lines are deleted.
Choose > or < and the number of lines accordingly.
Sathya
2009-06-01 15:59:18
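A note not from the thread: the pipeline above splits on whitespace twice (in xargs and in awk's $2), so filenames containing spaces break it. A sketch of a spaces-tolerant rewrite of the same "more than 1000 lines" criterion (it still assumes no newlines in filenames):

```shell
# Read find's output one line at a time instead of one word at a
# time, and count lines via redirection so wc emits no filename.
find . -name '*.txt' -type f | while IFS= read -r f; do
    if [ "$(wc -l < "$f")" -gt 1000 ]; then
        rm -f -- "$f"
    fi
done
```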