ansaurus

Question

De-dupe files in BASH

Answer 1

+1 A:

Try

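#!/usr/bin/env bash
cd /Users/dd/Desktop/images || exit 1
# Quote the pattern so the shell does not expand it before find sees it.
for FILENAME in $(find . -name '*version_ids.txt' -print)
do
  # Write the de-duplicated lines to a temp file, keep a backup, then swap.
  sort -u "$FILENAME" > "$FILENAME.tmp"
  mv "$FILENAME" "$FILENAME.bak" && mv "$FILENAME.tmp" "$FILENAME"
done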

Note that this script still breaks on problematic filenames (those containing spaces or newlines), because the shell word-splits the output of find.
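
One way to cope with such filenames is to let find pass each path to an inner shell as an argument instead of looping over its printed output. This is only a sketch: the function name dedupe_tree and its directory argument are made up for illustration, and the .bak step is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and argument are illustrative only).
dedupe_tree() {
  # find hands each path to the inner shell as an argument, so spaces and
  # newlines in names are never subject to word splitting.
  find "$1" -type f -name '*version_ids.txt' -exec sh -c '
    for f in "$@"; do
      # Write to a temp file first; replace the original only if sort succeeded.
      sort -u "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
  ' sh {} +
}
```

It would be called as: dedupe_tree /Users/dd/Desktop/images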

Aaron Digulla 2009-10-07 10:03:12

Thanks - I'll drop the .bak bit but I can see it's good practice. (I have an alternate backup already...)

Dycey 2009-10-07 10:07:51

Answer 2

A:

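You can't do $TEMP > $FILENAME: that tries to run the value of $TEMP as a command instead of copying the file. Use cat to copy the sorted output back: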

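#!/usr/bin/env bash
cd /Users/dd/Desktop/images || exit 1
TEMP="/tmp/$(basename "$0").$RANDOM.txt"
# Quote the pattern so the shell does not expand it before find sees it.
for FILENAME in $(find . -name '*version_ids.txt' -print)
do
  # Sort into the temp file, then copy the result back over the original.
  <"$FILENAME" sort -u >"$TEMP"
  cat "$TEMP" >"$FILENAME"
done
rm -f "$TEMP"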

Douglas Leeder 2009-10-07 10:05:46

Answer 3

+1 A:

GNU sort is able to edit a file in place with -o (POSIX also allows the output file to be one of the input files):

sort -u -o "$FILENAME" "$FILENAME"
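
Combined with find, the in-place form needs neither a loop nor a temporary file. A minimal sketch, assuming the same *version_ids.txt naming as in the question and a find that substitutes every {} in an -exec ... \; command (GNU find does); the function name dedupe_in_place is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and argument are illustrative only).
dedupe_in_place() {
  # -exec passes each path as a single argument, so odd filenames are safe.
  # sort reads all of its input before writing the -o file, so the input
  # and output may be the same file.
  find "$1" -type f -name '*version_ids.txt' -exec sort -u -o {} {} \;
}
```

It would be called as: dedupe_in_place /Users/dd/Desktop/images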

mouviciel 2009-10-07 10:09:22
