OK, I have a directory (for instance, '/photos') containing subdirectories (such as '/photos/wedding', '/photos/birthday', '/photos/graduation', etc.) with .jpg files in them. Unfortunately, some of the JPEG files are broken, and I need a way to determine which ones. I found a tool named ImageMagick that can help a lot. If you use its identify command like this:

identify -format '%f' whatever.jpg

it prints the name of the file only if the file is valid; if it is not, it prints something like "identify: Not a JPEG file: starts with 0x69 0x75 `whatever.jpg' @ jpeg.c/EmitMessage/232.". So the correct solution should find all files ending with ".jpg", run identify on each of them, and do nothing if the result is just the name of the file; if the result is anything else, it should save the name of the file somewhere (for example, in a file "errors.txt").

Any ideas on how I can do that?

+1  A: 

This script will print out the names of the bad files:

#!/bin/bash

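# %f expands to the file name without its directory, so compare against the basename.
find /photos -name '*.jpg' | while IFS= read -r FILE; do
    if [[ $(identify -format '%f' "$FILE" 2>/dev/null) != "${FILE##*/}" ]]; then
        echo "$FILE"
    fi
done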

You could run it as is or, assuming you saved it as badjpegs, as ./badjpegs > errors.txt to save the output to a file.

To break it down, the find command finds *.jpg files in /photos or any of its subdirectories. These file names are piped to a while loop, which reads them in one at a time into the variable $FILE. Inside the loop, we grab the output of identify using the $(...) operator and check if it matches the file name. If not, the file is bad and we print the file name.

It may be possible to simplify this. Most UNIX commands indicate success or failure in their exit code. If the identify command does this, then you could simplify the script to:

#!/bin/bash

# read -r keeps backslashes in file names; IFS= keeps leading and trailing spaces.
find /photos -name '*.jpg' | while IFS= read -r FILE; do
    if ! identify "$FILE" &> /dev/null; then
        echo "$FILE"
    fi
done

Here the condition is simplified to if ! identify; then which means, "did identify fail?"
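The exit-code idea can also be wrapped in a small function. The following is only a sketch under assumptions: the function name list_bad_files and its optional checker argument are made up for illustration, and it relies on identify returning a non-zero exit status for a broken file, which is worth confirming on a known-bad file first. It uses find -print0 with read -d '' so that file names containing spaces or newlines survive the pipe.

```shell
#!/bin/bash

# list_bad_files DIR [CHECKER...]
# Prints every *.jpg under DIR for which the checker command exits non-zero.
# The checker defaults to ImageMagick's identify; the extra argument is a
# hypothetical convenience so the loop can be tried with another validator.
list_bad_files() {
    local dir=$1
    shift
    local checker=("$@")
    if [ ${#checker[@]} -eq 0 ]; then
        checker=(identify)
    fi

    local file
    # -print0 and read -d '' keep names with spaces or newlines intact.
    find "$dir" -type f -name '*.jpg' -print0 |
        while IFS= read -r -d '' file; do
            # stdin is redirected so the checker cannot swallow the file list.
            if ! "${checker[@]}" "$file" < /dev/null > /dev/null 2>&1; then
                printf '%s\n' "$file"
            fi
        done
}
```

With the function sourced or pasted into a script, list_bad_files /photos > errors.txt would collect the names of the files that the checker rejected.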

John Kugelman
+1  A: 

You can put this into a bash script file or run it directly: find |grep '\.jpg$' |xargs identify -format '%f' 1>ok.txt 2>errors.txt

Ville Laitila
Can also be written as `find -name '*.jpg' -exec identify -format "%f" {} \; 1>ok.txt 2>errors.txt`.
John Kugelman
Marked as accepted; however, the final script is: find -name '*.jpg' -exec identify -format "%f\n" {} \; 2>errors.txt. That might be exactly what I need. I tested it on test data, and errors.txt gives me all the necessary info (ok.txt is of no use to me, so I removed it from the script). Thanks to all who participated!
Graf
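If only the file names are wanted from errors.txt rather than the full messages, they can be cut out of each line. This is a rough sketch that assumes every error line has the shape quoted in the question, with the name between a backtick and "' @ "; other ImageMagick versions may format their messages differently. The function name bad_names_from_log is made up for illustration.

```shell
#!/bin/bash

# bad_names_from_log LOGFILE
# Prints the unique file names found in identify error lines of the assumed form
#   identify: <message> `NAME' @ <source location>
# Lines that do not match that shape are skipped.
bad_names_from_log() {
    local line name
    while IFS= read -r line; do
        case $line in
            *'`'*"' @ "*)
                name=${line#*'`'}      # drop everything up to the first backtick
                name=${name%"' @ "*}   # drop the trailing "' @ ..." part
                printf '%s\n' "$name"
                ;;
        esac
    done < "$1" | sort -u              # a file may produce more than one message
}
```

For example, bad_names_from_log errors.txt > badfiles.txt would leave one file name per line in badfiles.txt.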