How would I validate that a JPG file is a valid image file? We are having files written to a directory using FTP, but we seem to be picking up each file before it has finished being written, which produces invalid images. I need to be able to tell when a file is no longer being written to. Any ideas?

+3  A: 

Easiest way might just be to write the file to a temporary directory and then move it to the real directory after the write is finished.
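
If you control the writing side, the write-then-move idea could be sketched roughly as below. This is only an illustration: `publish_atomically` and its parameters are made-up names, and it assumes the temporary directory and the real directory are on the same filesystem, because that is what lets `rename` put the finished file in place in a single step.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the original answer): write $data to a
# temporary file in $staging_dir, then rename it to $final_path.
# Assumes both paths live on the same filesystem.
sub publish_atomically {
    my ($data, $staging_dir, $final_path) = @_;

    my ($fh, $tmp_path) = tempfile(DIR => $staging_dir, UNLINK => 0);
    binmode $fh;
    print {$fh} $data or die "write to $tmp_path failed: $!";
    close $fh         or die "close of $tmp_path failed: $!";

    # Anything watching the real directory only ever sees the finished file.
    rename $tmp_path, $final_path
        or die "rename to $final_path failed: $!";
    return $final_path;
}
```

Whatever picks up images then only needs to watch the real directory and can ignore the temporary one.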

Or you could check here.

JPEG::Error

[arguments: none] If the file reference remains undefined after a call to new, the file is to be considered not parseable by this module, and one should issue some error message and go to another file. An error message explaining the reason of the failure can be retrieved with the Error method:

EDIT:

Image::TestJPG might be even better.
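
If installing a module is not an option, a rough check for truncated uploads is to look at the marker bytes: a JPEG stream starts with the start-of-image marker (FF D8) and normally ends with the end-of-image marker (FF D9). The sketch below is a separate heuristic, not the method Image::TestJPG uses, and `looks_like_complete_jpeg` is a made-up name. It can reject valid files that carry extra bytes after the end marker, and it says nothing about whether the data in between decodes correctly.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the file starts with FF D8 (start of image)
# and its last two bytes are FF D9 (end of image). Heuristic only.
sub looks_like_complete_jpeg {
    my ($path) = @_;

    open my $fh, '<:raw', $path or return 0;
    my $size = -s $fh;
    return 0 if !$size || $size < 4;

    # Read the first two bytes.
    ( read( $fh, my $head, 2 ) // 0 ) == 2 or return 0;

    # Jump to two bytes before the end of the file (whence 2 = SEEK_END).
    seek( $fh, -2, 2 ) or return 0;
    ( read( $fh, my $tail, 2 ) // 0 ) == 2 or return 0;
    close $fh;

    return ( $head eq "\xFF\xD8" && $tail eq "\xFF\xD9" ) ? 1 : 0;
}
```

A file that is still being uploaded will usually fail the end-marker test, so this can be retried until it passes.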

drby
+4  A: 

You're solving the wrong problem, I think.

What you should be doing is figuring out how to tell when whatever FTPd you're using is done writing the file - that way when you come to have the same problem for (say) GIFs, DOCs or MPEGs, you don't have to fix it again.

Precisely how you do that depends rather crucially on what FTPd on what OS you're running. Some do, I believe, have hooks you can set to trigger when an upload's done.

If you can run your own FTPd, Net::FTPServer or POE::Component::Server::FTP are customizable to do the right thing.

In the absence of that:

1) Try tailing the logs with a Perl script that looks for 'upload complete' messages.
2) Use something like lsof or fuser to check whether anything still has the file open before you try to copy it.

Penfold
I agree that this is solving the wrong problem, but I don't have any control over the FTP process and it is a very specific problem (they only produce JPEG images). I would prefer to have some "Copy Complete" indicator, but that is not possible.
Xetius
See edit above. :D
Penfold
A: 

I had something similar come up once. More or less what I did was:

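# start at -1 so that even an empty file is checked at least twice
my $old_image_size = -1;
my $current_image_size;

# keep polling until two consecutive checks report the same size in bytes
while ( ( $current_image_size = ( stat $image_file )[7] // 0 ) != $old_image_size ) {
    $old_image_size = $current_image_size;
    sleep 10;
}

# process_image() stands in for whatever handles the finished file
process_image($image_file);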
dsm
A: 

Have the FTP process set the readonly flag, then only work with files that have the readonly flag set.
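
On a Unix-like system, the consuming side of that idea might look like the sketch below. It assumes the uploader clears every write permission bit (for example with chmod 444) once the transfer is done, which needs cooperation from the FTP side; `is_read_only` is a made-up name.

```perl
use strict;
use warnings;

# Hypothetical helper: true when no write permission bit (user, group or
# other) is set on the file. Returns false if the file cannot be stat'ed.
sub is_read_only {
    my ($path) = @_;
    my $mode = ( stat $path )[2];
    return 0 unless defined $mode;
    return ( $mode & 0222 ) ? 0 : 1;
}
```

Files for which this returns false are simply skipped and looked at again on the next pass.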

Brad Gilbert
+1  A: 

Again looking at the FTP issue rather than the JPG issue.

I check the timestamp on the file to make sure it hasn't been modified in the last X (5) minutes. That way I can be reasonably sure the upload has finished.

# time (in seconds since the epoch) at which the file was last modified
my $last_modified = (stat("$path/$file"))[9];

# current time in seconds since the epoch (i.e. 1970)
my $epoch_time = time();

# only proceed if the file exists and has not been modified during the
# last 5 minutes, i.e. it is presumably no longer uploading
if ( defined $last_modified && $last_modified < $epoch_time - 300 ) {
    # move / edit or whatever
}
Ranguard