views:

77

answers:

2

Python 2.6

My script needs to monitor some 1G files on the ftp, when ever it's changed/modified, the script will download it to another place. Those file name will remain unchanged, people will delete the original file on ftp first, then upload a newer version. My script will checking the file metadata like file size and date modified to see if any difference.

The question is when the script checking metadata, the new file may be still being uploading. How to handle this situation? Is there any file attribute indicates uploading status (like the file is locked)? Thanks.

+1  A: 

There is no such attribute. You may be unable to GET such file, but it depends on the server software. Also, file access flags may be set one way while the file is being uploaded and then changed when upload is complete; or incomplete file may have modified name (e.g. original_filename.ext.part) -- it all depends on the server-side software used for upload.

If you control the server, make your own metadata, e.g. create an empty flag file alongside the newly uploaded file when upload is finished.

In the general case, I'm afraid, the best you can do is monitor file size and consider the file completely uploaded if its size is not changing for a while. Make this interval sufficiently large (on the order of minutes).

atzz
+1  A: 

Your question leaves out a few details, but I'll try to answer.

  • If you're running your status checker program on the same server thats running ftp:

1) Depending on your operating system, if you're using Linux and you've built inotify into your kernel you could use pyinotify to watch your upload directory -- inotify distinguishes from open, modify, close events and lets you asynchronously watch filesystem events so you're not polling constantly. OSX and Windows both have similar but differently implemented facilities.

2) You could pythonically tail -f to see when a new file is put on the server (if you're even logging that) and just update when you see related update messages.

  • If you're running your program remotely

3) If your status checking utility has to run on a remote host from the FTP server, you'd have to poll the file for status and build in some logic to detect size changes. You can use the FTP 'SIZE' command for this for an easily parse-able string.

You'd have to put some logic into it such that if the filesize gets smaller you would assume it's being replaced, and then wait for it to get bigger until it stops growing and stays the same size for some duration. If the archive is compressed in a way that you could verify the sum you could then download it, checksum, and then reupload to the remote site.

synthesizerpatel
Thanks for the detail answer. My program is running on remote server, and everything is under Windows platform.
Stan