We have a ftp system setup to monitor/download from remote ftp servers that are not under our control. The script connects to the remote ftp, and grabs the file names of files on the server, we then check to see if its something that has already been downloaded. If it hasn't been downloaded then we download the file and add it to the list.
We recently ran into an issue, where someone on the remote ftp side, will copy in a massive single file(>1GB) then the script will wake up see a new file and begin downloading the file that is being copied in.
What is the best way to check this? I was thinking of grabbing the file size waiting a few seconds checking the file size again and see if it has increased, if it hasn't then we download it. But since time is of the concern, we can't wait a few seconds for every single file set and see if it's file size has increased.
What would be the best way to go about this, currently everything is done via pythons ftplib, how can we do this aside from using the aforementioned method.
Yet again let me reiterate this, we have 0 control over the remote ftp sites.
Thanks.
UPDATE1:
I was thinking what if i tried to rename it... since we have full permissions on the ftp, if the file upload is in progress would the rename command fail?
We don't have any real options here... do we?
UPDATE2: Well here's something interesting some of the ftps we tested on appear to automatically allocate the space once the transfer starts.
E.g. If i transfer a 200mb file to the ftp server. While the transfer is active if i connect to the ftp server and do a size while the upload is happening. It shows 200mb for the size. Even though the file is only like 10% complete.
Permissions also seem to be randomly set the FTP Server that comes with IIS sets the permissions AFTER the file is finished copying. While some of the other older ftp servers set it as soon as you send the file.
:'(