Hello,

I would like to write a Python script that deletes files from an FTP server once they have reached a certain age. I prepared the script below, but it throws the error message: WindowsError: [Error 3] The system cannot find the path specified: '/test123/*.*'

Does anyone have an idea how to resolve this issue? Thank you in advance!

import os, time
from ftplib import FTP

ftp = FTP('127.0.0.1')
print "Automated FTP Maintainance"
print 'Logging in.'
ftp.login('admin', 'admin')

# This is the directory that we want to go to
path = 'test123'
print 'Changing to:' + path
ftp.cwd(path)
files = ftp.retrlines('LIST')
print 'List of Files:' + files 
#--everything works fine until here!...

#--The Logic which shall delete the files after the are 7 days old--
now = time.time()
for f in os.listdir(path):
  if os.stat(f).st_mtime < now - 7 * 86400:
    if os.path.isfile(f):
        os.remove(os.path.join(path, f))
except:
    exit ("Cannot delete files")

print 'Closing FTP connection'
ftp.close()
A: 

Well, it looks like the error comes from the fact that os.listdir and os.remove work on your local machine, so the script is looking for the 'test123' directory locally, not on the FTP site. ftplib's FTP class has a method called delete, and that's what you'd want to use to remove a remote file. As for testing whether a file is 7 days old, you might actually have to pull those files down from the FTP server temporarily and check their modification times before calling FTP.delete.
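As a possible alternative to downloading the files, many servers answer the MDTM command with a file's modification time. The following is only a minimal sketch, assuming the server supports MDTM and replies in the usual `213 YYYYMMDDHHMMSS` form (UTC). `parse_mdtm` and `delete_if_older` are hypothetical helper names, not part of ftplib; only `sendcmd` and `delete` are real FTP methods.

```python
import calendar
import time

def parse_mdtm(response):
    """Turn an MDTM reply such as '213 20100516134700' into epoch seconds (UTC)."""
    code, _, stamp = response.partition(' ')
    if code != '213':
        raise ValueError('unexpected MDTM reply: %r' % response)
    # Keep the first 14 digits; some servers append fractional seconds.
    return calendar.timegm(time.strptime(stamp.strip()[:14], '%Y%m%d%H%M%S'))

def delete_if_older(ftp, name, max_age_seconds, now=None):
    """Delete the remote file `name` if MDTM reports it older than the limit."""
    now = time.time() if now is None else now
    mtime = parse_mdtm(ftp.sendcmd('MDTM ' + name))
    if mtime < now - max_age_seconds:
        ftp.delete(name)
        return True
    return False
```

With a logged-in connection this would be called as `delete_if_older(ftp, 'somefile.txt', 7 * 86400)`. If the server does not know MDTM, `sendcmd` raises `ftplib.error_perm` and nothing is deleted.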

BenHayden
No, it should change into the directory "test123" and then delete every file in it that is older than 7 days. The machine reports that it cannot find the directory.
Tom
A: 

What OS are you running on? The file path /test123/*.* is Unix-style yet the message says WindowsError. Are you taking the output of an ftp LIST command, which is in Unix-style, and trying to use it verbatim in a Windows script?

joefis
Hi, it is running on Windows 2003 Server, and it currently connects to a test FTP server which is running on Windows XP.
Tom
A: 

OK, well rather than analyze the code you have posted any further, here's an example instead that might put you on the right track.

from ftplib import FTP
import re

pattern = r'.* ([A-Za-z].. .. .....) (.*)'

def callback(line):
    found = re.match(pattern, line)
    if (found is not None):
        print found.groups()

ftp = FTP('myserver.wherever.com')
ftp.login('elvis','presley')
ftp.cwd('testing123')
ftp.retrlines('LIST',callback)

ftp.close()
del ftp

Run it and you'll get output something like this, which should be a start towards what you're trying to achieve. To finish it out you'd need to parse the first result into a datetime, compare it with "now" and use ftp.delete() to get rid of the remote file if it's too old.

>>> 
('May 16 13:47', 'Thumbs.db')
('Feb 16 17:47', 'docs')
('Feb 23  2007', 'marvin')
('May 08  2009', 'notes')
('Aug 04  2009', 'other')
('Feb 11 18:24', 'ppp.xml')
('Jan 20  2010', 'reports')
('Oct 10  2005', 'transition')
>>> 
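The remaining step, turning the first element of each tuple into a datetime and comparing it with "now", could look roughly like this. It is a sketch under assumptions: the server uses the two date shapes shown above (time without year for recent entries, year without time for older ones), month names are English, and `parse_list_date` and `is_older_than` are hypothetical helper names.

```python
from datetime import datetime, timedelta

def parse_list_date(text, now=None):
    """Parse 'May 16 13:47' or 'Feb 23  2007' into a datetime."""
    now = now or datetime.now()
    month, day, last = text.split()
    if ':' in last:
        # Recent entries omit the year: try the current one first.
        stamp = '%s %s %s' % (month, day, last)
        parsed = datetime.strptime('%d %s' % (now.year, stamp), '%Y %b %d %H:%M')
        if parsed > now + timedelta(days=1):
            # A date in the future must belong to the previous year.
            parsed = datetime.strptime('%d %s' % (now.year - 1, stamp),
                                       '%Y %b %d %H:%M')
    else:
        # Older entries carry the year but no time of day.
        parsed = datetime.strptime('%s %s %s' % (month, day, last), '%b %d %Y')
    return parsed

def is_older_than(text, days, now=None):
    """True if the LIST date `text` lies more than `days` days before `now`."""
    now = now or datetime.now()
    return parse_list_date(text, now) < now - timedelta(days=days)
```

Inside the callback one would then call `ftp.delete(name)` for every entry where `is_older_than(date_text, 7)` is true. Note that LIST dates are in the server's time zone and have at best minute precision, so the comparison is approximate.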
joefis
Note however that different ftp servers format the output of the LIST command differently, so you may have to modify the regular expression to match the one you're using.
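To illustrate how much the formats can differ, here is a sketch that recognises two common shapes: the Unix-style listing and the MS-DOS-style listing that some Windows servers produce. Both patterns are assumptions about typical output, not a specification, and `split_list_line` is a hypothetical helper; check them against what your own server actually sends.

```python
import re

# Unix-style: permissions, link count, owner, group, size, date, name.
UNIX_LIST = re.compile(
    r'^[-dlbcps]\S*\s+\d+\s+\S+\s+\S+\s+\d+\s+'
    r'(\w{3}\s+\d{1,2}\s+(?:\d{1,2}:\d{2}|\d{4}))\s+(.+)$')

# MS-DOS style: date, time, then <DIR> or a size, then the name.
DOS_LIST = re.compile(
    r'^(\d{2}-\d{2}-\d{2,4}\s+\d{2}:\d{2}[AP]M)\s+(?:<DIR>|\d+)\s+(.+)$')

def split_list_line(line):
    """Return (date_text, name) for a LIST line, or None if unrecognised."""
    for pattern in (UNIX_LIST, DOS_LIST):
        found = pattern.match(line)
        if found is not None:
            return found.groups()
    return None
```

Lines that match neither pattern (for example a leading `total` line) come back as `None` and can simply be skipped. Owner or group names containing spaces would defeat the Unix pattern.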
joefis
Hi thank you for your answer, I will try to modify my code accordingly.
Tom
+3  A: 

OK. Assuming your FTP server supports the MLSD command, make a module with the following code (this is code from a script I use to sync a remote FTP site with a local directory):

module code

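# for python ≥ 2.6
import sys, os, time, ftplib
import calendar, posixpath
import collections
FTPDir= collections.namedtuple("FTPDir", "name size mtime tree")
FTPFile= collections.namedtuple("FTPFile", "name size mtime")

class FTPDirectory(object):
    def __init__(self, path='.'):
        self.dirs= []
        self.files= []
        self.path= path

    def getdata(self, ftpobj):
        # MLSD sends one "fact=value;fact=value; name" line per directory entry
        ftpobj.retrlines('MLSD', self.addline)

    def addline(self, line):
        data, _, name= line.partition('; ')
        fields= data.split(';')
        # defaults, in case the server omits a fact for this entry
        entry_type, size, mtime= 'file', 0, 0
        for field in fields:
            field_name, _, field_value= field.partition('=')
            field_name= field_name.lower() # fact names are case-insensitive
            if field_name == 'type':
                entry_type= field_value.lower()
            elif field_name in ('sizd', 'size'):
                size= int(field_value)
            elif field_name == 'modify':
                # modify is given in UTC; drop any fractional seconds before parsing
                mtime= calendar.timegm(time.strptime(field_value[:14], "%Y%m%d%H%M%S"))
        if entry_type == 'dir':
            # remote paths always use '/', whatever the local OS is
            self.dirs.append(FTPDir(name, size, mtime, self.__class__(posixpath.join(self.path, name))))
        elif entry_type == 'file':
            self.files.append(FTPFile(name, size, mtime))
        # other types (e.g. cdir/pdir for '.' and '..') are skipped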

    def walk(self):
        for ftpfile in self.files:
            yield self.path, ftpfile
        for ftpdir in self.dirs:
            for path, ftpfile in ftpdir.tree.walk():
                yield path, ftpfile

class FTPTree(FTPDirectory):
    def getdata(self, ftpobj):
        super(FTPTree, self).getdata(ftpobj)
        for dirname in self.dirs:
            ftpobj.cwd(dirname.name)
            dirname.tree.getdata(ftpobj)
            ftpobj.cwd('..')
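For readers who have not seen MLSD output: each line is a list of `fact=value;` pairs, then a space, then the file name. The standalone sketch below shows that split on a made-up sample line. `parse_mlsd_line` is a hypothetical helper used only for illustration and is not part of the module above.

```python
def parse_mlsd_line(line):
    """Split 'fact=value;fact=value; name' into (name, facts dict)."""
    data, _, name = line.partition('; ')
    facts = {}
    for field in data.split(';'):
        key, sep, value = field.partition('=')
        if sep:
            # Lower-case the fact name so 'Size' and 'size' are treated alike.
            facts[key.lower()] = value
    return name, facts

# Made-up sample line in the MLSD shape:
name, facts = parse_mlsd_line('type=file;size=1024;modify=20100516134700; Thumbs.db')
```

On Python 3.3 and later, `ftplib.FTP` also offers an `mlsd()` method that yields `(name, facts)` pairs directly, which may make a hand-written parser unnecessary there.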

single directory case

If you want to work on the files of a directory, you can:

import ftplib, time
from ftptool import FTPDirectory, FTPTree # assuming the module code above was saved as ftptool.py

quite_old= time.time() - 7*86400 # seven days

site= ftplib.FTP(hostname, username, password) # fill in your own server details
site.cwd(the_directory_to_work_on) # if it's '.', you can skip this line
folder= FTPDirectory()
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)

This should do what you want.

a directory and its descendants

Now, if this should work recursively, make the following two changes to the code for the “single directory case”:

folder= FTPTree()

and

site.delete(posixpath.join(path, ftpfile.name)) # needs "import posixpath"; remote paths use '/' even on Windows

Possible caveat

The servers I've worked with didn't have any issues with relative paths in the STOR and DELE commands, so site.delete with a relative path worked too. If your FTP server requires pathless filenames, you should first .cwd to the path provided, .delete the plain ftpfile.name and then .cwd back to the base folder.
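The fallback described above could be sketched like this. `delete_pathless` is a hypothetical helper name; `pwd`, `cwd` and `delete` are the regular ftplib methods, and `path` is assumed to be relative to the folder the connection is currently in.

```python
def delete_pathless(site, path, name):
    """cwd into `path`, delete the bare file name, then return to the start."""
    base = site.pwd()          # remember the absolute starting folder
    site.cwd(path)
    try:
        site.delete(name)
    finally:
        site.cwd(base)         # go back even if the delete failed
```

In the recursive loop this would replace the `site.delete(...)` line with `delete_pathless(site, path, ftpfile.name)`, at the cost of three extra round trips per file.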

ΤΖΩΤΖΙΟΥ
Hi ΤΖΩΤΖΙΟΥ, thank you for your idea, it looks very good to me. I have tried it out, and I had to modify the code slightly, but I get an error message:
site= ftplib.FTP('127.0.0.1, admin, admin')
File "C:\Python26\lib\ftplib.py", line 116, in __init__
self.connect(host)
File "C:\Python26\lib\ftplib.py", line 131, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout)
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11001] getaddrinfo failed
Tom
import os, time, FTP_AUTO
from ftplib import FTP
quite_old= time.time() - 7*86400 # seven days
# C:\Temp\ftp\test123
site= ftplib.FTP('127.0.0.1, admin, admin')
site.cwd(test123) # if it's '.', you can skip this line
folder= FTPDirectory()
print folder
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)
Tom
@Tom: `'127.0.0.1, admin, admin'` is not a valid hostname; that's what the error is about. You probably meant `'127.0.0.1', 'admin', 'admin'` in your code.
ΤΖΩΤΖΙΟΥ
Thank you, the connection is now working. But the system stated that:
File "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py", line 6, in <module>
folder= FTPDirectory()
NameError: name 'FTPDirectory' is not defined
Tom
@Tom: how did you name my module? Did you import it at the start of ftp_del.py? If you saved my code as, say, ftptool.py, then at the start of ftp_del.py you should `import ftptool` and later have the classes prefixed with the module name, e.g. `folder = ftptool.FTPDirectory()`. ISTM you need to read the Python tutorial first; it's like you lack basic knowledge about Python.
ΤΖΩΤΖΙΟΥ
Hi ΤΖΩΤΖΙΟΥ, I named your module "FTP_dir" in that case. I imported it as you mentioned. Now it seems to work! The old files are deleted from my test FTP server; now I will try it in the production environment. Thank you very much for your assistance and help! It responds in the console with <FTP_dir.FTPDirectory object at 0x00B6E590>. All looks good!
Tom
It worked in the test environment (a Windows-based FileZilla Server), but in the production environment I get the error: ftplib.error_perm: 500 Cannot understand 'MLSD'. Would there be a workaround for this issue? Can the provider just switch the "MLSD" command on?
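One way to find out in advance whether a server advertises MLSD is to ask it with the FEAT command. This is a sketch, assuming the server answers FEAT with the usual multi-line feature list in which MLST (which covers MLSD as well) or MLSD appears on its own line. `supports_mlsd` is a hypothetical helper name.

```python
import ftplib

def supports_mlsd(site):
    """Return True if the server's FEAT reply lists MLST or MLSD."""
    try:
        reply = site.sendcmd('FEAT')
    except ftplib.error_perm:
        # The server does not even understand FEAT.
        return False
    features = [line.strip().upper() for line in reply.splitlines()]
    return any(f.startswith(('MLST', 'MLSD')) for f in features)
```

When this returns False, parsing the LIST output as in the earlier answer is the fallback that remains; whether a provider can enable MLSD depends on the server software they run.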
Tom
A: 

Hi ΤΖΩΤΖΙΟΥ,

thank you for your idea, it looks very good to me. I have tried it out, and I had to modify the code slightly. FTP_dir is the module you posted; it remains untouched.

import os, ftplib, time, FTP_dir
quite_old= time.time() - 7*86400 # seven days
site= ftplib.FTP('127.0.0.1, admin, admin')
site.cwd('test123') # if it's '.', you can skip this line
folder= FTPDirectory()
print folder
folder.getdata(site) # get the filenames
for path, ftpfile in folder.walk():
    if ftpfile.mtime < quite_old:
        site.delete(ftpfile.name)

Here is the error message which I got:

C:/Python26/pythonw.exe -u  "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py"

Traceback (most recent call last):
  File "G:/MY_TCS/!!PROJECTS/Q3/FTP_auto_del/python/ftp_del.py", line 4, in <module>
    site= ftplib.FTP('127.0.0.1, admin, admin')
  File "C:\Python26\lib\ftplib.py", line 116, in __init__
    self.connect(host)
  File "C:\Python26\lib\ftplib.py", line 131, in connect
    self.sock = socket.create_connection((self.host, self.port), self.timeout)
  File "C:\Python26\lib\socket.py", line 500, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11001] getaddrinfo failed

Tom