I'm wondering how I would download all _*.xml_ files from a folder I have set up on an FTP server using Net::FTP. I have seen that glob() would be the best way, but I cannot quite wrap my head around the logic.

Basically, I need to check if there are XML files in the folder. If not, wait 5 seconds, and check again. Once the files show up, then I need to download them and run them through a Java app (this part I've already got down).

So, wonderful anonymous helpers, how can I monitor a folder for a specific filetype, and automatically ftp->get those files when they appear?

Thanks!

A: 

What about something like this? This would of course be called every X seconds by your code.

my %downloaded;

sub check_for_new {
    # Get all files
    my @files = $ftp->ls;

    foreach my $f (@files) {

        # Check if it is an XML file
        if($f =~ /\.xml$/) {

            # Check if you already fetched it
            if(!$downloaded{$f}) {

                if($ftp->get($f)) {
                    $downloaded{$f} = 1;
                } else {
                    # Get failed; leave it unmarked so it is
                    # retried on the next pass
                }

            }
        }
    }

}
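
For the outer logic, a rough sketch might look like this (untested; the host, login, and directory are placeholders to replace with your own):

use strict;
use warnings;
use Net::FTP;

# Placeholder connection details -- substitute your own
my $ftp = Net::FTP->new('ftp.example.com')
    or die "Cannot connect: $@";
$ftp->login('user', 'password')
    or die "Cannot login: ", $ftp->message;
$ftp->cwd('/path/to/xml')
    or die "Cannot cwd: ", $ftp->message;

# ... paste %downloaded and sub check_for_new from above here,
# so the sub can see $ftp ...

while (1) {
    check_for_new();
    sleep 5;    # poll every 5 seconds, per the question
}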
Jeff B
A: 

If you need to re-download XML files that might have changed, then you also need to do a file compare to make sure that your local copy is in sync with the remote copy on the FTP server.

use Cwd;
use Net::FTP;
use File::Compare qw(compare);
use File::Copy qw(copy);

my %localf;
my $cdir = cwd;

sub get_xml {
  for my $file ($ftp->ls) {
    ##Skip non-xml files
    next if $file !~ m/\.xml$/;

    ##Simply download if we do not have a local copy
    if (!exists $localf{$file}) {
      $ftp->get($file);
      $localf{$file} = 1;
    } 
    ##else compare the server version with the local copy
    else {
      $ftp->get($file, "/tmp/$file");
      if (compare("$cdir/$file", "/tmp/$file") == 1) {
        copy("/tmp/$file", "$cdir/$file");
      }
      unlink "/tmp/$file";
    }
  }
}

I typed this out straight into the reply box so it might need a few touch-ups and error checking thrown in before being implemented. For the outer logic you could write a loop which establishes the ftp connection, calls this subroutine, closes the connection and sleeps for 'n' seconds.
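
Such a loop might look roughly like this (again untested; the host, login, and directory are placeholders, and $ftp is the same global the subroutine above uses):

my $n = 5;  ## seconds to sleep between polls

while (1) {
  ## Placeholder connection details -- substitute your own
  $ftp = Net::FTP->new('ftp.example.com')
    or die "Cannot connect: $@";
  $ftp->login('user', 'password')
    or die "Cannot login: ", $ftp->message;
  $ftp->cwd('/path/to/xml')
    or die "Cannot cwd: ", $ftp->message;

  get_xml();

  $ftp->quit;
  sleep $n;
}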

muteW
+1  A: 

When I need to get a filtered listing of files on an FTP site, I use grep with the ls method of Net::FTP.

warning, untested code:

#!/usr/bin/perl

use strict;
use warnings;

use Net::FTP;

#give END blocks a chance to run if we are killed
#or control-c'ed
$SIG{INT} = $SIG{TERM} = sub { exit };

my $host = shift;
my $wait = 5;

#tie %seen to an on-disk DBM file so already-fetched
#files are remembered across restarts
dbmopen my %seen, "files_seen.db", 0600
    or die "could not open database: $!";

while (1) {
    my $ftp = Net::FTP->new($host, Debug => 0)
        or die "Cannot connect to $host: $@";

    END { $ftp->quit if $ftp } #close ftp connection when exiting

    $ftp->login("ftp",'ftp') #anonymous ftp
        or die "Cannot login: ", $ftp->message;

    for my $file (grep { /[.]xml$/ and not $seen{$_} } $ftp->ls) {
        $ftp->get($file)
            or die "could not get $file: ", $ftp->message;
        #system("/path/to/javaapp", $file) == 0
        #   or die "java app blew up";
        $seen{$file} = 1;
    }
    sleep $wait;
}
Chas. Owens
Works perfectly, thanks! Now I just have to overcome a few problems that arose in the Java application...
ryantmer