views: 63
answers: 3

We have a script on an FTP endpoint that monitors the FTP logs spewed out by our FTP daemon. Currently we have a Perl script that essentially runs a tail -F on the file and sends every single line into a remote MySQL database, with slightly different column content based on the record type.

This database has tables for both the tarball names/contents and the FTP user actions on those packages: downloads, deletes, and everything else VSFTPd logs.
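For reference, the current approach is shaped roughly like this (the DSN, table name, and record-type matching below are placeholders, not our real schema):

use strict;
use warnings;
use DBI;

# Placeholder connection details and table; the real schema differs.
my $dbh = DBI->connect('DBI:mysql:database=ftplogs;host=db.example.com',
                       'ftplog', 'secret', { RaiseError => 1 });

# Follow the log the same way the current shellout does.
open(my $log, '-|', 'tail', '-F', '/var/log/vsftpd.log')
    or die "can't run tail: $!";

while (my $line = <$log>) {
    chomp $line;

    # Very rough record-type detection; the real parsing is more involved.
    my $type = $line =~ /OK DOWNLOAD/ ? 'download'
             : $line =~ /OK DELETE/   ? 'delete'
             :                          'other';

    $dbh->do('INSERT INTO ftp_actions (record_type, raw_line) VALUES (?, ?)',
             undef, $type, $line);
}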

I see this as particularly bad, but I'm not sure what's better.

The goal of a replacement is to still get log file content into a database as quickly as possible. I'm thinking of doing something like putting a FIFO/pipe file in place of the FTP log file, so I can read from it periodically and be sure I never read the same thing twice. That assumes VSFTPd will play nice with it (I'm thinking it won't; insight welcome!).
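As a sketch of what I mean (the path is hypothetical, and this assumes VSFTPd keeps writing to the same path instead of unlinking and recreating it):

use strict;
use warnings;
use POSIX qw(mkfifo);

my $path = '/var/log/vsftpd.log';    # hypothetical log path

# One-time setup: replace the regular log file with a FIFO.
unless (-p $path) {
    unlink $path;
    mkfifo($path, 0640) or die "mkfifo failed: $!";
}

# Blocks until VSFTPd opens the FIFO for writing, then yields lines as they arrive.
open(my $fifo, '<', $path) or die "can't open FIFO: $!";
while (my $line = <$fifo>) {
    # hand $line off to the database here
}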

The FTP daemon is VSFTPd, and I'm at least fairly sure the extent of its logging capabilities is: xfer-style log, vsftpd-style log, both, or no logging at all.

The question is, what's better than what we're already doing, if anything?

+1  A: 

You should look into inotify (assuming you are on a nice, POSIX-based OS) so you can run your Perl script whenever the logfile is updated. If this level of I/O causes problems, you could always keep the logfile on a RAM disk so I/O is very fast.

This should help you set this up: http://www.cyberciti.biz/faq/linux-inotify-examples-to-replicate-directories/
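A rough sketch of how that could look with the Linux::Inotify2 module from CPAN (the log path is just an example):

use strict;
use warnings;
use Linux::Inotify2;

my $inotify = Linux::Inotify2->new
    or die "can't create inotify object: $!";

# Run the callback every time vsftpd appends to the log.
$inotify->watch('/var/log/vsftpd.log', IN_MODIFY, sub {
    my $event = shift;
    # read the new lines and push them to the database here
});

# Blocking event loop.
1 while $inotify->poll;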

ternaryOperator
This isn't so much of a "scheduled alert", because the additions to the log file happen pretty regularly. Let's say... at least once every 5 seconds, many times once a second, other times multiple times a second. This is more about how best to actively get the content and send it somewhere. [edit] Also, efficiently, and a given entry *only once*.
VxJasonxV
+3  A: 

Honestly, I don't see much wrong with what you're doing now. tail -f is very efficient. The only real problem with it is that it loses state if your watcher script ever dies (which is a semi-hard problem to solve with rotating logfiles). There's also a nice File::Tail module on CPAN that saves you from the shellout and has some nice customization available.
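A minimal sketch of the File::Tail version (the path and tuning values are only examples):

use strict;
use warnings;
use File::Tail;

# Follows the file like tail -f, sleeping adaptively between reads.
my $tail = File::Tail->new(
    name        => '/var/log/vsftpd.log',   # example path
    maxinterval => 5,                        # check at least every 5 seconds
    interval    => 1,
);

while (defined(my $line = $tail->read)) {
    # parse $line and insert it into the database
}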

Using a FIFO as a log can work (as long as vsftpd doesn't try to unlink and recreate the logfile at any point, which it may do) but I see one major problem with it. If no one is reading from the other end of the FIFO (for instance if your script crashes, or was never started), then a short time later all of the writes to the FIFO will start blocking. And I haven't tested this, but it's pretty likely that having logfile writes block will cause the entire server to hang. Not a very pretty scenario.

hobbs
Perl modules and CPAN are my best friends. Despite my pointing to Net::FTP, we shell out to the ftp command from within the Perl script... ;_; BTW, your rep is insane. Awesome. Community members should aspire to be as helpful as you.
VxJasonxV
A: 

You can simply open the log file for input and read it line by line:

open(my $file, '<', '/ftp/file/path')
    or die "can't open log: $!";

while (<$file>) {
    # you know the rest
}

File::Tail does this, plus heuristic sleeping and nice error handling and recovery.

Edit: On second thought, a real system pipe is better if you can manage it. If not, then whenever your process starts you need to find the last entry you put in the database and spin through the file until you reach it. That's not easy to accomplish, and potentially impossible if you have no way of identifying where you left off.
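For instance, something along these lines at startup (the DSN, table, and column names are hypothetical, and it assumes the table keeps the raw log line or some other unique marker):

use strict;
use warnings;
use DBI;

# Hypothetical schema: ftp_actions(id, raw_line, ...) stores the original log line.
my $dbh = DBI->connect('DBI:mysql:database=ftplogs;host=db.example.com',
                       'ftplog', 'secret', { RaiseError => 1 });
my ($last_seen) = $dbh->selectrow_array(
    'SELECT raw_line FROM ftp_actions ORDER BY id DESC LIMIT 1');

open(my $log, '<', '/ftp/file/path') or die "can't open log: $!";

# Skip everything up to and including the last line already in the database.
if (defined $last_seen) {
    while (my $line = <$log>) {
        chomp $line;
        last if $line eq $last_seen;
    }
}

# Everything from here on is new.
while (my $line = <$log>) {
    # insert into the database
}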

masonk