#!/usr/bin/perl
use strict;
use warnings;

# Collect some timestamps at random intervals, then discard any
# older than 5 seconds -- the same pruning you'd do per IP address.
my @ts;
for (1 .. 10) {
    push @ts, time;
    sleep rand 3;    # sleep() truncates, so this waits 0, 1, or 2 seconds
}

my $now = time;
@ts = grep { $now - $_ <= 5 } @ts;   # keep only entries from the last 5 seconds
print $_, "\n" for @ts;
- Store the timestamps per IP address in order. You were probably going to do that anyway.
- Whenever you get a log line and add a new entry, remove any stale entries right there, before you check how many entries there are. You can do that easily with a grep.
- Periodically (once a minute?) delete any IP address from the hash whose last (newest) timestamp is more than 5 minutes old, because that means all of its entries are more than 5 minutes old and that address hasn't been seen for a while.
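Putting the three steps together, a minimal sketch might look like this (the `%seen` hash, the `$limit` threshold, and the sub names are my own inventions for illustration, not anything from your code):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %seen;               # IP address => array of timestamps, oldest first
my $window = 5 * 60;    # 5-minute sliding window
my $limit  = 20;        # hypothetical threshold for "too many hits"

# Steps 1 and 2: on each log line, record the hit and prune stale
# entries for that address before counting them.
sub record_hit {
    my ($ip) = @_;
    my $now = time;
    push @{ $seen{$ip} }, $now;
    @{ $seen{$ip} } = grep { $now - $_ <= $window } @{ $seen{$ip} };
    return scalar @{ $seen{$ip} } > $limit;   # true if over the limit
}

# Step 3: run this once a minute to drop idle addresses entirely.
sub sweep {
    my $now = time;
    for my $ip (keys %seen) {
        delete $seen{$ip} if $now - $seen{$ip}[-1] > $window;
    }
}
```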
It's simple, it's easy to prove correct, it avoids doing too much work at any one time, and it keeps your tables from getting unreasonably large. With a 1-minute interval for step 3, no entry can possibly live more than 11 minutes. (If the first entry for 1.2.3.4 was added at 00:00:00, the latest an entry could be added without shifting off the first one would be 00:04:59. The latest a step-3 sweep could run without deleting the whole array would then be 00:09:58; assuming that worst case, the next sweep would come at 00:10:58.) If you can keep 11 minutes of data in memory, you're golden.
This sounds like you want a least-recently-used (LRU) cache of some sort. Although I don't often recommend it, I think this is a job for a tied hash or array: you override STORE so that as new elements go in, old elements are cleaned out. That takes the complexity out of the higher-level code and hides it behind ordinary hash or array accesses. Look at Tie::Cache for an example.
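To illustrate the tied-hash idea, here is a minimal sketch of a hash that evicts stale entries on every STORE. This is not Tie::Cache's actual interface; the class name, the window argument, and the internal layout are all assumptions made up for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A hash whose STORE cleans out entries older than a time window.
# Only a sketch of the tied-hash idea, not Tie::Cache's API.
package Tie::RecentHash;
use Tie::Hash ();
our @ISA = ('Tie::ExtraHash');   # object is [ \%data, @extra_tie_args ]

sub STORE {
    my ($self, $key, $value) = @_;
    my $now    = time;
    my $window = $self->[1] // 300;        # optional tie() argument, default 5 minutes
    $self->[0]{$key} = [ $now, $value ];   # remember when each value arrived
    # Clean out old elements as part of every store.
    for my $k (keys %{ $self->[0] }) {
        delete $self->[0]{$k} if $now - $self->[0]{$k}[0] > $window;
    }
}

sub FETCH {
    my ($self, $key) = @_;
    my $entry = $self->[0]{$key} or return undef;
    return $entry->[1];
}

package main;

tie my %recent, 'Tie::RecentHash', 300;
$recent{'1.2.3.4'} = 1;    # stale keys vanish behind the scenes on each store
```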
Alternately, you could keep some sort of FIFO, where you add new elements at one end of an array and check the other end for items to delete.
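That FIFO approach might be sketched like this, pairing the array with a per-IP counter so a lookup stays cheap (both variable names are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @fifo;       # each element: [ timestamp, ip ], oldest at the head
my %count;      # hits per IP currently inside the window
my $window = 5 * 60;

sub add_hit {
    my ($ip) = @_;
    my $now = time;
    push @fifo, [ $now, $ip ];    # new entries enter at the tail
    $count{$ip}++;
    # Expire from the head: the oldest entries are always first.
    while (@fifo and $now - $fifo[0][0] > $window) {
        my (undef, $old_ip) = @{ shift @fifo };
        delete $count{$old_ip} unless --$count{$old_ip};
    }
    return $count{$ip};           # hits from this IP in the last 5 minutes
}
```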