views: 287
answers: 2
I want to download a list of web pages. I know wget can do this, but downloading every URL every five minutes and saving the results to a folder seems beyond wget's capability on its own. Does anyone know of a tool in Java, Python, or Perl that accomplishes this task?

Thanks in advance.

+5  A: 

Write a bash script that uses wget, and put it in your crontab to run every 5 minutes (*/5 * * * *).

If you need to keep a history of all these web pages, set a variable at the beginning of your script with the current unixtime and append it to the output filenames.
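For instance, a minimal sketch along those lines, assuming a urls.txt file with one URL per line (the file and directory names are placeholders), and using a per-run directory instead of a filename suffix:

#!/bin/bash
# fetch_pages.sh -- download every URL in urls.txt into a timestamped directory
now=$(date +%s)           # current unixtime, recorded once at the start of the run
mkdir -p "pages/$now"     # one directory per run preserves the full history
wget -q -P "pages/$now" -i urls.txt

And the matching crontab entry:

*/5 * * * * /path/to/fetch_pages.sh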

Dave Pirotte
Just curious: if the history part you described isn't needed, why would you want to wrap the wget command in a (bash) script? You can also just call wget from cron, right?
Mark van Lent
since there's a series (group) of pages?
KevinDTimm
Erm... Yes, that makes sense. :)
Mark van Lent
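For the no-history case this thread settles on, a direct crontab entry is indeed enough; a sketch, with placeholder paths:

*/5 * * * * wget -q -P /path/to/pages -i /path/to/urls.txt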
+7  A: 

Sounds like you'd want to use cron with wget.

But if you're set on using Python:

import time
import os

wget_command_string = "wget ..."  # the exact wget invocation is left unspecified here

while True:
    os.system(wget_command_string)  # shell out and run the download
    time.sleep(5 * 60)              # wait five minutes before the next run
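One design note: since the sleep starts only after each download finishes, the effective period is five minutes plus the download time, so the runs drift slightly over time; cron fires on a fixed schedule and avoids that.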
Jweede
Does Python have a launchd interface?
Nerdling