I've got a PHP script on a shared webhost that selects, from ~300 'feeds', up to 40 that haven't been updated in the last half hour, makes a cURL request for each one, and then delivers the results to the user.

SELECT * FROM table WHERE latest_scan < NOW() - INTERVAL 30 MINUTE ORDER BY latest_scan ASC LIMIT 0, 40;
// Make cURL request and process it

I want to be able to deliver updates as fast as possible, but don't want to bog down my server or the servers I'm fetching from (it's only a handful).

How often should I run the cron job, and should I limit the number of fetches per run? To how many?

+1  A: 

It would be a good idea to "rate" how often each feed actually changes, so if a feed averages one change every 24 hours, you just fetch it every 12 hours.

Just store the number of changes and the number of tries for each feed and pick the ones you need to check... you can run the script every minute and let some statistics do the rest!
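
A rough sketch of that idea, in case it helps. The function name, the 30-minute and 24-hour bounds, and the use of an observation window in minutes (rather than a raw try count) are assumptions for illustration, not something from the question:

```php
<?php
// Hypothetical helper: pick a polling interval for one feed from how often
// it has actually changed. $changes is the number of fetches that returned
// new content; $observedMinutes is how long the feed has been tracked.
// The $min/$max bounds are assumed values, tune them for your setup.
function pollIntervalMinutes(int $changes, int $observedMinutes, int $min = 30, int $max = 1440): int
{
    if ($changes <= 0) {
        // No change seen yet: fall back to the slowest allowed rate.
        return $max;
    }

    // Average time between changes, then poll at half of that,
    // e.g. one change per 24 hours gives a 12 hour interval.
    $avgMinutesPerChange = $observedMinutes / $changes;
    $interval = (int) floor($avgMinutesPerChange / 2);

    // Clamp so very busy or very quiet feeds stay within sane limits.
    return max($min, min($max, $interval));
}
```

The cron job can then run every minute and only pick feeds whose last scan is older than their own interval, instead of using a fixed 30 minutes for all of them.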

DFectuoso
This might work, but the feeds are only stored for a week or so.
Gilean
A: 

On a shared host you might also run into script run time limits. For instance, if your script runs longer than 30 seconds, the server may terminate it. If this is the case for your host, you might want to test and log how long it takes to process each feed, and take that into account when you decide how many feeds to process per run.
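
One way to combine the timing and the limit is to stop the run once the time budget is nearly used up. This is only a sketch: `processWithinBudget`, the 25 second default, and the `$fetch` callable are made-up names and values, and the real limit depends on your host:

```php
<?php
// Hypothetical sketch: process feeds until a wall-clock budget is nearly
// spent, and record how long each feed took so the numbers can be logged.
// $fetch is any callable that fetches and processes a single feed.
function processWithinBudget(array $feeds, callable $fetch, float $budgetSeconds = 25.0): array
{
    $start = microtime(true);
    $slowest = 0.0;   // longest single fetch seen so far in this run
    $timings = [];

    foreach ($feeds as $feed) {
        $elapsed = microtime(true) - $start;

        // Stop if another fetch as slow as the slowest one so far
        // would push the run past the budget.
        if ($elapsed + $slowest > $budgetSeconds) {
            break;
        }

        $t0 = microtime(true);
        $fetch($feed);
        $took = microtime(true) - $t0;

        $slowest = max($slowest, $took);
        $timings[] = ['feed' => $feed, 'seconds' => $took];
    }

    // Feeds not listed here were skipped and stay due for the next run.
    return $timings;
}
```

Logging the returned timings for a few days should show how many feeds fit comfortably into one run on your host.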

Another thing I had to do to help with this was to mark the "last scan" as updated before processing each individual request, so that a problem feed would not keep failing and get picked up again on every cron run. If desired, you can update the entry again on failure and record the reason (if known) for the failure.
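
Roughly like this, as a sketch rather than a finished implementation. Only `latest_scan` comes from the question; the `feeds` table name, the `id` and `last_failure` columns, and the `$fetch` callable are assumed names:

```php
<?php
// Hypothetical sketch: claim a feed before fetching it, then record the
// failure reason if the fetch throws. Assumes a table `feeds` with columns
// `id`, `latest_scan` and `last_failure` (the last one is an assumed name).
function scanFeed(PDO $db, int $feedId, callable $fetch): bool
{
    // Mark the feed as scanned first, so a crash or timeout during the
    // fetch cannot make it the oldest entry again on the next cron run.
    $claim = $db->prepare(
        'UPDATE feeds SET latest_scan = :now, last_failure = NULL WHERE id = :id'
    );
    $claim->execute([':now' => date('Y-m-d H:i:s'), ':id' => $feedId]);

    try {
        $fetch($feedId);
        return true;
    } catch (Throwable $e) {
        // Keep the failure visible instead of silently hiding it.
        $fail = $db->prepare('UPDATE feeds SET last_failure = :reason WHERE id = :id');
        $fail->execute([':reason' => $e->getMessage(), ':id' => $feedId]);
        return false;
    }
}
```

A feed that fails this way is simply retried once its `latest_scan` is old enough again, and the stored reason shows what went wrong last time.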

Beau Simensen
That's just bad practice: if you change the update date even when the fetch fails, how do you know it failed?
DFectuoso
I'm not building the entire project for this person, just offering some additional things to look out for. If so desired, the entry can be updated post-failure with the error reason/code in a "last failure" field or something. Updated my answer to include hints on this as well.
Beau Simensen