I need to somehow automatically update/parse a couple of RSS feeds and place them into a MySQL database almost as soon as the feed is updated, or as close as possible. However, I can't work out the best way to do this automatically - I've found tutorials for doing it when a user runs a script - but in this case it all needs to be done in the background. Would a cron job be suitable?

Any ideas? Any advice is greatly appreciated, thanks.

A: 

It can only ever be done in response to something — a cron job just means "In response to it being a certain time." You have to decide what event is best for your particular circumstances.

Assuming you don't control the source of the RSS feeds, doing it periodically via cron makes sense. To have it run "as soon as the feed is updated, or as close as possible" you would have to poll every second, which would make you highly unpopular. Check no more frequently than hourly (unless the feed includes information giving a different check period).
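
As a rough illustration of the periodic approach, here is a minimal sketch of a script that cron could run. It assumes an RSS 2.0 feed, reads the optional `<ttl>` element (the number of minutes the publisher says the feed may be cached), and stores only items it has not seen before. The table name `feed_items`, the column names, and the crontab path are made up for the example, and `sqlite3` stands in for MySQL so the sketch runs without a server; with a MySQL driver the equivalent statement would be `INSERT IGNORE`.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical crontab entry to run this script every five minutes:
#   */5 * * * * /usr/bin/python3 /path/to/poll_feed.py


def parse_feed(xml_text):
    """Return (ttl_minutes_or_None, list_of_item_dicts) for an RSS 2.0 document."""
    channel = ET.fromstring(xml_text).find("channel")
    ttl_text = channel.findtext("ttl")
    ttl = int(ttl_text) if ttl_text and ttl_text.strip().isdigit() else None
    items = []
    for item in channel.findall("item"):
        link = item.findtext("link", default="")
        items.append({
            # Fall back to the link when the item carries no <guid>.
            "guid": item.findtext("guid") or link,
            "title": item.findtext("title", default=""),
            "link": link,
            "published": item.findtext("pubDate", default=""),
        })
    return ttl, items


def store_items(conn, items):
    """Insert items that are not stored yet; return the number of new rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feed_items ("
        "guid TEXT PRIMARY KEY, title TEXT, link TEXT, published TEXT)"
    )
    before = conn.total_changes
    conn.executemany(
        # The primary key on guid makes repeated runs skip known items.
        "INSERT OR IGNORE INTO feed_items (guid, title, link, published) "
        "VALUES (:guid, :title, :link, :published)",
        items,
    )
    conn.commit()
    return conn.total_changes - before
```

Because duplicates are ignored at insert time, the script can safely run as often as the publisher allows; if `parse_feed` returns a `ttl`, that value is a reasonable lower bound for the cron interval.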

David Dorward
Hmm, the problem is it's a high-intensity feed, i.e. it's updated at least every 3 or 4 minutes. Any other ideas?
Bronwyn
Then check to see if the feed includes data saying you can poll it more frequently, or contact the person responsible for the site and ask permission.
David Dorward
And use a cron job to do the running of the script?
Bronwyn
I don't think the fact that the feed is updated every 4 or 5 minutes is a problem. If the content creator publishes every 4 or 5 minutes, they should expect people to grab the updates every 4 or 5 minutes too. And to be honest, making a request to a website every 4 or 5 minutes is not exactly thrashing it. A cron job seems to be a good fit for what you are trying to do.
Steve Claridge
A: 

If you do control the source of the RSS feed, have a look at the Observer pattern. If not, check whether the source feed supports PubSubHubbub:

A simple, open, server-to-server web-hook-based pubsub (publish/subscribe) protocol as an extension to Atom and RSS. Parties (servers) speaking the PubSubHubbub protocol can get near-instant notifications (via webhook callbacks) when a topic (feed URL) they're interested in is updated.
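
A feed that supports PubSubHubbub advertises its hub with an Atom `<link rel="hub">` element, so one way to check is to look for that element in the feed document. The sketch below does only that discovery step; the function name `find_hub_urls` is made up for the example, and subscribing to a hub is not shown.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def find_hub_urls(feed_xml):
    """Return the href of every Atom <link rel="hub"> element in the feed.

    Works for Atom feeds and for RSS feeds that embed <atom:link> elements.
    An empty list means the feed does not advertise a hub.
    """
    root = ET.fromstring(feed_xml)
    return [
        link.get("href")
        for link in root.iter("{%s}link" % ATOM_NS)
        if link.get("rel") == "hub" and link.get("href")
    ]
```

If the list comes back empty, as it apparently would for the feed discussed here, polling remains the fallback.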

Gordon
Unfortunately it doesn't support PubSubHubbub protocol - I'm running out of ideas here. Somehow I need to get the info from the feed which is constantly updated (it's an emergency services feed) and "cache" it in MySQL.
Bronwyn
A: 

You should check out Zend_Feed_Reader. It provides HTTP conditional GET support, so if the feed's server is properly configured, your script only has to download and parse the feed when it has actually changed.

You don't need the full Zend Framework. Zend_Feed_Reader has very few dependencies so it can be used standalone.
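
To show what a conditional GET involves, independent of any particular library, here is a minimal Python sketch using only the standard library. It is not Zend_Feed_Reader's API; the function name `fetch_if_changed` and its `opener` parameter are made up for the example. The client sends back the `ETag` and `Last-Modified` values it saw last time, and a server that supports conditional requests replies `304 Not Modified` with no body when nothing has changed.

```python
import urllib.error
import urllib.request


def fetch_if_changed(url, etag=None, last_modified=None,
                     opener=urllib.request.urlopen):
    """Fetch url only if it changed since the given validators were issued.

    Returns (body, etag, last_modified). body is None when the server
    answers 304 Not Modified, in which case the old validators are returned.
    """
    request = urllib.request.Request(url)
    if etag:
        request.add_header("If-None-Match", etag)
    if last_modified:
        request.add_header("If-Modified-Since", last_modified)
    try:
        with opener(request, timeout=30) as response:
            return (
                response.read(),
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urllib reports 304 as an HTTPError; here it means "nothing new".
        if err.code == 304:
            return None, etag, last_modified
        raise
```

The caller would persist the returned validators between runs (for example in a small table next to the feed items) and skip parsing whenever the body is `None`. This only saves work if the feed's server actually sends `ETag` or `Last-Modified` headers.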

Benjamin Cremer