I am working on my bachelor's project and I'm trying to figure out a simple dilemma.

It's a website for a football club. Some information will be fetched from the national football association's website (basically the league table and match history). I'm trying to decide on the best way to store this fetched data. I'm considering two possibilities:

1) I will set up a cron job that runs, let's say, every hour. It will call a script that fetches the league table and all other data from the website and stores them in a flat file.

2) I will use a Zend_Cache object to do the same, except the data will be stored in cache files. The cache will be refreshed about every hour as well.

Which is the better approach?

+1  A: 

Well, if you choose 1 it adds some complexity, because you have to use cron as well (not that cron is overly complex). You also have to either check that the data file is complete before using it, or download and parse into a temp location and only move the file into place once it is in the proper format.
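A minimal sketch of the "temp location, then move" idea, in Python just to keep it short. The function name `write_atomically` and the JSON format are my own assumptions, not anything from the question. The point is only that the page-serving code never sees a half-written file, because the final rename swaps the file in one step:

```python
import json
import os
import tempfile


def write_atomically(path, data):
    """Write data as JSON to path so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the final rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if writing or renaming failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The cron script would call this only after the download and parse have succeeded, so a failed run leaves the previous file untouched.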

If you use 2, most of that goes away, except that the request that arrives after the cache has expired has to wait for the download and parse.

I would say 1 is the better option, but 2 is going to be easier to implement and less prone to error. That said, it's fairly trivial to add safeguards to the cron script that prevent the negatives I describe, so I would probably go with 1.

prodigitalson
+2  A: 

I think the answer can be found in why you want to cache the file. Is it to place minimal load on the external server by only updating the cache every so often, or is it to keep pages loading fast because the file takes long to download or process?

If it's only to respect the other server, and fetching/processing the page takes little noticeable time for the end user, I'd just implement Zend_Cache. It's simple: you don't have to worry about one script downloading the page and another script loading the downloaded data (plus the cron job).

If the cache is also needed because fetching/processing the page takes significant time, I'd still use Zend_Cache; however, I'd set the cache to expire every 2 hours, and set up a cron job (or something similar) to manually update the cache every hour. Sure, this adds back the complexity of two scripts (or at least of adding a request flag to manually refresh the cache), but should the cron job fail, you're still fine: the cache simply expires and the next request refetches the data.
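Roughly, the "2-hour expiry plus hourly forced refresh" logic looks like the sketch below. To be clear, this is not the Zend_Cache API; it's a plain-file illustration in Python, and `load_league_data`, `fetch`, `force_refresh` and `CACHE_LIFETIME` are names I made up for the example:

```python
import json
import os
import time

CACHE_LIFETIME = 2 * 60 * 60  # seconds; twice the hourly cron interval


def load_league_data(cache_path, fetch, force_refresh=False, now=None):
    """Return cached data; refetch when forced, missing, or expired."""
    now = time.time() if now is None else now
    if not force_refresh and os.path.exists(cache_path):
        age = now - os.path.getmtime(cache_path)
        if age < CACHE_LIFETIME:
            with open(cache_path, encoding="utf-8") as f:
                return json.load(f)
    # Cache miss, expiry, or forced refresh: hit the external site.
    data = fetch()
    # Write to a temp file first, then swap it in, so readers never
    # see a partial cache file.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    os.replace(tmp_path, cache_path)
    return data
```

The hourly cron job would call this with `force_refresh=True`, while normal page requests call it without the flag. As long as cron keeps running, visitors always hit a warm cache; if cron stops, the first request after the 2-hour lifetime pays for the refetch and everything keeps working.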

Tim Lytle
It's because I want to decrease the load on the external server.
Richard Knop
There you go, just use Zend_Cache.
Tim Lytle