A web app I'm working on requires frequent parsing of diverse web resources (HTML, XML, RSS, etc.). Once downloaded, I need to cache these resources to minimize network load. The app requires a very straightforward cache policy: only re-download a cached resource when more than X minutes have passed since it was last downloaded (its access time).
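To make the policy concrete, here is a minimal sketch of the freshness check. Python is used only for illustration, and `MAX_AGE`, `needs_refresh`, and the 30-minute value are placeholder names and numbers, not part of the app.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for "X minutes"; the real threshold is not decided here.
MAX_AGE = timedelta(minutes=30)


def needs_refresh(last_fetched, now, max_age=MAX_AGE):
    """Return True when the resource was never fetched or is older than max_age."""
    if last_fetched is None:  # nothing cached yet
        return True
    return now - last_fetched > max_age
```

For example, with a resource fetched on 6/29/09 at 10:50 am, `needs_refresh(datetime(2009, 6, 29, 10, 50), datetime(2009, 6, 29, 11, 30))` is true, because 40 minutes exceeds the assumed 30-minute limit.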
Should I:
- Store both the access time (e.g. 6/29/09 at 10:50 am) and the resource itself in the database.
- Store the access time and a unique identifier in the database. The unique identifier is the filename of the resource, which is itself stored on the local disk.
- Use another approach or third-party software solution.
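As a rough illustration of the second option, the sketch below keeps only the URL, filename, and fetch time in a database and writes the resource body to disk. It assumes SQLite and Python purely for the example; the table name `cache`, the helpers `open_index`, `store`, and `load`, and the use of a SHA-256 hash of the URL as the filename are all hypothetical choices.

```python
import hashlib
import sqlite3
import time
from pathlib import Path


def open_index(db_path):
    """Open (or create) the metadata table: one row per cached URL."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "url TEXT PRIMARY KEY, filename TEXT NOT NULL, fetched_at REAL NOT NULL)"
    )
    return conn


def store(conn, cache_dir, url, body, now=None):
    """Write the body to disk and record its filename and fetch time."""
    now = time.time() if now is None else now
    # Assumed naming scheme: hash the URL so the filename is unique and filesystem-safe.
    filename = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / filename).write_bytes(body)
    conn.execute(
        "INSERT OR REPLACE INTO cache (url, filename, fetched_at) VALUES (?, ?, ?)",
        (url, filename, now),
    )
    conn.commit()


def load(conn, cache_dir, url, max_age_seconds, now=None):
    """Return the cached bytes, or None if missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT filename, fetched_at FROM cache WHERE url = ?", (url,)
    ).fetchone()
    if row is None or now - row[1] > max_age_seconds:
        return None  # caller should re-download and call store()
    path = Path(cache_dir) / row[0]
    return path.read_bytes() if path.exists() else None
```

The first option would look much the same, with the body held in a BLOB column instead of a `filename` column pointing at a file.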
Essentially, this question can be rewritten as: "Which is better for storing moderate amounts of data, a database or flat files?"
Thanks for your help! :)
NB: The app is running on a VPS, so size restrictions on the database/flat files do not apply.