I'm using SimpleXML to go through the XML results of a Twitter XML file, but I'm completely lost as to how to cache the results with PHP. This article seems to be of some help, but I've come across memcache (and memcached. C'mon, namers.) as well, and I have no idea what to do.

I'm using this:

$sxml = simplexml_load_file(
    'http://api.twitter.com/1/qworky/lists/qworkyteam/statuses.xml');

foreach($sxml->status as $status){
    $name = $status->user->name;
    $image = $status->user->profile_image_url;
    $update = $status->text;
    $url = "http://twitter.com/" . $status->user->screen_name;
}

to simply store the XML data of a Twitter list into usable variables. But what's the right thing to do? Create a cache file and only run this block of PHP if the cache file is older than ten minutes, otherwise serve up the cached variables? How do I pass the cached variables back-and-forth between the cached file and the DOM? Heck, what kind of extension and filename does a cache file have?

Thanks so much for any way you can point me in a healthy direction here.

+1  A: 

As a general concept, caching doesn't imply a strategy for implementation. The prevalent idea is that you store the information somewhere that gives you more efficient access than the place you originally obtained the data from.

So in this case, it's more efficient to get the data from the disk than it is to requery Twitter (generally, network latency is greater than disk IO latency).

Also, getting data from memory is more efficient than getting the information from disk (because memory latency is less than disk IO latency).

That being said, you can store the values from Twitter in memory, if you wish, or in a file on disk, if you need the values to persist beyond, say, a shutdown. How you do it is up to you (disk or memory, extensions, format, etc.). It's your cache.

The thing you have to be careful of is the cache growing stale. This is when the information that you have in your cache is out of sync with the original data source. You have to make the determination for your application how acceptable stale data is, and requery as appropriate, replacing your cache values.
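A minimal sketch of the file-on-disk variant with a staleness check, assuming a fixed lifetime in seconds (600 would match the ten minutes from the question). The function name `fetch_with_file_cache`, the cache path, and the `$fetch` callback are hypothetical names introduced for illustration, not part of any library.

```php
<?php
/**
 * Return the body for $url, reusing $cacheFile while it is younger than $ttl seconds.
 * $fetch is any callable that takes the URL and returns the body as a string,
 * or false on failure (hypothetical helper, supplied by the caller).
 */
function fetch_with_file_cache(string $url, string $cacheFile, int $ttl, callable $fetch): string
{
    // PHP caches stat() results per request; clear them so filemtime() is current.
    clearstatcache(true, $cacheFile);

    // Serve the cached copy while it is still fresh.
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);
    }

    $body = $fetch($url);

    if ($body === false || $body === '') {
        // The source is unreachable: fall back to stale data if there is any.
        if (is_file($cacheFile)) {
            return file_get_contents($cacheFile);
        }
        throw new RuntimeException("Could not fetch $url and no cached copy exists");
    }

    // LOCK_EX keeps two concurrent requests from interleaving their writes.
    file_put_contents($cacheFile, $body, LOCK_EX);

    return $body;
}
```

Passing `'file_get_contents'` as the callback and handing the returned string to `simplexml_load_string()` would put this in front of the loop from the question. The file name and extension are arbitrary; `.xml` is convenient only because the cached body is XML.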

casperOne
Thanks so much for a thorough answer, casper. This helped beyond just "how to." I ended up going with a PHP solution that rewrites an XML file if it grows older than 10 minutes. Any particular reasons you think I should be scared of this setup? Seems API-sensitive while keeping relatively recent data.
Joshua Cody
+2  A: 

Joshua, take a look at this article about simple PHP caching. You can do exactly as you've described with relative ease (and yes, it is probably the most sensible way to go).

fire
Joshua Cody
A: 

When Twitter speaks of caching here, they are not speaking of PHP's APC, Memcache, or another PHP caching mechanism. Twitter suggests that you keep a copy of the data somewhere on your server for a short time so that you do not need to request the same item from Twitter twice during a user session.

There are two reasons not to repeat the query: 1) you will soon run into Twitter's API limits, and 2) your app will appear slow.

What you can do is create a simple database table with three fields: retrieval time, URL hash, and the data retrieved. You can then query this table first, before sending the request to Twitter.
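A rough sketch of that three-field table, assuming PDO with SQLite purely for illustration. The table name `api_cache`, the function name `fetch_with_db_cache`, the use of `md5()` for the URL hash, and the `$fetch` callback are all hypothetical choices; `REPLACE INTO` works in SQLite and MySQL but not in every database.

```php
<?php
/**
 * Look up $url in a cache table before calling $fetch.
 * Columns mirror the three fields: URL hash, retrieval time, retrieved data.
 * $fetch is a caller-supplied callable returning the body as a string, or false on failure.
 */
function fetch_with_db_cache(PDO $db, string $url, int $ttl, callable $fetch): string
{
    $db->exec('CREATE TABLE IF NOT EXISTS api_cache (
        url_hash     CHAR(32) PRIMARY KEY,
        retrieved_at INTEGER NOT NULL,
        data         TEXT NOT NULL
    )');

    $hash = md5($url);

    // Only accept rows retrieved within the last $ttl seconds.
    $select = $db->prepare(
        'SELECT data FROM api_cache WHERE url_hash = ? AND retrieved_at > ?'
    );
    $select->execute([$hash, time() - $ttl]);
    $cached = $select->fetchColumn();
    if ($cached !== false) {
        return $cached;
    }

    // Cache miss or stale row: ask the remote source.
    $body = $fetch($url);
    if ($body === false) {
        throw new RuntimeException("Could not fetch $url");
    }

    // Insert a new row, or overwrite the stale one with the same hash.
    $store = $db->prepare(
        'REPLACE INTO api_cache (url_hash, retrieved_at, data) VALUES (?, ?, ?)'
    );
    $store->execute([$hash, time(), $body]);

    return $body;
}
```

Hashing the URL keeps the key a fixed length, so the same table can hold responses for several different API calls.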

Getting back to PHP caching systems: yes, it's possible to save your Twitter data there, but that's not what those caches are intended for. What you should save in APC are items of data that you will be using very often. These are usually kept in memory.

e4c5