I'm struggling with automated data collection by a PHP script from a web server. The files in question contain meteorological data and are updated every 10 minutes. Oddly enough, the 'file modified' date on the web server doesn't change.

A simple fopen('http://...') call tries to fetch the freshest version of the last file in this directory every hour. But I regularly end up with a version that is up to 4 hours old. This happens on a Linux server which (as my system administrator has assured me) doesn't use a proxy server of any kind.

Does PHP implement its own caching mechanism? Or what else could be interfering here?

(My current workaround is to grab the file via exec('wget --no-cache...'), which works.)

+5  A: 

Since you're getting the file via HTTP, I'm assuming that PHP will be honouring any cache headers the server is responding with.

A very simple and dirty way to avoid that is to append a random GET parameter to each request.
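A minimal sketch of that idea, under some assumptions: the helper name `addCacheBuster` and the parameter name `_nocache` are made up for illustration, the URL is assumed to have no `#fragment`, and the server is assumed to ignore query parameters it doesn't know. It needs PHP 7 or later for `random_bytes()` and the type declarations.

```php
<?php
// Hypothetical helper: appends a throwaway query parameter so that every
// request URL is unique and cannot match a previously cached response.
function addCacheBuster(string $url, string $param = '_nocache'): string
{
    // Use '&' if the URL already has a query string, '?' otherwise.
    $separator = (parse_url($url, PHP_URL_QUERY) === null) ? '?' : '&';

    // 8 random bytes, hex-encoded, give a 16-character token.
    return $url . $separator . $param . '=' . bin2hex(random_bytes(8));
}

// Usage (example.com is a placeholder, not the real data source):
// $handle = fopen(addCacheBuster('http://example.com/data.dat'), 'r');
```

Because the token changes on every call, any cache keyed on the full URL has to treat each request as a new resource.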

nickf
This does indeed solve the problem, but doesn't really answer the question: Why does this happen?
christian studer
A: 

So if I'm understanding you correctly, part of the problem might be that the *.dat file always has a timestamp of 1:00 AM? Do you have control of the server containing the data (http://www.iac.ethz.ch/php/chn_meteo_roof/)? If so, you should try to find out why the data always has the same timestamp. I have to believe it is being set intentionally; the OS will update the timestamp when the file is modified unless you go out of your way to make it not do so. If you can't figure out why it is being set to 1:00 AM, you could at least run a "touch" command on the file, which will update its modified timestamp.

This is all, of course, assuming you have some access to the server providing the files.

Kip
No, I don't have control there.
christian studer
A: 

Why not try using curl? I think it is a more appropriate tool for this.

Gabriel Sosa
curl has the same problem. As does wget unless I use the --no-cache parameter.
christian studer
Yes, but you can set a curl option for that and get a cleaner code flow than by running exec. Regards.
Gabriel Sosa
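The comment doesn't say which curl option is meant, so the following is only a guess at what such a setup could look like. It assumes the PHP curl extension is installed and sends the same `Cache-Control: no-cache` and `Pragma: no-cache` request headers that wget's `--no-cache` sends. The function names `noCacheCurlOptions` and `fetchFresh` are hypothetical. Note that `CURLOPT_FRESH_CONNECT` only forces a new connection; it is not an HTTP cache switch in itself.

```php
<?php
// Hypothetical helper: curl options that ask caches along the way to
// revalidate with the origin server instead of serving a stored copy.
function noCacheCurlOptions(): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_FRESH_CONNECT  => true,   // do not reuse a kept-alive connection
        CURLOPT_TIMEOUT        => 30,     // give up after 30 seconds
        CURLOPT_HTTPHEADER     => [
            'Cache-Control: no-cache',
            'Pragma: no-cache',
        ],
    ];
}

// Hypothetical helper: fetches $url and returns the response body,
// throwing if the transfer fails.
function fetchFresh(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, noCacheCurlOptions());

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('curl failed: ' . curl_error($ch));
    }
    return $body;
}

// Usage (example.com is a placeholder):
// $data = fetchFresh('http://example.com/data.dat');
```

Whether the headers have any effect depends on the cache in question honouring them.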
A: 

Maybe this can resolve your problem (responses to POST requests are not cached, as far as I know):

// Send an empty POST instead of a GET, so that a cache is unlikely
// to answer the request from a stored copy.
$opts = array(
  'http' => array(
    'method'  => 'POST',
    'content' => ''
  )
);
$context  = stream_context_create($opts);
$resource = fopen('http://example.com/your-url', 'r', false, $context);

/* Or use file_get_contents to retrieve the whole file:
   $fileContent = file_get_contents('http://example.com/your-url', false, $context);
*/
Eineki
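If the server does not accept POST for this resource, a variation on the stream context above is to stay with GET and add the request headers that wget's `--no-cache` sends. This is a sketch, not a verified fix for this particular server: the function name `noCacheHttpContext` is hypothetical, and it only helps if whatever is caching the response honours these headers.

```php
<?php
// Hypothetical helper: builds an HTTP stream context for a plain GET
// that carries "do not serve me a cached copy" request headers.
function noCacheHttpContext()
{
    $opts = [
        'http' => [
            'method'  => 'GET',
            // Header lines are separated by CRLF.
            'header'  => "Cache-Control: no-cache\r\nPragma: no-cache\r\n",
            'timeout' => 30, // seconds before the read gives up
        ],
    ];
    return stream_context_create($opts);
}

// Usage (example.com is a placeholder):
// $data = file_get_contents('http://example.com/your-url', false, noCacheHttpContext());
```

This keeps the original fopen/file_get_contents flow and avoids shelling out to wget through exec.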