Hi,

I want to create a script that checks a URL and performs an action (download + unzip) when the "Last-Modified" header of the remote file changes. I thought about fetching the header with curl, but then I would have to store it somewhere for each file and perform a date comparison.

Does anyone have a different idea using (mostly) standard unix tools?

thanks

+2  A: 

A possible solution would be periodically running this algorithm on the client box.

  1. Create an HTTP request whose If-Modified-Since header is set to the modification date of your local file. If the file does not exist yet, omit this header.
  2. The server will either send you the file, if it has changed since the date in the If-Modified-Since request header, or reply with a 304 Not Modified HTTP status.
  3. If you receive a 200 OK HTTP status, save the payload from the HTTP body and unzip the file.
  4. If, on the other hand, you receive a 304 Not Modified, you know that your file is up to date.
  5. After a download, use the Last-Modified response header to touch your local file. This way you will be in sync with the server's datetime.
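
The steps above could be sketched roughly like this with curl. This is only an illustration, not a tested production script: the function names `action_for_status` and `fetch_if_newer` are made up for the example, and it assumes a curl with the `-z`, `-R` (set the local file time from the server) and `-w '%{http_code}'` options.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical helpers for this example.

# Map an HTTP status code to the action described in the steps above.
action_for_status() {
    case "$1" in
        200) echo update ;;   # new content was sent
        304) echo skip ;;     # local copy is up to date
        *)   echo error ;;    # anything else: leave the local copy alone
    esac
}

# Download URL to FILE only if the server copy is newer; print the action taken.
fetch_if_newer() {
    url=$1
    file=$2
    tmp="$file.part"
    if [ -f "$file" ]; then
        # -z FILE makes curl send If-Modified-Since with FILE's mtime.
        status=$(curl -s -R -z "$file" -o "$tmp" -w '%{http_code}' "$url")
    else
        # No local copy yet: plain unconditional request.
        status=$(curl -s -R -o "$tmp" -w '%{http_code}' "$url")
    fi
    action=$(action_for_status "$status")
    if [ "$action" = update ] && [ ! -s "$tmp" ]; then
        # Status was 200 but nothing was written, so treat it as "no change".
        action=skip
    fi
    if [ "$action" = update ]; then
        # -R already set the mtime from Last-Modified; mv keeps it.
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
    fi
    echo "$action"
}
```

A caller would then unzip only on `update`, for example `[ "$(fetch_if_newer http://remote.server.com/remote.zip local.zip)" = update ] && unzip -o local.zip` (the URL and file name are placeholders). Downloading to a temporary `.part` file first means a failed transfer never clobbers the existing local copy.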

Another way would be for the server to push notifications (a broadcast packet, for example) when the file changes. When the notification is received, the client would then execute the above algorithm. This would require code living in the HTTP server that listens for file system changes and then broadcasts them to interested parties.

Perhaps this info for the curl command is of some importance:

TIME CONDITIONS

HTTP allows a client to specify a time condition for the document it requests. It is If-Modified-Since or If-Unmodified-Since. Curl allows you to specify them with the -z/--time-cond flag.

For example, you can easily make a download that only gets performed if the remote file is newer than a local copy. It would be made like:

curl -z local.html http://remote.server.com/remote.html

Or you can download a file only if the local file is newer than the remote one. Do this by prepending the date string with a '-', as in:

curl -z -local.html http://remote.server.com/remote.html

You can specify a "free text" date as the condition. Tell curl to only download the file if it was updated since yesterday:

curl -z yesterday http://remote.server.com/remote.html

Curl will then accept a wide range of date formats. You can always reverse the date check by prepending it with a dash '-'.

To sum up, you will need:

smink
Nice :) `curl -z`. Too bad the HTTP server seems to ignore If-Modified-Since :( But maybe curl will fix it. I'll try :)
ZeissS
Have you got the format for `If-Modified-Since` right with the tool you use? See [here](http://www.w3.org/Protocols/HTTP/HTRQ_Headers.html#if-modified-since). The format is as in [RFC850](http://www.w3.org/Protocols/rfc850/rfc850.html#z10), but GMT MUST BE USED. Anyway, `curl -z` should avoid headaches over getting the format of the `If-Modified-Since` header right.
smink
Yeah, got it right, but the damn Oracle server seems to ignore it. It even sends the content when requesting with HEAD ;) But never mind, it works now using `curl -z`
ZeissS
+1  A: 

Is Java applicable in your case? I did a similar thing in one of my homework assignments using the Apache HttpCore library. You need to add the "If-Modified-Since" header to your HTTP request before you send it to the server. If the status code of the response that you receive from the server is not 304, then you know that the file has changed since the time value that you're checking against.
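
For comparison, the same explicit-header check can be sketched without Java by building the header by hand and passing it to curl. This is a rough illustration only: `http_date` and `remote_changed` are hypothetical helper names, and `date -r FILE` (read a file's modification time) is assumed to behave as in GNU date. The sketch treats only a 200 as "changed", which is slightly stricter than "anything but 304", so that network errors are not mistaken for an update.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; assumes GNU-style `date -r FILE`.

# Print FILE's modification time as an HTTP date, always in GMT.
http_date() {
    LC_ALL=C date -u -r "$1" '+%a, %d %b %Y %H:%M:%S GMT'
}

# Succeed (exit 0) if the server says URL changed after FILE was last modified.
remote_changed() {
    url=$1
    file=$2
    # Send the header explicitly and keep only the status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' \
        -H "If-Modified-Since: $(http_date "$file")" "$url")
    [ "$status" = 200 ]
}
```

A script could then run `remote_changed http://remote.server.com/remote.zip local.zip && echo "needs update"` (placeholder URL and file name). As the comments on the other answer point out, a server that ignores the header will always answer 200 here.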

Noona