I am wondering if anyone has any plugins or capistrano recipes that will "pre-heat" the page cache for a rails app by building all of the page cached html at the time the deployment is made, or locally before deployment happens.

I have some mostly static sites that do not change much, and they would run faster if the html was already written, instead of each page being generated only when the first visitor requests it.

Rather than create this myself (it seems easy, but it is low priority), does something like this already exist?

+4  A: 

You could use wget or another program to spider the site. In fact, the wget manual page mentions this sort of scenario in its description of the --delete-after option:

This option tells Wget to delete every single file it downloads, after having done so. It is useful for pre-fetching popular pages through a proxy, e.g.:

   wget -r -nd --delete-after http://whatever.com/~popular/page/

The -r option is to retrieve recursively, and -nd to not create directories.

Ant P.
+1  A: 

I have a set of integration tests that confirm all of the main areas of the site are available (a few hundred pages in total). They don't do anything that changes data - they just pull back the pages and forms.

I don't currently run them when I deploy my production instance, but now you mention it - it may actually be a good idea.

Another alternative would be to pull every page that appears in your sitemap (if you have one, which you probably should). It should be really easy to write a gem / rake script that does that.
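
As a rough illustration of the sitemap idea, here is a minimal Ruby sketch, not a tested recipe. The method names sitemap_urls and warm_cache are made up for this example, the one-second delay is an arbitrary choice, and the script assumes a flat sitemap (not a sitemap index) that it reads with a simple regular expression rather than a full XML parser. Redirects are not followed.

```ruby
require 'net/http'
require 'uri'

# Pull the <loc> values out of a sitemap document.
# Assumption: a flat <urlset>; only the &amp; entity is decoded.
def sitemap_urls(xml)
  xml.scan(%r{<loc>\s*(.*?)\s*</loc>}m).flatten.map { |loc| loc.gsub('&amp;', '&') }
end

# Request each URL once so the app writes its page-cached html.
# Returns a hash of url => HTTP status code (as a string).
def warm_cache(urls, delay: 1)
  urls.each_with_object({}) do |url, statuses|
    response = Net::HTTP.get_response(URI.parse(url))
    statuses[url] = response.code
    sleep delay # throttle so the warm-up does not hammer the server
  end
end

# Run only when a sitemap URL is given on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV.first
  sitemap = Net::HTTP.get(URI.parse(ARGV.first))
  warm_cache(sitemap_urls(sitemap)).each { |url, code| puts "#{code} #{url}" }
end
```

Saved under a name of your choosing, it could be run after a deploy by passing the sitemap address as the only argument, and the printed status codes give a quick check that every page responded.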

RichH
A: 

Preloading this way -- generally, with a cron job set to start at 10pm Pacific and terminate at 6am Eastern time -- is a nice way to load-balance your site.

Check out the spider_test rails plugin for a simple way to do this in testing.

If you're going to use the wget above, add the --level=, --no-parent, --wait=SECONDS and --waitretry=SECONDS options to throttle your load, and you might as well log and capture the header responses for diagnosis or analysis (change the path from /tmp if desired):

wget -r --level=5 --no-parent --delete-after \
  --wait=2 --waitretry=10  \
  --server-response        \
  --append-output=/tmp/spidering-`date "+%Y%m%d"`.log \
  'http://whatever.com/~popular/page/'
mrflip
A: 

I use a rake task that looks like this to refresh my page cached sitemap every night:

require 'action_controller/integration'

# Expire the stale cached copy, then request the page again so it is re-cached.
ActionController::Base::expire_page("/sitemap.xml")
app = ActionController::Integration::Session.new
app.host = "notexample.com"
app.get("/sitemap.xml")

See http://gist.github.com/122738

mpearce