Hello -

I have several web applications in production that use NFS mounts to share resources (usually static asset files) among web heads. If an NFS mount becomes unavailable, Apache hangs on requests for files it cannot access, and the kernel logs:

Nov 2 14:21:20 server2 kernel: nfs: server server1 not responding, still trying

I reproduced the behavior in RHEL5 running NFS v3 and Apache 2.2.3:

  1. Export an NFS share on Server1 (contents of my /etc/exports)

    /srv/test_share server2(rw)

  2. Mount the NFS share on Server2 (contents of my /etc/fstab)

    server1:/srv/test_share /mnt/test_share nfs defaults 0 0

  3. Set up a virtual host in Apache with a simple HTML file referencing image files stored on the NFS share

  4. Load the site; the HTML and image files all return 200

  5. Unmount the NFS share; loading the page now returns 404s for the referenced images

  6. Remount the NFS share

  7. Simulate an NFS crash by turning NFS off on Server1; reloading the site hangs while retrieving the referenced files.

Internet searches so far have not turned up a good solution. The desired behavior is for the web server to return 404s while the NFS mount is down, rather than hanging until it recovers.

Cheers,

Ben

A: 

I would not directly serve from the NFS mount, but instead from your local filesystem.

It wouldn't be too hard to set up a cron job that syncs the NFS mount to the local file system every few minutes. Apache would serve its content from the local copy and would not depend on the NFS mount. If the mount goes down, Apache would still be able to serve the assets, although they might be out of date until the NFS mount comes back up.
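
A rough sketch of such a sync job, assuming rsync is installed. The function name sync_assets, the local path /var/www/test_share, the script location, and the 5-minute schedule are all made up for illustration:

```shell
#!/bin/sh
# Sketch only: sync_assets, the paths, and the schedule below are
# illustrative assumptions, not part of the original setup.

# Mirror the contents of SRC into DEST, deleting local files that no
# longer exist in SRC.
sync_assets() {
    src=$1
    dest=$2
    # An unmounted share leaves an empty mount point directory behind.
    # Refuse to sync from an empty or missing source, so that --delete
    # does not wipe the local copy.
    [ -n "$(ls -A "$src" 2>/dev/null)" ] || return 1
    mkdir -p "$dest" || return 1
    # Trailing slashes: copy the contents of src, not the directory itself.
    rsync -a --delete "$src/" "$dest/"
}

# Example crontab entry (every 5 minutes), assuming this file is saved as
# /usr/local/bin/sync_assets.sh and ends with a call such as:
#   sync_assets /mnt/test_share /var/www/test_share
# */5 * * * * /usr/local/bin/sync_assets.sh
```

Note that if the share is hard mounted and the server stops responding, the sync itself will block on the mount, but Apache keeps serving the local copy.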

phantombrain
That would be ideal, and a solution we have used, but won't work in the situation where the webhead has a small 50GB hard drive and there are 250GB of assets. We have also utilized S3 as an alternative, but a lot of the files need to be manipulated and are more than just static assets.
benr75
+1  A: 

A couple of options:

  • get your NFS mount options right: you need a soft mount so that NFS access can be interrupted. Try soft,intr,timeo=10 instead of the defaults
  • sync your document roots with something else like rsync, or script yourself a semi-automatic checkout/export from your SCM, if you use one. Using an SCM is recommended anyway; it lets you revert to the last working version, for instance
  • use a real distributed filesystem (preferably a fault-tolerant one like Coda) or even a distributed block device system like DRBD

Options 2 and 3 give you disconnected operation and are therefore much more robust than NFS. DRBD is sexy, but my advice would be option 2 with something like git or svn: simple and robust.
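
For option 1, the fstab entry from the question might look something like this with the suggested options applied. This is only a sketch; the right timeout depends on your network and how quickly you want requests to fail:

```
# /etc/fstab on server2
# soft     : return an I/O error to the caller once retries are exhausted,
#            instead of retrying forever (the hard-mount default)
# intr     : allow signals to interrupt a pending NFS operation
# timeo=10 : wait 1 second before retrying (the unit is tenths of a second)
server1:/srv/test_share /mnt/test_share nfs soft,intr,timeo=10 0 0
```

The number of retries before a soft mount gives up is controlled by the retrans option. Be aware that on a read-write export, a soft mount can lose or corrupt data if a write times out, so it is safest for content that is only read.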

pfote
Soft mounting and setting the timeout value does allow Apache to return 404s. Next step: monitor the mount... I set my timeout a bit lower, to 2. When the NFS mount became available again, things were served properly.
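
One possible way to monitor the mount, as a minimal sketch: run a filesystem query against it under a time limit and alert if it fails or does not return. The function name check_mount and the 5-second limit are illustrative assumptions, and the timeout utility (GNU coreutils) must be available:

```shell
#!/bin/sh
# Sketch only: check_mount and the 5-second default are illustrative
# assumptions. Requires the timeout utility from GNU coreutils.

# Return 0 if the filesystem holding DIR answers within LIMIT seconds,
# non-zero if the query fails or has to be killed.
check_mount() {
    dir=$1
    limit=${2:-5}
    # df asks the filesystem for its usage statistics; timeout kills
    # the query if it has not finished after LIMIT seconds.
    timeout -s KILL "$limit" df -P "$dir" >/dev/null 2>&1
}

# Example use from cron or a monitoring agent:
#   check_mount /mnt/test_share 5 || echo "NFS mount /mnt/test_share not responding"
```

This only detects an unresponsive server. If the share has been unmounted, df reports on the underlying local filesystem and the check still succeeds, so a separate check that the mount is present is needed for that case.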
benr75