After reading http://stackoverflow.com/questions/3748/storing-images-in-db-yea-or-nay I think that the file system is the right place for storing images. But I would like to know how you handle backup and version control of uploaded images across your different environments (dev/stage/prod), and how you handle network load balancing.

These problems are pretty easy to handle when working with a database, e.g. by making a backup of the production environment and restoring the DB in the development environment.

What do you think of using, for example, git to handle version control of uploaded files? For example:

Production Environment:

  • An image is uploaded to a shared folder on the web server.
  • Metadata is stored in the database.
  • The image is automatically added to a git repository.
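A minimal sketch of the last step, assuming the shared upload folder is itself a git working tree and that the upload handler can call a small helper after saving the file. The function names and the example paths are hypothetical, and POSIX-style paths are assumed.

```python
import subprocess
from pathlib import PurePosixPath


def git_add_commit_commands(repo_dir, image_path, message=None):
    """Build the git commands that record one uploaded image."""
    repo = PurePosixPath(repo_dir)
    # Path of the image relative to the repository root.
    rel = str(PurePosixPath(image_path).relative_to(repo))
    msg = message or f"Add uploaded image {rel}"
    return [
        ["git", "-C", str(repo), "add", "--", rel],
        ["git", "-C", str(repo), "commit", "-m", msg, "--", rel],
    ]


def commit_uploaded_image(repo_dir, image_path):
    """Run the commands; raises CalledProcessError if git fails."""
    for cmd in git_add_commit_commands(repo_dir, image_path):
        subprocess.run(cmd, check=True)
```

Calling commit_uploaded_image("/srv/uploads", "/srv/uploads/2010/cat.png") after each upload would create one commit per image; batching commits on a timer would be another option.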

Developer at work:

  • Checks out the source code.
  • Runs a script to restore the database.
  • Runs a script to get the latest images.

I think the solution above is pretty smooth for the developer: the images will be under version control and the environments can be isolated from each other.
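The "get the latest images" script could look roughly like this. It is only a sketch: it assumes the production image repository is reachable at some URL the developer can clone from, and the function names are hypothetical.

```python
import os
import subprocess


def fetch_images_command(repo_url, target_dir, already_cloned):
    """Return the git command that brings target_dir up to date."""
    if already_cloned:
        # Fast-forward only, so local experiments are never merged silently.
        return ["git", "-C", target_dir, "pull", "--ff-only"]
    return ["git", "clone", repo_url, target_dir]


def fetch_images(repo_url, target_dir):
    """Clone the image repository on first use, pull afterwards."""
    cloned = os.path.isdir(os.path.join(target_dir, ".git"))
    subprocess.run(fetch_images_command(repo_url, target_dir, cloned), check=True)
```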

+1  A: 

It can work, but I would store those images in a git repository which would then be a submodule of the git repo with the source code.
That way, a strong relationship exists between the code and the images, even though the images are in their own repo.
Plus, it avoids issues with git gc or git prune being less efficient with a large number of binary files: if the images are in their own repo, and with few variations for each of them, the maintenance on that repo is fairly light. Whereas the source code repo can evolve much more dynamically, with the usual git maintenance commands in play.
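As a rough sketch of that setup, the commands involved are `git submodule add` (run once in the source repo) and `git submodule update --init --recursive` (run by each developer after cloning). The helper names, the "images" path and the repository URL are assumptions for illustration.

```python
import subprocess


def submodule_setup_commands(images_repo_url, path="images"):
    """Commands run once, inside the source-code repo, to link the image repo."""
    return [
        ["git", "submodule", "add", images_repo_url, path],
        # 'submodule add' stages .gitmodules and the new path, so commit them.
        ["git", "commit", "-m", f"Add {path} submodule"],
    ]


def submodule_checkout_command():
    """Command each developer runs after cloning the source-code repo."""
    return ["git", "submodule", "update", "--init", "--recursive"]


def run_all(commands):
    """Execute a list of commands, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

The source repo then records which commit of the image repo it expects, which is where the relationship between code and images comes from.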

VonC
Thanks for the feedback regarding git. My first idea was to use a repository completely separated from the source code, but submodules seem to be another option. I feel pretty confident that it will be worth running a spike to see if it's possible.
orjan
+1  A: 

For us, version control isn't as important as distribution. Metadata is added via the web admin and the images are dropped on the admin server. Rsync scripts push those out to the cluster that serves prod images. For dev/test, we just rsync from the prod master server back to the dev server.
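A minimal sketch of such a push script, assuming SSH access from the admin server to each node. The host names and directories are hypothetical, and `--delete` (which removes files on the nodes that no longer exist on the admin server) is a choice you may not want.

```python
import subprocess


def rsync_push_commands(source_dir, hosts, dest_dir):
    """Build one rsync command per cluster node."""
    # A trailing slash makes rsync copy the directory's contents,
    # not the directory itself.
    src = source_dir.rstrip("/") + "/"
    return [
        ["rsync", "-az", "--delete", src, f"{host}:{dest_dir}"]
        for host in hosts
    ]


def push_images(source_dir, hosts, dest_dir):
    """Push the images to every node; raises if any rsync run fails."""
    for cmd in rsync_push_commands(source_dir, hosts, dest_dir):
        subprocess.run(cmd, check=True)
```

Pulling prod images back to a dev machine is the same command with source and destination swapped.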

Rsync is great for load balancing and distribution. If you sub in git for the admin/master server, you have a pretty good solution.

If you're OK with a backup that preserves file history as of the time of each backup (as opposed to version control with every revision), then some adaptation of this may help: Automated Snapshot-style backups with rsync.
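One way to get snapshot-style backups is rsync's `--link-dest` option, which hard-links files that are unchanged since the previous snapshot so each snapshot looks complete but only new or changed files take extra space. This is a sketch of that variant, not necessarily the procedure the linked article describes. The directory layout and function name are assumptions, and backup_root is assumed to be an absolute path.

```python
import subprocess


def snapshot_command(source_dir, backup_root, snapshot_name, previous_name=None):
    """Build the rsync command for one dated snapshot."""
    root = backup_root.rstrip("/")
    src = source_dir.rstrip("/") + "/"
    dest = f"{root}/{snapshot_name}/"
    cmd = ["rsync", "-a"]
    if previous_name:
        # Unchanged files become hard links into the previous snapshot.
        cmd.append(f"--link-dest={root}/{previous_name}")
    return cmd + [src, dest]


def take_snapshot(source_dir, backup_root, snapshot_name, previous_name=None):
    """Run the snapshot; raises CalledProcessError if rsync fails."""
    subprocess.run(
        snapshot_command(source_dir, backup_root, snapshot_name, previous_name),
        check=True,
    )
```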

Matt
Thanks for the feedback. I think distribution and backup are more important for us than versioning.
orjan