views: 62 · answers: 2
Sorry for the crappy title.

I have a repository of product images (approximately 55,000 and growing by about 1000 a year) that changes daily (up to 100 images added, modified and/or deleted every day).

I need three people to have read/write access to these directories so they can make the above changes. They will all be using Windows Vista PCs.

I also need to be able to host the images so that vendors can stay up to date with the changes on a daily basis. There are about 100 vendors.

The system I am thinking about implementing now would involve using Subversion.

In the trunk I would have the images (broken down into multiple directories and subdirectories). The three people would each have a working copy on their local machine so they can make the necessary changes, and we wouldn't have to worry about conflicts. Plus, everybody could easily stay up to date with the repository (not to mention the obvious benefits of versioning and backup).

I would have a public, read-only URL to the trunk so that the vendors can only check out changes. This is good because I could give them instructions on how to check out the repository and set up a cron job to update their copy daily, so they are always up to date.

All of the vendors have enough technical expertise to set up a cron job and an SVN working copy on their servers.
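The vendor-side setup described above could be sketched roughly like this; the repository URL and local directory are assumptions, and the script prints the svn command it would run rather than running it:

```shell
#!/bin/sh
# Hypothetical vendor-side sync script. REPO_URL and the local image
# directory are placeholders; a daily cron job would run this script.
REPO_URL="http://svn.example.com/images/trunk"

sync_cmd() {
    # Print the svn command the daily sync would run for directory $1:
    # an update if a working copy already exists there, a one-time
    # checkout otherwise.
    if [ -d "$1/.svn" ]; then
        echo "svn update $1"
    else
        echo "svn checkout $REPO_URL $1"
    fi
}

sync_cmd "$HOME/product-images"
```

With the `echo`s removed so the commands actually run, a crontab entry such as `30 2 * * * /usr/local/bin/sync-images.sh` would keep a vendor at most one day behind the repository.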

This all feels a little hacky (I consider it a hack any time I use something for a purpose it wasn't designed for).

My questions are: does anybody see any drawbacks to this solution? Are there other solutions that may be better for what I am trying to do?

I considered using Dropbox to sync across all of these servers, but I don't want the vendors to be able to make any changes.

My goals are:

  1. Make maintenance easier for my designers.
  2. A one-time setup for vendors, after which they will always be up to date with our images.
  3. Having a decent backup/restore and rollback system in place just in case of a crisis.
+1  A: 

Your solution seems as good as any. All you need is a small client-side script that retrieves the images from your server.

The only thing I don't really like is the use of Subversion to hold images. It works, but changes to binary files may be stored as full new files rather than compact deltas. This might not be a concern for you.

Also, using rsync would let you transfer only the changed files, saving bandwidth. You might also want to use SSH for the transfers.
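As a sketch of the rsync-over-SSH idea (the host and paths here are hypothetical), a vendor could pull the image tree with a single command:

```shell
# Hypothetical: mirror the server's image tree into a local directory.
# -a preserves the tree structure and timestamps, -z compresses on the
# wire, --delete mirrors deletions, and -e ssh tunnels the transfer.
rsync -az --delete -e ssh vendor@images.example.com:/srv/images/ "$HOME/product-images/"
```

Because rsync only sends the changed portions of files, the daily transfer stays small even as the repository grows.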

One more thing: you might want to schedule the sync scripts at different hours of the day across your clients, to even out the load and make the sync feel faster for them.
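One hedged way to stagger those schedules without hand-assigning time slots is to derive each vendor's cron minute from a hash of its name (the vendor name below is made up):

```shell
#!/bin/sh
# Map a vendor name to a deterministic minute offset in 0-59, so each
# vendor's daily sync fires at a different minute of the hour.
stagger_minute() {
    # cksum gives a stable CRC for the name; take it modulo 60
    echo $(( $(printf '%s' "$1" | cksum | cut -d' ' -f1) % 60 ))
}

stagger_minute "acme-corp"   # some fixed value between 0 and 59
```

Each vendor would then plug its own offset into the minute field of its crontab entry; the same name always yields the same minute, so the schedule is stable across reinstalls.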

voyager
Good point on the changed images, but the full new file isn't much of a concern (unless the size of the full repository gets completely out of control). I considered the rsync approach, but many of the clients keep the images on internal servers, so the method needs to involve them pulling the images rather than me pushing them.
Bill H
+1  A: 

Your solution makes sense overall, though I share the concern about the size of a repository with that many images. At some level a content management system specifically intended for images might make more sense (though I don't know of one to recommend).

One thing I would suggest changing, though: unless the vendors actually need to be able to edit/commit files, consider setting up a scheduled job to export to a file system (which is then made available via FTP/HTTP/etc.). Having the vendors perform a checkout means they get all the overhead of a working copy. (Then you end up fielding the question, "What are all these .svn directories and what are these other files?")
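Assuming the export runs server-side into a directory the web server already serves, the scheduled job might be no more than a crontab line (the URL and paths are hypothetical):

```shell
# Server-side crontab sketch: every night at 01:00, export a clean copy
# of the trunk (no .svn directories) into the web root vendors fetch from.
# --force lets the export overwrite the existing directory in place.
0 1 * * *  svn export --force http://svn.example.com/images/trunk /var/www/images
```

Vendors then need nothing more exotic than wget or FTP on their side, which sidesteps the working-copy overhead entirely.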

ThatBlairGuy
That is exactly the problem I have run into. The checkout process works OK, but the .svn directories and files add so much overhead that I am compromising performance (in a major way) for usability. Vendors shouldn't be able to make any repo changes (they would be given read-only access). It's hard to believe this problem hasn't been solved before. I am going to keep experimenting, I suppose. Thanks for the help!
Bill H
That's why I suggested you use export instead of checkout. Just to avoid any ambiguity, what I'm suggesting is "svn export" instead of "svn checkout". It won't mirror any deletions, but it will copy everything over. Overall, rsync is a better solution: less bandwidth, and it mirrors deletions. I believe it can be initiated from the vendor's internal system (not something I use very often).
ThatBlairGuy
The problem with svn export (and correct me if I am wrong) is that it will re-download everything in the repo every time it's called. I've used rsync plenty of times, but I have to cater to a multitude of operating systems (some of which don't support rsync). Thanks for the suggestions.
Bill H
I believe you're correct, export will download everything, every time, while checkout will bring down all the .svn folders on the initial download. (After that, of course, you'd use update.) Neither solution is a clear victory. I know rsync is native to *nix, but I've used a cygwin port on Windows. If neither svn solution is optimal, it might be worth looking for other rsync ports.
ThatBlairGuy