The book 'Git Internals' mentions using git as a peer-to-peer content distribution network on p. 50, but doesn't go into much detail, especially on how to have several branches that each track different files. Example:

(working dir) a00.exe a01.exe b00.exe c00.exe c01.exe c02.exe

The master branch tracks all files, while branch A tracks only a00.exe and a01.exe, branch B tracks only b00.exe, and so on. The next commit will update a00.exe, b00.exe and c00.exe. How do I create branches like this? Once all branches are committed, can I fetch only a certain branch from the remote? Thanks!

+3  A: 

There was a video/talk on this topic by Scott Chacon, the author of Git Internals. He talks about a content distribution network for ads in some kind of mall. Inspiring: http://www.techscreencast.com/language/ruby/using-git-in-ruby-applications---scott-chacon-/1431

The MYYN
+5  A: 

You will need a script of some sort to build the various branches of content for you. The basic way to do this is to add the content to the database (in your case, just by committing the files to the master branch). Then, for each branch, read the contents you want into a temporary index (git read-tree/git update-index), write that index out as a tree (git write-tree), write a commit object for the tree (git commit-tree), and update the branch to point at the new commit (git update-ref). These are all plumbing commands that are not normally used in day-to-day operations, but they let you build snapshots without having all the contents in a directory on disk at the same time.

An example script to do something like this is here:

http://github.com/schacon/gitcrazy/blob/master/update_content.rb
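Separately from the linked script, the plumbing sequence described above can be sketched in plain shell. This is only an illustration under stated assumptions, not the script's actual code: the throwaway repository, the file contents, the temporary index name `index.branchA` and the branch name `A` are all made up for the example.

```shell
#!/bin/sh
# Sketch: build a branch that holds only a subset of files, using plumbing
# commands and a temporary index. All names here are hypothetical.
set -e

# Throwaway repository so the sketch is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Step 1: get all content into the object database by committing it normally.
for f in a00.exe a01.exe b00.exe; do
  echo "content of $f" > "$f"
done
git add .
git commit -q -m "all content"

# Step 2: stage only a00.exe and a01.exe in a temporary index.
export GIT_INDEX_FILE="$repo/.git/index.branchA"
rm -f "$GIT_INDEX_FILE"
for f in a00.exe a01.exe; do
  sha=$(git rev-parse "HEAD:$f")   # blob id already stored in the database
  git update-index --add --cacheinfo 100644 "$sha" "$f"
done

# Step 3: write the tree, wrap it in a commit, and point branch A at it.
tree=$(git write-tree)
commit=$(git commit-tree "$tree" -m "branch A content")  # no -p: first commit
git update-ref refs/heads/A "$commit"
unset GIT_INDEX_FILE

# Show what branch A tracks.
git ls-tree --name-only A
```

The final command should list only a00.exe and a01.exe, while the branch that received the normal commit still holds all three files. For later updates of the same branch, the previous commit would be passed to `git commit-tree` with `-p` so the branch keeps its history.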

Here I define a number of servers that each have one or more roles ('memcache', 'database' or 'webserver'). Then I can add content to a role like this:

$ update_content.rb /path/to/content file_name memcache

That will add the content to my git db, then update the branches for the servers that are affected (that have the memcache role, in this case). I can do that for multiple files for any of the roles and git will keep track of what content each server should have. Then each server can fetch their specific branch ('server/s1', 'server/s2', etc).
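Fetching a single branch, as each server does here, can be sketched as follows. This is a minimal illustration rather than part of the example project: the local "origin" repository stands in for the real remote, and the branch names `server/s1` and `server/s2`, the file names and the directory layout are assumptions made for the demo.

```shell
#!/bin/sh
# Sketch: a client fetches only its own branch from a remote.
set -e
work=$(mktemp -d)

# A stand-in "remote" with two content branches (hypothetical names).
git init -q "$work/origin"
cd "$work/origin"
git config user.email "demo@example.com"
git config user.name "Demo"
echo "cache config" > memcache.conf
git add memcache.conf
git commit -q -m "content for s1"
git branch server/s1
echo "web config" > web.conf
git add web.conf
git commit -q -m "content for s2"
git branch server/s2

# The client for s1 names the one branch it wants in the fetch refspec.
git init -q "$work/s1"
cd "$work/s1"
git remote add origin "$work/origin"
git fetch -q origin server/s1:refs/remotes/origin/server/s1

# Only the requested branch shows up as a remote-tracking branch.
git branch -r
```

A fresh clone can be restricted in a similar way with `git clone --single-branch --branch server/s1 <url>`.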

I'm thinking of doing a quick screencast demonstrating this soon - hope the example script is helpful. It should be pretty easy to run and figure out what's going on. In the same project there is a 'list' script that lists out what content is on which server branch.

Scott Chacon
Thanks for such helpful information. In the Ruby script, where does the mode 100644 come from in the command 'git update-index --add --cacheinfo 100644 (sha1) (path)'? Thanks!
Scud
And in line 43: pcommit = (prev_commit != "servers/#{server}") ? "-p #{prev_commit}" : '' the prev_commit is a sha1, whereas servers/#{server} is a string like "master", so they will never be equal. Is it supposed to be this way? Thanks.
Scud
When you add content to the index (with update-index) you have to give the file a mode: either 100644, which is a normal file, or you can make the file executable (100755) or a symlink (120000). It's just a unix file mode. As for prev_commit: it will equal a sha1 value if there is a previous commit, but if there is no previous commit (if servers/{server} is not yet set to anything), the rev-parse command will return the string you gave it, since it can't figure out a sha for it. That's how I can tell that there is no first commit yet. Hope that helps.
Scott Chacon
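The rev-parse behaviour described in the last comment can be checked with a small shell sketch. It is an illustration only: the empty throwaway repository and the branch name `servers/s1` are assumptions, and the `--verify --quiet` variant is an alternative way to detect a missing branch, not what the Ruby script does.

```shell
#!/bin/sh
# Sketch: detect whether a branch already has a commit before choosing -p.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

branch="servers/s1"   # hypothetical branch that does not exist yet

# Plain rev-parse echoes the unresolved name on stdout and exits non-zero,
# which is the behaviour the comparison in the script relies on.
echoed=$(git rev-parse "$branch" 2>/dev/null)
echo "plain rev-parse printed: $echoed"

# Alternative: --verify --quiet prints nothing and fails when the ref is
# missing, so the exit status can be tested directly.
if parent=$(git rev-parse --verify --quiet "refs/heads/$branch"); then
  parent_arg="-p $parent"
else
  parent_arg=""       # first commit on this branch: no parent
fi
echo "parent argument: '$parent_arg'"
```

Testing the exit status avoids comparing strings, at the cost of departing from how the linked script is written.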