I've been developing websites for a few years now, and I've never had the time or energy to learn about version control. Now, as I start one of the bigger projects I've ever developed, I'm thinking of finally taking that plunge and using this as an opportunity to learn about version control.

I've read a couple of brief descriptions, but I'm still having some trouble grasping the concepts of centralized versus decentralized version control. What are the differences? Advantages/disadvantages?

I'm developing websites on OS X. For the last couple of years I've used a program called Coda to edit my HTML/PHP/CSS/JS and easily upload it to my server with a simple Cmd + S. I've always kept a 'dev' directory for development, and a 'live' directory for production. Rolling out fixes and new features has always been as easy as updating the 'live' directory with my latest changes in 'dev'. With this project, though, I'm expecting to hire some outside designers/developers for specific aspects of the site, which is where I think SCM comes in. Also, for the first time, I'll need a beta version of the site for users to test out new features and provide feedback.

As I understand it, each time I want to make a change, I'll have to fork (?) my own working copy. I don't have my working computer set up as a development server (no MySQL, PHP). How would I use version control using a remote server as the development server? Do I need working directories for each of my developers? How do you use version control in conjunction with MySQL or other databases?

Also, I'm on a shared hosting server, so I'll be using a hosted version control system like Beanstalk or Github.

I'm looking for an entire workflow here, it seems like. What do you do?

I know this is a huge question, and I really appreciate everyone's input.

+3  A: 

Version Control with Subversion is a free online ebook, and while it targets Subversion (which works well), the principles apply to all source control systems. The chapter on Branching and Merging clearly explains how to go about it.

There is a huge TFS whitepaper that covers just about every scenario you are likely to come across, Microsoft Team Foundation Server Branching Guidance, but I recommend starting with the red bean book above.

Mitch Wheat
+5  A: 

It's not as bad as you think.

You'll learn to love it, because you'll never lose code, you'll be able to keep track of history, and you'll be able to roll back to any version you please.

each time I want to make a change, I'll have to fork (?) my own working copy

This is not true. You can keep committing changes from your working copy to the trunk until you're ready to release. At that point you tag and label that version (e.g., "maj.min.svn.build", where maj is the major release number, min is the minor release number, svn is the Subversion revision number, and build is the automated build number) and merrily go on your way revising the trunk for your next release.
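To make the label format concrete, here is a minimal sketch in Python. The function name and the sample numbers are made up for illustration; only the "maj.min.svn.build" layout comes from the description above.

```python
def release_label(major: int, minor: int, svn_revision: int, build: int) -> str:
    """Join the four numbers into a 'maj.min.svn.build' label."""
    parts = (major, minor, svn_revision, build)
    if any(part < 0 for part in parts):
        raise ValueError("version components must be non-negative")
    return ".".join(str(part) for part in parts)


# Hypothetical release: version 2.1, cut from Subversion revision 1487, build 36.
print(release_label(2, 1, 1487, 36))  # prints 2.1.1487.36
```

Including the revision number in the label means you can always find the exact state of the trunk that a given release was built from.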

You only need to fork ("create a branch") if you're doing parallel development.

Try Subversion. It's pretty good, even if Linus Torvalds hates it. The "SVN red bean" book is what you should Google and read. Or the Pragmatic Programmer book on version control.

I run CollabNet Subversion on my home machine and routinely check everything into it. It's a great way to practice. The fact that my stuff is safe is an added bonus.

duffymo
A: 

You may want to use Subversion. Unless you want to keep the production code in its own trunk (which is like a directory), you won't have to fork a new branch; you can just look at the differences between your current code and what is in production.

To start with, you could just install subversion on your computer and just check in changes. I tend to check in whenever I finish a feature, so that if I totally screw up when doing another change, it is quicker to just revert back to the last working version and start doing that feature again.

James Black
+5  A: 

Git is one of the easiest things out there, honestly. Subversion has a very large mindshare right now, and many of the people who have been using it have trouble learning Git (different is hard), but if you don't have experience with either, one is not harder than the other.

The basic model with git is that you do some work and you record a snapshot of your work with a description of what makes it different from the previous snapshot.

It's trivial to see the difference between any two of these snapshots or perhaps "go back in time" and look at the entire state of your project at any prior point. All of these operations are roughly instant, and none require access to any particular server.
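If it helps to see the idea in code, here is a toy in-memory model of "snapshots with descriptions", written in Python. It is only a sketch of the concept: the ToyRepo class and its method names are invented for this example, and it is not how git actually stores data.

```python
import difflib


class ToyRepo:
    """Toy model: every commit stores a full copy of every file."""

    def __init__(self):
        self.snapshots = []  # list of (message, {path: text}) tuples

    def commit(self, files, message):
        # Copy the dict so later edits to `files` cannot alter history.
        self.snapshots.append((message, dict(files)))
        return len(self.snapshots) - 1  # the snapshot id

    def checkout(self, snapshot_id):
        # "Go back in time": return the files exactly as they were.
        return dict(self.snapshots[snapshot_id][1])

    def diff(self, old_id, new_id, path):
        # Line-by-line difference of one file between two snapshots.
        old = self.snapshots[old_id][1].get(path, "").splitlines()
        new = self.snapshots[new_id][1].get(path, "").splitlines()
        return list(difflib.unified_diff(
            old, new, f"{path}@{old_id}", f"{path}@{new_id}", lineterm=""))


repo = ToyRepo()
files = {"style.css": "body { color: black; }"}
first = repo.commit(files, "Initial stylesheet")

files["style.css"] = "body { color: navy; }"
second = repo.commit(files, "Switch body text to navy")

print("\n".join(repo.diff(first, second, "style.css")))
print(repo.checkout(first)["style.css"])  # the original text is still there
```

Everything here happens in local memory, which mirrors the point above: looking at history or comparing snapshots needs no server.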

Being instant means that you gain a new freedom of experimentation. You will never fear doing some wild and crazy experiment that involves things like removing all of the css files and starting fresh. If it isn't working out quickly, you just toss the work away and go back. But being able to even try this will get you really far.

I like to describe this to newcomers as a well-managed undo coupled with a really awesome backup system. When you push your changes to another repository (e.g. GitHub), you effectively have two copies of every state in your project. It quickly becomes impossible to lose work.

I'd like to emphasize that last point: if you work on one computer and push your snapshots to GitHub, the only way you can lose data is if GitHub is unavailable (or has somehow lost your data) and your computer breaks at the same time. If you work on two computers, three systems have to break. If you also use Git to deploy your tree somewhere, four systems have to break.

Dustin
+1 for this. Your answer reminds me that I still have to grok Git.
duffymo
+2  A: 

A few scenarios to illustrate how versioning can help:

You're collaborating with developer Joe on a large project. Most of the code depends on one class. The two of you decide to work on seemingly independent features, but find that the class requires more member functions. Without versioning, you and Joe have to get together and decide how to refactor that class to address both your needs, and one of you finally has to write it while the other waits. With versioning, both of you can simultaneously modify the class according to your needs in separate branches and merge them; you'd only have something to discuss if there's a conflict.

You're working on a small project independently. You've developed a stable version and released it. Now you want to experiment a bit and refactor a lot of things. Somewhere down the line, you realize you've taken a wrong turn and wish you could go back a few days (or even a few months, for that matter). Versioning history makes this possible.

A team has two lead developers and ten codemonkeys. The lead developers design most of the project's architecture and assign small doable tasks to the junior programmers. They want to periodically review the junior programmers' code. The juniors can periodically push to their respective branches, which the leads will pull and review. The juniors might require a huge number of commits to complete their task. Instead of including all the trivial screw-ups in the history of the project, the leads can choose to squash several commits together and then integrate the branch into the mainline. If the leads find several bad lines of code, they can blame someone for it, because a versioning system always keeps track of who committed what.

A massive open-source project needs to manage a progress tracker, a bug tracker, mailing lists, a community wiki, discussion forums, and of course their versioned code. Launchpad is a fantastic platform that tightly integrates these things together: a commit corresponds to a bug fix, and a release can be made simply by tagging. Because the public is not given write access to the main repository, the administrators can accept patches on the mailing list: a versioning system can easily diff the main project's code against the user's code to produce a patch containing the changes.
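For a feel of what such a patch looks like, here is a small sketch using Python's standard difflib module. The file name greet.php and both versions of its contents are invented for the example; a real versioning system produces the same kind of unified diff with its own diff command.

```python
import difflib

# Hypothetical file as it exists in the main repository.
main_version = '''<?php
function greet($name) {
    echo "Hello " . $name;
}
'''.splitlines(keepends=True)

# The same file after a contributor's change.
contributor_version = '''<?php
function greet($name) {
    echo "Hello, " . htmlspecialchars($name) . "!";
}
'''.splitlines(keepends=True)

# unified_diff yields the patch line by line; join it into one string.
patch = "".join(difflib.unified_diff(
    main_version, contributor_version,
    fromfile="a/greet.php", tofile="b/greet.php"))
print(patch)
```

Lines starting with "-" are removed from the main version and lines starting with "+" are added, so a maintainer can review exactly what changed before applying it.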

As I understand it, each time I want to make a change, I'll have to fork (?) my own working copy.

No. You just have to commit.

I don't have my working computer set up as a development server (no MySQL, PHP). How would I use version control using a remote server as the development server?

Doesn't matter. Keep two copies of the repository- one on the development server and one on your machine. Keep them in sync by pushing your changes to the development server's repository.

Do I need working directories for each of my developers?

No, you just need separate branches.

How do you use version control in conjunction with MySQL or other databases?

Versioning has nothing to do with databases. You can back up your databases separately.

I'm looking for an entire workflow here, it seems like. What do you do?

The details depend on the size of your team, your project requirements, etc. Typically, there's a development branch, a production branch, and several branches for different developers. The developers create their own branches to write independent features, and merge them back. The production branch pulls from the development branch and has tags for various releases. Usually, there are two deployments as well: the development one, which everyone is constantly testing, and the production one for the general public.
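As a rough sketch of one pass through that workflow, the Python snippet below only builds and prints the git commands; it does not run anything. The branch names ("development", "production", "beta-signup") and the tag "v1.0" are placeholders I made up, so adapt them to your own setup.

```python
import shlex


def feature_workflow(feature, release_tag, dev="development", prod="production"):
    """Return the git commands for one feature-to-release cycle (nothing is executed)."""
    return [
        ["git", "checkout", "-b", feature, dev],            # branch off development
        ["git", "commit", "-a", "-m", f"Implement {feature}"],
        ["git", "checkout", dev],                           # back to development
        ["git", "merge", "--no-ff", feature],               # merge the feature in
        ["git", "checkout", prod],
        ["git", "merge", dev],                              # production pulls from development
        ["git", "tag", "-a", release_tag, "-m", f"Release {release_tag}"],
    ]


# Print the plan as shell-ready command lines.
for command in feature_workflow("beta-signup", "v1.0"):
    print(shlex.join(command))
```

In practice you would commit many times on the feature branch before merging, and a team may prefer a different merge style; this is just the skeleton of the branch, merge, and tag steps.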

Note: I've used distributed versioning system terminology here, but the same questions have equally good answers in the centralized versioning system case.

Ramkumar Ramachandra