I've never worked on a professional project with a team, as I'm still in high school. As a consequence, I've never been exposed to this whole "versioning" and "source control" thing. Are they the same? How exactly does a program that manages code manage code? I've heard you have to check out code (copy the existing code?) and merge it back in (what happens if someone changes code that you didn't change and you change something else and merge it in? Surely, his code is not replaced by your older version.) And, finally, what is the best/easiest example of this type of software?

+7  A: 

I believe this article explains the answers to most of your questions.

YonahW
+1  A: 

Here are some answers to your questions:

Source control is a place where everyone shares their code. When you want to make a change, you "check out" the code and modify it. While you have it checked out, no one else can make changes to it (thus preventing you from modifying an older version of the file). When you are done, you check the file back into the source control, creating a new version (this is the versioning portion of source control). At any point in time, you can resynchronize your local copy of the code base with the latest versions of all the files, keeping you up to date.
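As a rough illustration of that lock-style workflow, here is a minimal Python sketch. The `LockingRepo` class, the file name, and the user names are all made up for this example; it models the idea only and is not how any particular tool is implemented.

```python
class LockingRepo:
    """Toy model of lock-based (check-out / check-in) source control."""

    def __init__(self):
        self.versions = {}  # file name -> list of contents, oldest first
        self.locks = {}     # file name -> user who currently has it checked out

    def add(self, name, content):
        # The first version of a new file.
        self.versions[name] = [content]

    def check_out(self, name, user):
        holder = self.locks.get(name)
        if holder is not None and holder != user:
            raise RuntimeError(f"{name} is already checked out by {holder}")
        self.locks[name] = user
        return self.versions[name][-1]  # always hand out the latest version

    def check_in(self, name, user, new_content):
        if self.locks.get(name) != user:
            raise RuntimeError(f"{user} has not checked out {name}")
        self.versions[name].append(new_content)  # old versions are kept
        del self.locks[name]
        return len(self.versions[name])  # new version number, counting from 1


repo = LockingRepo()
repo.add("main.c", "int main() { return 0; }")

text = repo.check_out("main.c", "alice")
try:
    repo.check_out("main.c", "bob")  # refused while alice holds the lock
    bob_blocked = False
except RuntimeError:
    bob_blocked = True

new_version = repo.check_in("main.c", "alice", text + "\n// edited")
```

Once alice checks in, the lock is released and bob's next check-out returns her new version, while the original text is still sitting in the history.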

Some software that does this uses a change system. That means that there is a single copy of the file which, whenever it gets checked in, is saved as a sequence of changes to that file. Other systems store multiple copies of the file, and attach version information to the file.
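To make the "sequence of changes" idea concrete, here is a small sketch using Python's standard `difflib` module. The helper names (`make_delta`, `apply_delta`, `rebuild`) and the delta format are assumptions invented for this example; real tools use their own storage formats.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Record only what changed, as (start, end, replacement_lines) tuples."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            delta.append((i1, i2, new_lines[j1:j2]))
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = list(old_lines)
    # Apply from the end of the file so earlier indices stay valid.
    for start, end, replacement in reversed(delta):
        result[start:end] = replacement
    return result


def rebuild(base, deltas, version):
    """Replay the first `version` deltas on top of the base file."""
    lines = base
    for delta in deltas[:version]:
        lines = apply_delta(lines, delta)
    return lines


v0 = ["a", "b", "c"]
v1 = ["a", "B", "c"]
v2 = ["a", "B", "c", "d"]

# Only the base file and the two deltas need to be stored.
deltas = [make_delta(v0, v1), make_delta(v1, v2)]
```

Any version can then be recovered by replaying deltas, for example `rebuild(v0, deltas, 2)` gives back the contents of `v2`.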

If you were to change an older version of the file, and then discover that you worked off an old version (the source control program should notify you when you try to check in your code), then you would have to merge your changes with the newer version. Some source control tools come with a built-in merge tool.

Elie
+3  A: 

Versioning and Source Control are the same. Well, mostly. Versioning is what Source Control does. It allows you to have multiple versions of a file. Think of it like Wikipedia's revision history. You can go back, see what was done, compare changes, and roll back to previous code.

The program normally uses some proprietary engine to keep tabs on these files (since they're just text, it normally isn't tough), and it keeps track of changes made each time the file is checked back in.

If you try to merge a file back in that's been changed, most Source control systems will show you a diff of the files and you can choose to merge it manually or automatically (depending on how powerful your Source control system is).

The easiest example is to download Subversion onto your USB drive, and install TortoiseSVN as well.

George Stocker
+9  A: 

Eric Sink has written a nice Source Control HOWTO.

Disclaimer: Eric Sink is a principal of SourceGear which develops and sells several source control-related tools (including SourceGear Vault). However, the Source Control HOWTO is a pretty neutral, balanced writeup.

Michael Burr
+5  A: 

Version control systems aren't just for large teams! The individual developer working on a hobby project has much to gain from versioning their code.

Most version control systems are centered around the concept of a repository, which is managed by the system and never touched directly. When you want to edit a file, you copy it from the repository, make your changes, then send it back. The version control system identifies the differences between your new version and the old version, then stores your new version in the repository. But, in case you screwed up, the older versions, going all the way back to the beginning, are always retrievable.

Most version control systems are smart enough to merge changes from multiple developers on the same file, as long as the changes don't affect the same part of the file. Merge conflicts can still occur and cause headaches, however.

To get started, I recommend you install something simple and free like Subversion and go through their very detailed book which has examples and explanations of the various features, most of which work comparably in other version control systems.

friedo
I bullied a good friend of mine into using SVN for his small projects and code. He's since become an SVN junkie and is slowly moving all of his code library (snippets and ideas) into SVN. It's made him a lot happier and more productive.
David
A: 

Thanks a bunch guys. Makes a ton more sense now. Which version control software is the most popular or the best in your opinion? I see you guys like Subversion. What about Git? It seems a bit more complicated.

My gratitude again.

Dove
You can install Subversion locally and use TortoiseSVN to easily do checkins/checkouts.
Todd Smith
+18  A: 

I've heard you have to check out code (copy the existing code?) and merge it back in (what happens if someone changes code that you didn't change and you change something else and merge it in? Surely, his code is not replaced by your older version.)

That is precisely the reason for version control. It wouldn't really be a very useful tool if all it did was blindly overwrite people's code, would it? ;)

Source control tools maintain a repository with the entire history of the codebase. Every change is checked in as a delta, saving just what has changed. That means that if you and I both check out version A, and I then edit file B while you edit file C, and we both check in, the source control software compares the differences. Where no conflicts exist (as in this case), it just applies both changes. If conflicts occur (if we both changed the same lines of code), it rejects the attempted check-in, tells you which two versions of the file had conflicting changes, and asks you to merge them manually. (Usually it can also highlight the changes so you can see what has been changed in each version.) So it never overwrites changes.
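Here is a simplified sketch of that merge decision in Python. It works on whole files rather than individual lines, which is a deliberate simplification, and the function name `three_way_merge` and the sample file contents are invented for illustration.

```python
def three_way_merge(base, mine, theirs):
    """Merge two sets of edits made against the same base snapshot.

    Each argument maps file name -> contents.
    Returns (merged, conflicts); conflicting files are left out of `merged`
    and listed in `conflicts` so a human can merge them by hand.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(name), mine.get(name), theirs.get(name)
        if m == t:        # both sides agree (or neither touched the file)
            chosen = m
        elif m == b:      # only they changed it
            chosen = t
        elif t == b:      # only I changed it
            chosen = m
        else:             # both changed it, in different ways
            conflicts.append(name)
            continue
        if chosen is not None:  # None means the file was deleted
            merged[name] = chosen
    return merged, conflicts


# We both start from version A of the codebase.
base = {"A": "1", "B": "2", "C": "3"}
mine = {**base, "B": "2 edited by me"}     # I edit file B
yours = {**base, "C": "3 edited by you"}   # you edit file C

merged, conflicts = three_way_merge(base, mine, yours)

# Now suppose we had both edited file B instead.
yours_clash = {**base, "B": "2 edited by you"}
_, clash_conflicts = three_way_merge(base, mine, yours_clash)
```

In the first case both edits survive and `conflicts` is empty; in the second, `B` is reported as a conflict instead of one edit silently replacing the other. Real tools do the same comparison at the level of lines or hunks within a file.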

Other tasks it'll do for you are:

  • Tag specific versions or milestones so you can easily find them again later (this is the version where we finally fixed annoying bug #2524, this is beta 1, and so on).
  • Branch the repository into two, allowing changes that may go "out of sync" temporarily, or even stay separate products forever (think of PHP simultaneously maintaining their PHP4 branch, while also working on PHP5. At some point they simply branched their codebase so while they started out identical, they can now apply patches to one without affecting the other). Of course it can also attempt to merge these branches back together (you may create a branch for each major feature in your product, perhaps, so they can be developed in isolation without being affected by the gradual changes happening to the rest of the code, and then when the feature is done, merge its branch back into the main repository).

There are two basic kinds of source control tools, the centralized ones and the distributed ones. Distributed are the big new thing, while centralized has been with us for decades.

In brief, centralized version control simply means that there is one master repository server, where all branches and the change history for each are stored. This server is responsible for merging checkins and all that. So every time a developer checks code out or commits it to the repository, this is the server he syncs up against.

Distributed ones simply ditch the "main server" aspect of this. Instead, every time you check out your code, you create a new repository locally on your own machine, at the path you check out the code to. Then you can work against this local repository, which does all the same things, tracking change history, merging changes and so on, and once you're ready, you can merge your repository into, well, any other repository. Typically, you'll probably want to merge it into some kind of "main" repo where all the code gets glued together, but you can also merge it into your co-developer's local repo, if perhaps he needs that feature you've been working on, but it's not yet stable enough to go into the main repo.

So it gives you a lot more flexibility, and allows you to maintain a change history of your local work, without risking breaking the build at the master repository. (In a centralized setup, what happens if you check in something that doesn't compile? The entire team is screwed until it's fixed. And if you don't check it in, you lose the benefits of version control, preventing you from reverting to a previous version if you find out you've been introducing new bugs.)
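A toy Python sketch of the distributed idea, where every copy is a full repository with its own history. The `Repo` class, its methods, and the commit messages are hypothetical; it only handles the simple case where one history extends the other, and a real tool would merge diverged histories instead of raising an error.

```python
import hashlib


class Repo:
    """Toy repository: a linear chain of commits, each pointing at its parent."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, message)
        self.head = None   # id of the newest commit

    def commit(self, message):
        commit_id = hashlib.sha1(f"{self.head}:{message}".encode()).hexdigest()[:8]
        self.commits[commit_id] = (self.head, message)
        self.head = commit_id
        return commit_id

    def clone(self):
        # A clone carries the complete history, not just the latest files.
        other = Repo()
        other.commits = dict(self.commits)
        other.head = self.head
        return other

    def pull(self, other):
        """Take commits from any other repository (fast-forward only)."""
        if other.head is None or other.head in self.commits:
            return  # nothing new to fetch
        if self.head is not None and self.head not in other.commits:
            raise RuntimeError("histories diverged; a real tool would merge here")
        self.commits.update(other.commits)
        self.head = other.head

    def log(self):
        messages, commit_id = [], self.head
        while commit_id is not None:
            parent, message = self.commits[commit_id]
            messages.append(message)
            commit_id = parent
        return messages[::-1]  # oldest first


main = Repo()
main.commit("initial")

mine = main.clone()
mine.commit("start feature")   # recorded locally, main is untouched
mine.commit("finish feature")
main_before = main.log()

colleague = main.clone()
colleague.pull(mine)           # share directly with a co-developer
main.pull(mine)                # and later with the "main" repo
```

The local commits have full history before anything reaches the main repository, and the colleague can pull them without the main repository being involved at all.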

And, finally, what is the best/easiest example of this type of software?

Hm, the most popular is easily SVN, which is an old-fashioned centralized system, but it's a bit of a pain to set up imo. It requires special software on the server. I personally am quite fond of Bazaar, which is distributed, and a server repository only requires FTP access and nothing else on the server.

jalf
Now this is thorough. Thanks a big ginormous bunch!
Dove
ftp access is not secure comms though... not a good thing for people who want security/privacy...
Tim
SVN can also function as svn+ssh, so no need for the webdav or svn server extensions.
David
Tim: Fair enough, it works with SFTP as well, my point was simply that you don't need anything Bazaar-specific installed on the server. Just access to the file system. David: svn+ssh requires svnserve installed though, doesn't it? Otherwise I don't see how it can function as centralized version control.
jalf
Only if we had VisualBZR for Visual Studio... 8->
Mehrdad Afshari
Thanks for the information.
Tim
I'd upvote if he had included a part on Git.
Kelly French
A: 

Answering the second question you posted:

SVN requires a centralized repository, while Git is decentralized, although the majority of the people I know who use it rely on GitHub. Git is also used for the main trunk of the Linux kernel source code, so it's been optimized a bit more for very large code bases. Git also seems to handle branch merging better than SVN did before 1.5.x.

David
A: 

Versioning doesn't need to be a complicated setup, nor something that is only suitable for working in a team environment, and it isn't only for coding projects. Anything can go into version control. It's simply a matter of not wanting to lose anything. I version control virtually every file I edit, including everything in my home/ and lots of config files in etc/ and elsewhere, and I version everything I do for work (in addition to the versioning provided for the projects). It's such a part of what I do, I think it should be provided by the filesystem.

I recommend starting simple and practicing with RCS. Although RCS sucks for teams, it is just fine for managing local user files with only a single author. Don't concern yourself with merges and branches just yet. Learn to think of saving a file as meaning saving a version. Once you're comfortable with RCS and accept the benefit of retaining changes, you can move up to learning SVN or Git or any other real system, which are far more robust and better suited for team environments.

nicerobot
+4  A: 
dbr
A: 

Also take a look at the Streamed Lines paper. It describes a number of best practices and version control anti-patterns. I'd also second the reference to Eric Sink's articles mentioned by Michael Burr.

Richard