How does version control differ from plain backups?

Let's forget about the feature decoration, and concentrate on the soul of version control. Is there a clear line that backups must cross before they can be called a VCS? Or are they, at heart, the same thing with different target markets?

If there is a fundamental difference, what is the absolute minimum requirement for something to reach the status of version control?

When you answer, please don't just list features (such as delta compression, distributed/central repositories, and concurrent access solutions) that most version control systems have or should have, unless they actually are necessary for a VCS by definition.

+8  A: 

The capability to perform branching and merging separates version control systems from plain backups. "Multiple concurrent universes".

See also Eric Sink's excellent version/source control guide.

Ash
Is this really true? Without branching, nothing can call itself a version control system, even if a very limited one?
Some revision control systems, like those in wikis or Google Docs, don't support branching and merging; while it's very useful, especially in the realm of software development, I don't think it's a necessary feature of VCS by definition.
Miles
Branching and merging can be obtained on backups with copy, diff, and a suitable naming convention.
Peter Boughton
@viipaloitsija, it's just one of the first things I look for when evaluating version control systems. You all make good points, but I find it an indispensable function of a VCS these days.
Ash
@Ash, there's an itchy line between indispensable and fundamental. (I don't know the answer to this question, so branching/merging might be fundamental too.)
A: 

I would consider backups a very basic version control system. I would not recommend them for lots of things, but if I have a small script on my home computer, I use dated backups instead of a full-featured VCS. This is OK because I am the only one who changes the file, so I do not need to worry about conflicts or who made a change.

gpojd
+1  A: 

Version control keeps a history of changes, and most version control systems store only the differences between versions rather than a full copy each time. Backups store everything, and they keep no history unless you manage it manually, so backups are usually inefficient. However, version control systems are not very efficient with binaries.
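
As a rough illustration of "storing only the difference between two versions", here is a minimal sketch using Python's standard difflib module. The helper names make_delta and apply_delta are made up for this example, and real version control systems use their own, more compact delta formats.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Return only the changed regions needed to turn `old` into `new`.

    Each entry is (start, end, replacement_lines): replace old[start:end]
    with replacement_lines. Unchanged lines are not stored at all.
    """
    ops = SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes()
    return [(i1, i2, new[j1:j2]) for tag, i1, i2, j1, j2 in ops if tag != "equal"]


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus its delta."""
    result = list(old)
    # Apply from the end so that earlier indices stay valid.
    for start, end, replacement in reversed(delta):
        result[start:end] = replacement
    return result


v1 = ["a\n", "b\n", "c\n"]
v2 = ["a\n", "B\n", "c\n", "d\n"]
delta = make_delta(v1, v2)
restored = apply_delta(v1, delta)
```

Here restored equals v2, while delta holds only the two lines that actually changed. A plain full-copy backup would store all four lines of v2 again.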

Malfist
A: 

Version control is basically an automated backup system that allows multiple users to contribute. There are absolutely more features involved with software like CVS, but yeah, it's a backup system under the hood. That doesn't mean you should back up manually instead of using version control, though; they're just in the same niche of computing.

+3  A: 

Here are a few:

  • A backup system keeps backups for a limited range of time back. Version control typically keeps all the versions ever made.
  • Version control typically concentrates on versioning of text files.
  • In a version control system you typically get immediate access to any version of any file. With a backup system, it may take some time to get access to what you want to find.
shoosh
+1  A: 

There's certainly a grey area between them; however, I would define them as follows:

Version control is triggered by a 'write' action, whereas backup is generally triggered by a time interval.

Backup software can be configured to run every second, and not store data if no changes have occurred, but that's not enough for it to be considered version control in my eyes, since it is possible for a file to change twice in a second.
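
To make the "change twice in a second" point concrete, here is a small simulation. It is only a sketch: the function names and the sample events are invented for illustration, and timestamps are plain integers in milliseconds so nothing depends on a real clock.

```python
def write_triggered(events):
    """Version-control style: record one version per write."""
    return [content for _, content in events]


def interval_triggered(events, interval_ms, end_ms):
    """Backup style: every interval, store the current content if it changed."""
    snapshots = []
    t = interval_ms
    while t <= end_ms:
        written = [content for ts, content in events if ts <= t]
        if written and (not snapshots or snapshots[-1] != written[-1]):
            snapshots.append(written[-1])
        t += interval_ms
    return snapshots


# (timestamp in ms, file content) -- two writes land inside the first second
events = [(100, "draft"), (400, "typo fix"), (1500, "final")]

all_versions = write_triggered(events)
backups = interval_triggered(events, interval_ms=1000, end_ms=2000)
```

With these sample events the write-triggered history keeps all three states, while the once-per-second backup keeps only "typo fix" and "final". The "draft" state was overwritten before the first backup ran, so it is gone.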

Peter Boughton
A: 

The bare minimum a backup system needs in order to be used for version control is incremental backup and restore for discrete backup items. Additional features (collaboration, branching, diff comparison) may make it a better VCS, but as long as you can reliably retrieve and roll back to incrementally different versions of an item you have backed up, you can use it as a "VCS". So, I suppose the fundamental difference between a backup system and a version control system is what you are using the system to do, particularly given that you could, if you desired, use your VCS as your backup system.

cmsjr
+23  A: 

The fundamental idea of version control is to manage multiple revisions of the same unit of information. The idea of backup is to copy the latest version of information to a safe place - older versions can be overwritten.

Joonas Pulakka
To me, this sounds like a defining fundamental difference. I never really delete backups, so I didn't even think of it this way. Any objections from anyone more sophisticated than me for accepting this answer?
Agreed. In my view, this is the key differentiator.
Karim
Agreed, probably more fundamental than branching (but not much ;) Great question, by the way.
Ash
very nice explanation -- short and to the point!
Nik Reiman
And the backup of the latest version of the multiple revisions is very valuable! Nice answer.
Jonathan Leffler
I actually disagree with this definition. You confine the use of backups to one specific problem, i.e. recovery after some kind of data loss. Backups can be made for various reasons, and preserving the history of something is one of them. In that case, you cannot overwrite older versions.
@unwesen, do you feel that "preserving the history of something" does not belong to the domain of version control?
-1 Good backup systems will store multiple versions. This is important because data loss is often not noticed immediately: by the time you notice that you damaged an important file, the most recent backup might well already contain the damage.
oefe
Sure, my definition is not perfect, but I think it's pretty good. The main purpose of backup is to prevent data loss, but the purpose of version control is something quite else. Backup systems usually don't store the entire history even though they may store a few versions.
Joonas Pulakka
+2  A: 

In my opinion, here are some minimum features of a VCS which may not be in a basic backup:

  1. A VCS should store more than one version (whereas a backup might store only the last known-good version)

  2. Because of 1., each version should be identified somehow (date, tag, version ID)

  3. A VCS can typically support more than one concurrent user

  4. A VCS for source code will normally have support for branching, merging, adding comments, and viewing deltas

ChrisW
A: 

Perhaps they are fundamentally the same, until you add the word "good".

  • "good" VCS is very fast.
  • "good" VCS allows multiple sources of changes (multiple users).
  • "good" VCS allows merging.
  • "good" VCS has metadata, like user-provided descriptions

  • "good" backups are distributed geographically

  • "good" backups work automatically.
Jay Bazuzi
A: 

I think you can form arguments for either lumping backups together with VCS, or for treating them as entirely separate. But I think you can't avoid talking about individual features of a VCS, as it's the features that differentiate a VCS from a backup solution:

  • Keeping track of who made what change.
  • Attaching a note to each change to explain the reasons behind the change.
  • (Mostly) concurrent access by several users, possibly from very different locations.

In my eyes, these features are defining. If you ignore them, a VCS is essentially the same as an incremental backup solution.

If you look at a distributed VCS, you might find a stronger notion of keeping track of branches than in a non-distributed VCS. That is, there may not be a single head/trunk branch, but several at any given time. That's something no backup solution I've come across considers.

+2  A: 

Version control is collaborative; backup is just a snapshot.

Example: With version control two people can edit the same file concurrently, and the system is smart enough to merge the changes together. With backup, which version of the file would "win?" Backup never "merges" two different backups into one "true" backup.
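
A heavily simplified sketch of that kind of merge is below. It is a line-by-line three-way merge that assumes both people only edited lines in place (no lines added or removed), which is far more limited than what real tools do; the function name merge3 is just for this example.

```python
def merge3(base, ours, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: all three versions have the same number of
    lines, i.e. edits only changed lines in place.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree (or neither changed the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: a real conflict
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged


base = ["title", "intro", "summary"]
ours = ["Title", "intro", "summary"]      # first person edits line 1
theirs = ["title", "intro", "Summary"]    # second person edits line 3
merged = merge3(base, ours, theirs)
```

The key ingredient is the common ancestor (base). A backup holding just the two edited files has no way to tell which side changed which line, so the best it can do is let one copy "win".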

Jason Cohen
+4  A: 

Version control represents the whole history of changes; backups try to make sure you don't lose it.

Darius Bacon
A: 

They are totally unrelated things. Think of version control as a kind of "time machine", which you can use to go back and forth in time with your code.

Marc
I wouldn't say totally unrelated. Having a "Copy of Thesis.doc" and "Copy (2) of Thesis.doc", in my opinion, blurs the line somewhat. So while they may not be the same thing, they seem to be related deep down.
+2  A: 

I see several fundamental differences between backups and version control:

  1. Backups only store the latest version, or, even if they store multiple versions, they don't store every version. A VCS does store every version,
  2. That backup version is often out of date, because backups don't record every change, while VCSs do,
  3. VCSs allow you to pursue multiple alternative versions of the same change at the same time (i.e. branching).

However, the single most important difference between backups and VCS is that, in a VCS, changes have meaning. In a backup, a new version is made because some computer somewhere decided that it was x hours since the last backup; the change itself is completely meaningless. In a VCS, a new version is made because some human decided that this version has its own meaning, its own identity, different from all the other versions. So, in a backup, all versions are equal (more precisely: they are equally meaningless), whereas in a VCS all versions are special (they have their own unique meanings). In a VCS, changes have an actual history, where one event led to another; in a backup there's just a string of unrelated events.

Closely related to this is the notion of change metadata. In a VCS, every change has an author, a timestamp and, most importantly, a commit message. This commit message records why the change was made; in other words, it records the "meaning" I wrote about in the previous paragraph.

The commit history and especially the commit messages are the most important data in a VCS repository, not the actual code itself! This metadata is completely absent in a backup.
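
As a minimal sketch of what that metadata might look like, here is a toy commit record. The class and field names are assumptions made for this example and are not taken from any particular VCS; the authors and messages are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Commit:
    content: str                     # the snapshot itself
    author: str                      # who made the change
    message: str                     # why the change was made
    timestamp: datetime              # when it was made
    parent: "Commit | None" = None   # what it was based on


def log(head):
    """Walk from the newest commit back to the oldest."""
    lines = []
    while head is not None:
        lines.append(f"{head.author}: {head.message}")
        head = head.parent
    return lines


first = Commit("x = 1", "alice", "Initial version",
               datetime(2009, 1, 1, tzinfo=timezone.utc))
second = Commit("x = 2", "bob", "Fix off-by-one in x",
                datetime(2009, 1, 2, tzinfo=timezone.utc), parent=first)
history = log(second)
```

The parent link is what turns a pile of snapshots into a history where one event led to another. A dated backup copy would carry at most the timestamp.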

Jörg W Mittag
A: 

At its most basic level, there is no difference between version control and backups. A version control system is an incremental backup of every change that is made. A basic, non-distributed VCS like CVS, used by one developer, will simply create a backup of every change that is made to a text file.

Where version control moves beyond basic backups is in the additional tools that are provided to compare versions, merge changes made by multiple developers, tag versions for release or testing and conduct other operations that make managing these separate versions possible.

A: 

The basic difference that stands out to me is that version control allows multiple users to easily work on the same code. Backups do not.

Chance