What arguments can be used against using zip files of source code as a form of version control?

In general each developer is working on their own program and has a responsibility for it. But there are times of course when other developers are involved in work on that program.

Each developer has their own naming convention for zip files, ranging from appending the date or a number to the program name to appending _old, _oldold, _newversion and so on. When there is collaboration on some code, it has to be checked who has the ‘latest’ version of the code and where it resides; usually the correct version is identified.

There is no easy existing method to diff source trees, and during development unwanted changes occasionally slip into the code.

The zip files corresponding to software releases that have been released to manufacturing (RTM) are archived. This at least adds some traceability.

Also, before RTM the code is peer reviewed against the previously released version, so quality assurance does exist.

Are there any formal white papers explaining the advantages of source control, making clear that the above isn’t a fully valid form of source control? The argument made here is that since the end product (the manufacturing releases) is under control and reviewed, there is no problem with the process. Developers do not have too much of a problem working with zip files in this way, but may not be aware of the advantages.

+15  A: 

The best argument is surely that using a version control system like Subversion or Mercurial is much, much easier and more secure than messing about with zip files. I doubt there has been much paper writing on the subject, as the use of zip files for this purpose is fairly obviously wrong.

There are a number of SO questions on the general advantages of version control. For example http://stackoverflow.com/questions/1124363/how-can-i-convince-my-department-to-implement-a-version-control-system and http://stackoverflow.com/questions/250984/do-i-really-need-version-control

anon
I don't get the more secure argument. Why is storing information in a ZIP file any less secure? I also think that the assertion that the use of zip files is obviously wrong is a bit extreme. Technically the zip solution (if implemented carefully) is just an extremely cumbersome and feature poor implementation of a SCM.
JohnFx
Version control systems support user access control; zip files generally don't. I agree about the feature-poor and cumbersome systems (particularly if you add "unreliable"), which is why I specifically mentioned two that are not.
anon
Another good point. As for user access control, I'd lump that in with the unsupported features for which you'd have to make other accommodations if you went with the ZIP approach. Hence the "feature poor" comment.
JohnFx
Secure might also mean that with ZIP you can physically (accidentally) delete a file from it, while with a source control system you must go to great lengths to permanently delete something.
Florin Sabau
+15  A: 
  • Creating and managing zip files is error-prone.
  • Real source control gives you tools to understand your code:
    • History browsing
    • Diffs between revisions
    • Annotation of source files to track the origin of a change
  • Real source control isn't difficult, and there's lots of help out there.
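
To make the "diffs between revisions" point concrete, here is a rough sketch of what you would have to write yourself to compare two zip snapshots. It uses only Python's standard `zipfile` and `difflib` modules; the function names `diff_zip_snapshots` and `_read_lines` are made up for this illustration, and it assumes the archives hold UTF-8 text files. A real version control system gives you this, plus history, for free.

```python
import difflib
import zipfile


def diff_zip_snapshots(old_zip, new_zip):
    """Return unified-diff lines between two zip snapshots.

    old_zip / new_zip may be paths or file-like objects (illustrative inputs).
    Files present in only one archive show up as fully added or removed.
    """
    with zipfile.ZipFile(old_zip) as old, zipfile.ZipFile(new_zip) as new:
        # Skip directory entries, which end with "/".
        old_names = {n for n in old.namelist() if not n.endswith("/")}
        new_names = {n for n in new.namelist() if not n.endswith("/")}
        report = []
        for name in sorted(old_names | new_names):
            old_text = _read_lines(old, name) if name in old_names else []
            new_text = _read_lines(new, name) if name in new_names else []
            report.extend(
                difflib.unified_diff(
                    old_text,
                    new_text,
                    fromfile=f"old/{name}",
                    tofile=f"new/{name}",
                    lineterm="",
                )
            )
        return report


def _read_lines(archive, name):
    # Undecodable bytes are replaced rather than raising, so binary
    # files give noisy but harmless output.
    return archive.read(name).decode("utf-8", errors="replace").splitlines()
```

Calling `diff_zip_snapshots("prog_old.zip", "prog_newversion.zip")` (file names invented here) returns a list of diff lines you can print. Note that it says nothing about who made a change or why, which is exactly what annotate and history browsing add.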
Ned Batchelder
Don't forget `blame` :)
voyager
That's "Annotation of source files to track the origin of a change". I love that svn has three aliases for that command: annotate, blame, praise!
Ned Batchelder
Don't forget `bisect` to find errors, and **merging**.
Jakub Narębski
@Ned: you are right :)
voyager
+1  A: 

It's not good, as only creating a zip before a release means losing a lot of the power you get with version control.

Usually you should check in to the repository after you have added, removed, or changed a functional aspect, so that you can go back later when an error occurs that you think might be caused by that change, or when you say "damn, this worked before the file format changed some day in March." Naming revisions after changes also makes them easier to remember, because you will have forgotten what was done on 27 March 2009.

Lothar
+6  A: 

I am assuming that this question was asked because the original poster is working in an office where the standard practice is to share zip files.

Zip files are obviously bad, for the reasons given by Ned Batchelder. The biggest reason I would suggest is that it's clunky, and difficult to merge changes, or get diffs between past revisions easily.

I would recommend you read A Visual Guide to Version Control for some good arguments about why version control systems are very useful, and a superior way of managing code.

BrianV
+1  A: 

"In general each developer is working on their own program and has a responsibility for it. But there are times of course when other developers are involved in work on that program."

In a normal development shop, this is not at all true. Different people work on the same source code all the time. XP makes it almost mandatory. Even if you separate the code into modules, there will still be interaction points with code that concerns at least two programmers.

Of course, it's almost impossible to collaborate without major problems if you don't use source control. But the scenario you describe is much more a way to adjust to this limitation than a sane project structure.

Having only a single person working on a module means that nothing will happen when that person is on vacation and you have a major problem when he leaves the company, gets sick for a long time, or dies.

Michael Borgwardt
+12  A: 

I assume you currently work at a company that practices this method of zip control, and you're looking for ammunition to help you change this practice. There are a lot of questions on StackOverflow about source control, and the community here are in near-total consensus on the benefits of proper source control and the horrors of working without it (for very good reason).

I'll add something here to benefit your battle: YOUR COMPANY IS @$#%&$#@ CRAZY!!! ZIP FILES??? ARE YOU @#$#@% KIDDING ME???

MusiGenesis
I've been involved in a number of, errr, "interesting" development setups in my many years at many places. Nothing really causes me much excitement anymore. Your second paragraph perfectly describes my reaction to zip files though... :-)
Brian Knoblauch
@Brian: someone had to say it. :)
MusiGenesis
My reaction to zip files would probably be "o.O Why?!" followed by creating a Linux VM on my dev machine and setting up an SVN server.
Adam Jaskiewicz
@Adam: that sounds like fighting fire with water. Who does that? :)
MusiGenesis
Could be worse - they could be using SourceSafe.
soru
+1  A: 

How do you do a merge? How do you do an annotate? How do you bisect? Where are changelogs stored? Just go to Wikipedia and look up "Version control" and go down the list: zip files can kind of sort of do about 2 things out of the whole page.

This is like asking "What arguments can be used against shorthand as a form of double-entry bookkeeping?". It's a completely different thing.
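
For anyone who hasn't met bisect: it is a binary search over an ordered history for the first revision where some check starts failing. Below is a minimal sketch of the idea, not how any particular tool implements it. `first_bad_revision` and the `is_bad` callback are hypothetical names, and the sketch assumes the first revision is known good, the last is known bad, and the bug never disappears again once introduced.

```python
def first_bad_revision(revisions, is_bad):
    """Binary-search an ordered history for the first bad revision.

    Assumes revisions[0] is known good, revisions[-1] is known bad, and
    that every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the bug was introduced at mid or earlier
        else:
            lo = mid  # the bug was introduced after mid
    return revisions[hi]
```

With 100 revisions this needs at most 7 checks instead of up to 98. With zip files, each check means finding, unpacking and building the right archive by hand, assuming intermediate archives exist at all.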

These arguments are a little weak. I'm betting that the development efforts in this shop are small enough that these issues don't come up much if at all. If anything, more advanced concepts in SCM like branching are going to do more to scare them away than draw them in. Especially given that they are probably using ZIP because someone thinks it is the simplest solution that works.
JohnFx
@JohnFX: One thing about playing Devil's advocate is that you're always on the losing side.
Robert S.
JohnFx: I use bisect on solo projects that are under 1000 LOC. It's hard to imagine a company project so small it couldn't benefit from this.
+6  A: 

I suspect there'll be as many white papers comparing zip files to proper source control as there'll be white papers comparing cutting one's genitals off with a rusty butter knife with buying a puppy.

ceejayoz
made me chuckle :)
Darknight
+4  A: 

The ZIP solution requires a pro-active step at the end of the development cycle, when things tend to get dropped because no one outside the dev group notices when it doesn't happen. Sort of like that final code cleanup you always plan on doing when things slow down.

An SCM integrated into the dev environment pretty much enforces/encourages keeping a version history with a small amount of effort all the way through the process. This makes it more likely that a version history will actually be created.

On Using ZIP as a SCM
I'm not going to take as hard of a line as some of the others on the ZIP file solution. It is at least better than nothing. It is a perfectly valid way of keeping version histories, it is just a lot more labor intensive, error prone, and lacks a lot of useful features.

Know who you are selling to

Someone in the Dev Group: Focus your arguments on features like ease of troubleshooting by using change histories, safety to experiment with big code changes (because of rollback), and avoiding accidents where work is overwritten by other developers.

Non-Tech Managers/Bean-counters: There are free/low-cost tools that will reduce the labor cost of version control and give greater accountability/transparency into what each developer is doing and into the source of coding mistakes.

JohnFx
+5  A: 

These people already know all the arguments for SCM; there is nothing anyone can say to them that will sell them on it. These things must happen:

  1. You install SCM on your local machine and use it. If you must, have it autogenerate these .zip files at every build, so no one outside your cube knows the difference.

  2. Some kind of disaster occurs, such as loss of work, a show-stopper bug being re-introduced, or some other worst-case scenario that is the real reason we all use SCM (the other features we learn to appreciate later).

  3. You are unaffected by the disaster, and/or use your personal copy of the code in SCM to fix the problem/recover the lost work/whatever.

  4. You are a hero and everyone wants to know how you did it.

Only by experiencing firsthand the pain of loss caused by poor SCM practices will your organization realize the benefits of SCM. You're smart enough to learn from the mistakes of others, but not everyone is. The rest of the time, you'll just be 2/3X more productive than the rest of the team and maybe, just maybe they'll wonder how.

By the way, this is how you get agile, continuous integration, unit testing, etc into the organization: lead by example.

Chris McCall
+1  A: 
Norman Ramsey
+1  A: 

I haven't seen an answer include Eric Sink's Source Control HOWTO, but it's a valuable reference. I haven't seen any formal white papers on version control, but I'm not sure the argument about "validity" is your strongest one. The problems you describe in your question indicate some pretty serious drawbacks with the current approach. If "the powers that be" in your environment aren't convinced by that, change the argument entirely.

If you make it a question of quality control, and point to continuous integration as a practice that encourages it, then the zip file approach to version control isn't a "not fully valid form of version control", but an obstacle to implementing continuous integration as a practice.

Your question doesn't indicate whether or not the end product "under control" is tested in any automated fashion (in addition to being reviewed). If the process you describe would prevent that from taking place as well, certainly add that to your argument too.

Scott A. Lawrence
You beat me to it - this time! :-)
Jeanne Pindar
+6  A: 

Zip files work as a very basic form of version control. It's a way to separate "states" of the source. However, it's not a good form of version control because you have to do a lot of work to perform basic source control management tasks. For example:

  1. Bob's team is working on a major feature that requires changing dozens of files. He works in his own private zip-controlled area for a while. He's created 30 new files, added features to 12 existing files, and made changes to existing behavior in 3 existing files over 4 months. How do you merge Bob's work with the main trunk that has also evolved over the last 4 months? Do you hand-diff thousands of lines of code and decide how to merge them? How do you ensure that anything that uses the 15 existing files isn't broken? How do you ensure that Bob's features or main trunk features aren't accidentally omitted?
  2. Alice is investigating a bug in her code and realizes that one of Sam's classes has changed its behavior. Sam says he didn't make the change. How does Alice find when and why the change was made? How does Alice know who depends on the change?
  3. A major customer has reported a bug in an older version of the program. This customer needs a fix and is important enough to warrant a patch. How do you add the code to the old zip file in a way that it also exists in the new files? Also, how do you record that there is a relationship between the two changes?

These are just three scenarios that a version control system handles well. Situation 1 is handled by development branches. Almost every version control system has a notion of branches that can be developed in parallel and merged as needed. Situation 2 is easily addressed by any source control system with a "blame" feature and less easily addressed by just searching commit logs. Situation 3 is a variant of situation 1, but when you merge branches most version control systems make a note. For example, you'd make a branch off of the old version, fix the bug, then merge that branch into the new code. Now when someone asks "Where did this change come from?" they see it was merged from the patch branch and the change was made to fix a bug.

By the way, I've been in each of these 3 situations and used both SVN and Perforce; both made finding a solution very easy.

OwenP
+2  A: 

I wrote a Version Control tool long ago for a company who did the authoring for DVD titles. Before that they had nothing, just a directory full of clips, icons, scripts etc. which anyone could hack away at, and no way to backtrack if it went wrong etc. HOWEVER these people were 'artists', not programmers, so they could not (would not???!) be trained to use a decent Version Control system. So as a bare-minimum, get-out-of-the-mud level tool I wrote a utility which zipped up the current state of the directory, gave the Zip a meaningful name (date + comment supplied by user) and stuck it in a Backups directory, and also allowed you to restore one of these backups.
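
A utility of that kind takes only a few lines today. The following is a hedged reconstruction of the idea in Python, not the original tool: the function names, the `YYYYMMDD-HHMMSS_comment.zip` naming scheme and the optional `now` parameter are all assumptions made for illustration. It also assumes the backups directory lives outside the working directory, otherwise old snapshots would end up inside new ones.

```python
import re
import zipfile
from datetime import datetime
from pathlib import Path


def snapshot(work_dir, backups_dir, comment, now=None):
    """Zip the current state of work_dir into backups_dir; return the zip path."""
    work_dir, backups_dir = Path(work_dir), Path(backups_dir)
    backups_dir.mkdir(parents=True, exist_ok=True)
    # Name the archive from the date plus the user's comment.
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    slug = re.sub(r"[^A-Za-z0-9]+", "-", comment).strip("-") or "snapshot"
    target = backups_dir / f"{stamp}_{slug}.zip"
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(work_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to work_dir so restore is relocatable.
                archive.write(path, path.relative_to(work_dir).as_posix())
    return target


def restore(zip_path, dest_dir):
    """Unpack a snapshot into dest_dir, overwriting files with the same name."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
```

Note what is missing: `restore` does not remove files that were added after the snapshot, and there is no diff, no merge and no record of who changed what.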

So zips CAN provide minimum-level version control, and I speak as someone who endorsed that approach when it was right for the skill-level (in terms of programming, I don't want to imply that they couldn't manipulate pixels!) of the people using it.

But as a programmer, you should be thinking about using a tool which really helps you. As such you want to be able to compare differences for individual files, compare differences between complete milestone sets, and (if you are working on anything other than trivial programmes) handle branching and merging. If you want these features you need something BETTER than zip files.

I used to use ComponentSoftware RCS, and if it wasn't for its poor performance over a WAN we might still be using it: it is cheap (even free for single-developer use, in which form I used to use it at home) and simple to use. However nowadays I would suggest looking at Subversion. It is very flexible, reasonably simple to understand, has a good set of Windows tools to make it even easier (e.g. Tortoise, Ankh), and ... best of all ... you can get it running for free.

AAT
A: 

I think your best argument is showing a GOOD form of source control and showing how powerful it is. Don't trash what is currently being done (as someone is surely emotionally attached to that). You don't want to trash the "ZIP Source Control Method." Show the power of something like SVN. Make it very easy to explain. Show common use cases. (A solid demo would help.)

Let the source control version sell itself.

JasCav