views:

3024

answers:

18

I teach the third required intro course in a CS department. One of my homework assignments asks students to speed up code they have written for a previous assignment. Factor-of-ten speedups are routine; factors of 100 or 1000 are not unheard of. (For a factor of 1000 speedup you have to have made rookie mistakes with malloc().)

Programs are improved by a sequence of small changes. I ask students to record and describe each change and the resulting improvement.

While you're improving a program it is also possible to break it. Wouldn't it be nice to back out?

You can see where I'm going with this: my students would benefit enormously from version control. But there are some caveats:

  • Our computing environment is locked down. Anything that depends on a central repository is suspect.
  • Our students are incredibly overloaded. Not just classes but jobs, sports, music, you name it. For them to use a new tool it has to be incredibly easy and have obvious benefits.
  • Our students do most work in pairs. Getting bits back and forth between accounts is problematic. Could this problem also be solved by distributed version control?
  • Complexity is the enemy. I know setting up a CVS repository is too baffling---I myself still have trouble because I only do it once a year. I'm told SVN is even harder.

Here are my comments on existing systems:

  • I think central version control (CVS or SVN) is ruled out because our students don't have the administrative privileges needed to make a repository that they can share with one other student. (We are stuck with Unix file permissions.) Also, setup on CVS or SVN is too hard.
  • darcs is way easy to set up, but it's not obvious how you share things. darcs send (to send patches by email) seems promising but it's not clear how to set it up.
  • The introductory documentation for git is not for beginners. Like CVS setup, it's something I myself have trouble with.

I'm soliciting suggestions for what source-control to use with beginning students. I suspect we can find resources to put a thin veneer over an existing system and to simplify existing documentation. We probably don't have resources to write new documentation.

So, what's really easy to set up, and makes it easy to commit, revert, and share changes with a partner, but does not have to be easy to merge or to work at scale?

A key constraint is that programming pairs have to be able to share work with each other and only each other, and pairs change every week. Our infrastructure is Linux, Solaris, and Windows with a netapp filer. I doubt my IT staff wants to create a Unix group for each pair of students. Is there an easier solution I've overlooked?

(Thanks for the accepted answer, which beats the others on account of its excellent reference to Git Magic as well as the helpful comments.)

+15  A: 

Subversion is easy to install on Windows, Linux and Mac OS X. I don't know what language they are programming in, but the Subclipse plugin for Eclipse is fairly easy to install and hides away some of the repository complexity.

And repository complexity? That's simply having a trunk, tags and branches folder within each project anyway. And they might not have much time, but they should get the time to learn SVN (or similar) because it is a skill that looks good on their CV.

JeeBee
I've had great experience using SVN, repo set up by others. I was told the repo was hard to set up. Maybe that info is obsolete? The key question is permissions: each *pair* of students needs a repo that only members of the pair can read and write. Pairs change each week. Unix files. How?
Norman Ramsey
Setting up a svn repo isn't *that* hard, but changing permissions the way you describe would be.
pjmorse
+1  A: 

I would say your best bet will be to try to work with your IT department to set up a system/method for your students to easily create new SVN/CVS repositories.

You could probably get the IT department to give you the privileges necessary to create repositories for your students even if they won't give the privileges to the students themselves. You could then fairly easily write a few scripts to mass-create repositories from lists of students at the beginning of the semester.

SoapBox
+2  A: 

Subversion on Windows can be as simple as setting up TortoiseSVN. There is a bit of a learning curve for using it (especially if you've never used version control before), but you might help that by dedicating half a lesson to it and providing some PowerPoint slides for them to download.

As for centralization - I've heard of websites that offer free SVN project hosting. A quick Google search turned up this page, but there are certainly more.

Vilx-
+65  A: 

I would say something like git might fit the bill:

  • As it's a distributed system, you don't need to have a central repository, the repos exist with the source directory
  • It is easy to create patch files that can be mailed and applied.
  • Although it might seem that git is difficult to use, the basic ideas of committing, merging, adding and removing files are not that hard to learn.

Have a look at this site Git Magic or, even this tip site GitReady
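
The basic cycle mentioned above (create a repository in the source directory, commit small changes, back one out) can be sketched roughly as follows. This is only an illustration: the temporary directory, the file name `hello.c`, the commit messages, and the identity are all made up, and it assumes a reasonably recent `git` is on the PATH.

```shell
#!/bin/sh
# Sketch: a repository that lives inside the working directory, no server needed.
# Directory, file name, messages, and identity are hypothetical.
set -eu

work=$(mktemp -d)          # stand-in for a student's assignment directory
cd "$work"

git init -q .              # creates .git/ right here; no central repository
git config user.name  "Student A"
git config user.email "student-a@example.edu"

printf 'int main(void) { return 0; }\n' > hello.c
git add hello.c
git commit -q -m "Baseline: unoptimized version"

# A change that turns out to break the program.
printf 'int main(void) { return 1; }\n' > hello.c
git commit -q -a -m "Attempt: hoist allocation out of loop"

# Back out the bad change by adding a new commit that undoes it.
git revert --no-edit HEAD >/dev/null

git log --oneline          # lists the baseline, the attempt, and the revert
```

Each commit message doubles as the record of "what changed and why" that the assignment asks for, and `git revert` keeps the failed attempt in the history rather than erasing it.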

Abizern
The Git Magic site is awesome. I don't have enough rep to upvote your answer but will when rep arrives. One thing not mentioned at Git Magic is how to email patches to a programming partner. Any suggestion on where to look?
Norman Ramsey
Having used Git, I think in general it is the way to go, but for students it may be beneficial to learn SVN because that's almost certainly what they will be asked to use in the real world.
Peter Coulton
Git format-patch prepares for email submission, and git am applies those patches.
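
A rough sketch of that round trip, with two throwaway repositories standing in for the two partners. The names (`alice`, `bob`, `outbox`, `notes.txt`) are invented for the example, and the actual mailing step is left out: the generated `.patch` file is plain text that could be attached to an ordinary email.

```shell
#!/bin/sh
# Sketch: pass one commit between partners as a patch file.
# All names and paths are hypothetical.
set -eu
base=$(mktemp -d)

# Partner A creates a repository with a shared starting commit.
mkdir "$base/alice"
cd "$base/alice"
git init -q .
git config user.name "Alice"
git config user.email "alice@example.edu"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Shared starting point"

# Partner B starts from a copy of that repository.
git clone -q "$base/alice" "$base/bob"
cd "$base/bob"
git config user.name "Bob"
git config user.email "bob@example.edu"

# Alice commits a change and writes it out as a patch file.
cd "$base/alice"
echo "v2" >> notes.txt
git commit -q -a -m "Add second line"
git format-patch -1 -o "$base/outbox" >/dev/null   # one file per commit

# Bob saves the attachment and applies it as a commit of his own.
cd "$base/bob"
git am -q "$base/outbox"/*.patch
git log --oneline
```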
Abizern
I don't like arguments about "the real world." They almost never last. I've not had to use svn in "the real world" (just cvs, p4, hg and git). I wouldn't wish the constraints of svn on anyone.
Dustin
Peter: your comment implies that someone who knows git but does not know subversion will have trouble picking up subversion quickly. Do you really think that this is so? I disagree, given that the basic operations are so similar. In my real-world experience, the biggest doesn't-know-VC problem has been people who've never used it, don't understand why anybody would use it, and thus object to using it at all.
Curt Sampson
@Peter Coulton: I have never used SVN, I used CVS and now use Mercurial... don't make assumptions based on your own experience. Also, the commands do not really matter, what matters is the principles.
Matthieu M.
+7  A: 

I see no reason for dealing with setting up the source control system. Review the terms for using e.g. google code and dive in.

A fellow CS student and I used it last year and it works great and the only precondition is an internet connection :-)

Kasper
In my class some students could use up 5 of their 10 projects in just one class. I'm a bit uneasy. Remember: one project per pair; pairs change.
Norman Ramsey
+2  A: 

For real ease of use for your students, you could install an SVN server with autocommit turned on, shared using WebDAV. This way they can just mount their directory using WebDAV and will auto-commit every time they hit save. Accessing the history is easy with TortoiseSVN, the Eclipse / Visual Studio plugins, or a web access solution like ViewVC. For your access restriction needs you could use the integrated Subversion authentication (look here), which uses a simple configuration file for fine-grained access control.

Configuration has become a lot easier (and there is better documentation now; have a look at the SVN Book), but it could get a bit complicated if you need multiple separate repositories with access restrictions and a web interface.

Autocommit is more a solution for the "my office worker / boss" who has no clue what's going on inside a computer but needs version control for Word documents. Students taking a programming course should perhaps also learn how to use a decent SCM anyway.

Git and Mercurial would be nice because of their distributed nature, which makes sharing easy - but both tools lack GUI interfaces which are really easy to use (TortoiseHg looks promising, and gitk is a very good Repository browser, but your students would still have to wrap their heads around the command line tools to make full use of the tools). Also the concept of distributed SCM's is a little more complex to grasp.

On the pro side you could use public hosting solutions like GitHub and wouldn't have to worry about a server setup. This also makes sharing solutions really easy, but would break your "only with each other" requirement. But I guess you won't be able to stop them from exchanging code anyway, in my experience with course work I found looking at the code and verifying that it's unique is the only way to prevent copying.

You could also use PlasticSCM, which has really nice interfaces for a lot of IDE's and (at least the site claims) free licenses for educational institutions.

VolkA
Indeed, you're right that "Students taking a programming course should perhaps also learn how to use a decent SCM anyway." But even the office worker or boss can usually understand Subversion just fine; I've taught people that can't even use Excel in the most primitive way to use Subversion!
Curt Sampson
+2  A: 

If you are looking for something that is really, really easy to set up, then why not go for the free SVN hosting option? You don't have to set up a thing!

Sadly, the two older ones that everyone would have pointed you to, Assembla and Unfuddle, have dropped support for their free hosting (or at least if you want the repositories to be private), but you can still use Origo, which gives you both open and closed hosting.

The advantage of this is that you can own all the projects and follow them all, and easily control who has access, and you don't have to worry about rights for creating repos.

If you do go this route, and you want to eliminate complexity, then you must use a GUI SVN application to make learning near trivial (since I doubt there will be much merging going on). I would recommend TortoiseSVN, which slips right into your Windows Explorer context menu.

Stephen Bailey
+8  A: 

I would recommend Mercurial (also called 'hg'). It is a distributed open-source VCS, and needs no central repository. Using it day to day is easy. There is enough documentation on the official site. For example, check out QuickStart.

Deciding point for me was a great GUI for Windows - TortoiseHg. It seems it is also supported on Linux (didn't try myself). And of course there are command-line distributions for most Linux versions.

Of course it seems easy from this side of the fence; for busy students, the concept, the advantages, and everyday operation may not be that easy to get used to. But in the end, instant commits, the ability to revert to any revision and create a new branch from there automatically, and intelligent diff/merge are just irreplaceable.

Hope this helps!

Alexander Abramov
+34  A: 

Second the choice of Mercurial

Advantages

  • Excellent documentation
  • Graphical view command to show branching
  • TortoiseHG for Windows
  • Cross-platform
  • Built-in web server for viewing the project
  • Can keep your project on your thumbdrive
    • Work can be saved even if only one member of the pair remembered their laptop. Not that that would ever happen.

Disadvantages

  • Must install Python if not already present
    • Easy to do, but it is another step
  • Understanding the distinction between push/pull vs update/commit
    • (This is common to all distributed VCS)
  • The distinction between heads and tips
  • Some commands aren't immediately available; they must be explicitly enabled.
    • (This is arguably advantageous if you want to keep things simple)
Joel Spolsky has a very good introduction to Mercurial at http://hginit.com/top/. It is quick and easy to read and the pictures do a good job of illustrating the situation. For cooperating between pairs, there are sites offering free repositories, e.g. bitbucket.com for Mercurial.
Boris
+7  A: 

Bazaar, Mercurial, and Git sound appropriate for your case - trivial to create repositories, and all the students need to share is read access on the filesystem to each other's repositories.
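
As a rough sketch of the read-access-only workflow using Git (Bazaar and Mercurial have analogous commands), one partner clones and later pulls straight from the other's directory on the shared filesystem. The temporary directory stands in for the shared filer, and the names `alice`, `bob`, `hw3`, and `solution.txt` are invented for the example.

```shell
#!/bin/sh
# Sketch: sharing through plain filesystem read access, no server or daemon.
# All names and paths are hypothetical.
set -eu
base=$(mktemp -d)          # stand-in for the shared file server

# Alice's repository, somewhere Bob can read.
mkdir -p "$base/alice/hw3"
cd "$base/alice/hw3"
git init -q .
git config user.name "Alice"
git config user.email "alice@example.edu"
echo "first draft" > solution.txt
git add solution.txt
git commit -q -m "First draft"

# Bob only reads from Alice's directory; he never writes to it.
mkdir -p "$base/bob"
git clone -q "$base/alice/hw3" "$base/bob/hw3"

# Alice keeps working.
echo "faster version" >> solution.txt
git commit -q -a -m "Speed up inner loop"

# Bob fetches and fast-forwards to Alice's latest commit.
cd "$base/bob/hw3"
git pull -q --ff-only
git log --oneline
```

For Alice to receive Bob's changes the same way, she would pull from his directory in turn, so each partner writes only to their own repository.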

orip
Read access on the filesystem to each other's repos requires intervention from sysadmins, and on Unix a group per pair. This solution will not fly.
Norman Ramsey
Not if you don't mind everyone having read access to everyone's repository. The nice thing about these is that read access is enough to merge between their repositories.
orip
+5  A: 

I have had some very good experience with Bazaar. Like Git/Mercurial it is distributed. It is serverless - you do not need a daemon installed on the server hosting the repository, even if you are accessing it remotely (ie, it can work just as an FTP/SFTP share).

A distributed VCS is most flexible. You can check out a branch from a more traditional 'central' repository and gain all the benefit of being able to fork off your own little development separate to the central server, etc and then, perhaps, push your changes back up.

There are import tools for other VCSs such as Subversion though I haven't tried them.

thomasrutter
+3  A: 

Setting up a subversion repository is trivial; I frequently set one up as a one-off thing for small projects (such as developing code for an answer on Stack Overflow!), and I doubt anybody else who could learn an SCM system at all would have any trouble with it.

$ svnadmin create /home/cjs/repo
$ mkdir my-project
$ cd my-project
$ vi hello.c
  [...hack hack hack...]
$ svn import -m 'Initial project import.' file:///home/cjs/repo
Adding         hello.c

Committed revision 1.

That said, sharing is certainly an issue. If the students always work together when they work simultaneously, they could use a USB drive, since they can just unplug it and pass it back and forth when one needs to commit, and the person who's going to program alone later can just hang on to it. That's not entirely convenient, though.

Another option, since they all appear to be working on a shared Unix system, is to create a directory with the execute but not read bit set for the rest of the group (or all users) and use a s3cr3t name for the repo under that, one that only the two students know. Passing that secret name on to the prof would allow him to examine student's repos at any time, as well. ("So you submitted the assignment on time, but the e-mail system lost it? Let me just look at the time of that commit....") A script could help set this up.
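
A script along those lines might look roughly like the sketch below. It only sets up the directories: the path names are invented, the temporary directory stands in for a student's home directory, and where the repository itself gets created (and whether the partner gets write access to it) is left as a comment because that depends on the local setup. Note that this is security by obscurity: anyone who learns the name can get in.

```shell
#!/bin/sh
# Sketch: a traverse-only directory holding a repo under a hard-to-guess name.
# Paths are hypothetical; the mktemp directory stands in for $HOME.
set -eu
base=$(mktemp -d)

drop="$base/shared"
mkdir "$drop"
chmod 711 "$drop"          # owner: rwx; others: x only (can enter, cannot list)

# Generate a 16-hex-digit name that only the pair (and the prof) are told.
secret=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
mkdir "$drop/$secret"

# The repository would be created here, e.g. with:
#   svnadmin create "$drop/$secret/repo"
# and its permissions opened up as far as the partner needs.

ls -ld "$drop"             # mode column should read drwx--x--x
echo "Tell your partner: $drop/$secret"
```

Without the read bit on `shared`, other users cannot list its contents, so they can only reach the repository if they already know the full path.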

In fact, the more I think about that, the more I'm beginning to like it. In certain ways, it's simpler than the git solution because the student doesn't have to deal with passing patches around (or forgetting to do so) and the student will be forced to deal with merges before he commits, rather than once things are in the repository (with the subsequent ability to delay dealing with that indefinitely).

Curt Sampson
+8  A: 

I'd suggest looking at Fossil. It's a single executable with no dependencies to run, operates all traffic over HTTP, keeps all its repository data in a single file which can be named anything, and includes a version-controlled wiki, bug tracking and a web server out of the box. Oh, and it's completely distributed.

squeeks
I was also going to mention Fossil. It may not be “better” (faster, more flexible, etc.) than other DVCS systems, but it certainly seems to come very close to its goal of being simple to use. The integrated wiki and bug-tracking systems might even come in handy for documenting the students' progress through their projects.
Chris Johnsen
Fossil is awesome.
Amigable Clark Kant
+1  A: 

I've used CVS, SVN, Bazaar and Git (in that order of introduction) and I'd have to say that for students SVN is the way to go. In fact, while I was lead TA we implemented SVN as a replacement for the old "submit script", which was a tar-and-email script. Lab staff set up an Apache SVN-DAV based repository, and using the authz file the TAs and instructor could control permissions for per-student directories and group projects at a very fine-grained level, leaving students with a very simple path to their first commit. See my tutorial (credentials ripped out by the most recent TAs.. hmm..)

Regarding the use of subversion without intervention by sys admins, I've done this as well in a group-project setting where none of my group members had ever used subversion before and most of them were committing with very little confusion (all but one). I also wrote a tutorial for setting up such a secure shared repository with only basic SSH access here.

I definitely disagree that git is the best VCS for beginners having experienced the blank looks enough at the mention of any VCS system, let alone the mac-daddy-written-by-Linus-himself VCS king, git. It is simply not true that git is no more complex than svn, and the lack of mature n00b tools alone is enough of a reason not to use it in this scenario. I just started using git for a new project that I am developing in Netbeans and already ran into serious limitations with the Netbeans integration. In a single semester you are not going to use any functionality that svn doesn't provide so git is overkill.

ColinM
A: 

How about good old RCS?

lhf
Luiz, RCS may be old, but there is nothing good about it. Even CVS is a major upgrade over RCS...
Norman Ramsey
Ok, good may be too much but I find RCS to be a small simple package that is easy to install and use, isn't it?
lhf
A: 

darcs send is trivial to set up: when you run darcs send <remote repo>, it looks in _darcs/prefs/email of the remote repo to decide where to send the email. If there's nothing there, it prompts the user instead.

The receiver of the patch just saves the file and runs darcs apply <patch file> in the appropriate repo.

So each student can just have their own repos with their own email address in _darcs/prefs/email and exchange patches by email.

Ganesh Sittampalam
I agree that `darcs` is easy to set up, but having been a victim of darcs FAIL, literally losing weeks of work, I would never inflict it on a student.
Norman Ramsey
What happened? Most of darcs' weaknesses arise with complicated merges or large source trees, so it should be pretty safe for student projects.
Ganesh Sittampalam
+2  A: 

Darcs is an excellent DVCS, especially for smaller projects such as ones for CS classes. I wish I was introduced to Darcs or Git in college, and I also commend you for introducing it to your students.

I use Git on a daily basis. It's a very robust DVCS, but maybe a bit of an overkill for smaller projects.

Take your pick; either of those version control systems is really good.

thekingoftruth
+1  A: 

Regarding permissions, an outside service wouldn't require time from your university's IT staff.

For example, Bitbucket (using Mercurial) now allows unlimited private repos with up to 5 users. I'm guessing each new weekly pair of students is working on a new project together, which means they can just initialize the repository, add the other user, and away they go.

If they are not working on a new project every week, permissions would have to be removed and added, and I'd encourage them to have multiple repos (one per account) on Bitbucket so each student has continued access. (This would be a good idea anyway, but for only week-long projects, it may be simpler to just have one student account own the repo and the other with permission.)

Regarding which VCS, I believe Mercurial will be best given your platforms—TortoiseHg being particularly good for new users to explore, if they're unfamiliar with (and you don't have time for them to learn) command-line interfaces.

Specific to your situation, the advantage of a DVCS is that their copy on the university server (if there is one) is a fully-fledged repo. You may find it convenient for you or TAs to have access, which should be simpler to set up and would last all semester instead of changing weekly.

Roger Pate