I am looking for an encrypted version control system. Basically, I would like to:

  • Have all files encrypted locally before sending to the server. The server should never receive any file or data unencrypted.

  • Every other feature should work pretty much the same way as SVN or CVS does today.

Can anyone recommend something like this? I did a lot of searches but I can't find anything.

+9  A: 

Why not set up your repo (subversion, mercurial, whatever) on an encrypted filesystem, and use ssh only to connect?

John Weldon
If the server was compromised, wouldn't a potential attacker be able to read the data once a user's session has connected via SSH? I am under the impression the data would exist unencrypted in RAM.
Mike
If the server was compromised you'd be screwed anyway.
John Weldon
If the data was encrypted and properly backed up, a compromised server would not lead to any loss of sensitive data.
Mike
Theoretically that's what an encrypted filesystem would do.
John Weldon
+28  A: 

You should encrypt the data pipe (ssl/ssh) instead, and secure the access to the server. Encrypting the data would force SVN to essentially treat everything as a binary file. It can't do any diff, so it can't store deltas. This defeats the purpose of a delta-based approach.
Your repository would get huge, very quickly. If you upload a file that's 100kb and then change 1 byte and checkin again, do that 8 more times (10 revs total), the repository would be storing 10 * 100kb, instead of 100kb + 9 little deltas (let's call it 101kb).
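The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers in this answer; the per-delta size of 114 bytes is a made-up value chosen so that nine deltas add up to roughly 1 KB.

```python
KB = 1024

def repo_size_full_copies(file_size, revisions):
    # No deltas possible: every revision is stored as a complete copy.
    return file_size * revisions

def repo_size_with_deltas(file_size, revisions, delta_size):
    # First revision stored in full, each later revision as a small delta.
    return file_size + (revisions - 1) * delta_size

file_size = 100 * KB
revisions = 10
delta_size = 114  # hypothetical size in bytes of one small delta

full = repo_size_full_copies(file_size, revisions)
delta = repo_size_with_deltas(file_size, revisions, delta_size)
print(full // KB, "KB vs about", round(delta / KB), "KB")
```

With these assumed numbers the full-copy repository is about ten times larger, and the gap grows with every further revision.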

Update: @TheRook explains that it is possible to do deltas with encrypted repository. So it may be possible to do this. However, my initial advice stands: it's not worth the hassle, and you're better off with encrypting the ssl/ssh pipe and securing the server. i.e. "best practices".

Chris Thornton
I was thinking the deltas could be computed client side and sent back to the server encrypted.
Mike
@Mike - but 1000 revs later, now you want to do a checkout of the current. The server has to give you the original, and 1000 deltas, and let the client re-assemble them. Sounds like a mess. And the server has no idea whether things are consistent or messed up.
Chris Thornton
@Mike: In which case you don't have a VCS, you've got a remote backup.
David Thornley
@David - exactly. Might as well use ZIP files (AES256) and ftp.
Chris Thornton
Actually this answer is incorrect, please read my post.
Rook
+2  A: 

Have a look at Git. It supports various hooks that might do the job. See http://stackoverflow.com/questions/2456954/git-encrypt-decrypt-remote-repository-files-while-push-pull.

Maksym Bykovskyy
+4  A: 

What specifically are you trying to guard against?

Use Subversion over ssh or https for the repo access. Use an encrypted filesystem on the clients.

msemack
+1  A: 

You could use a Tahoe-LAFS grid to store your files. Since Tahoe is designed as a distributed file system, not a versioning system, you'd probably need to use another versioning scheme on top of the file system.

Edit: Here's a prototype extension to use Tahoe-LAFS as the backend storage for Mercurial.

Commodore Jaeger
+5  A: 

SVN has built-in support for transferring data securely. If you use svnserve, you can access it securely using ssh. Alternatively, you can access it through the Apache server using https. This is detailed in the svn documentation.

Claus Broch
A: 

SourceSafe stores data in encrypted files. Wait, I take that back. They're obfuscated. And there's no security other than a front door to the obfuscation.

C'mon, it's Monday.

FastAl
A: 

Based on my understanding this cannot be done, because in SVN and other versioning systems, the server needs access to the plaintext in order to perform versioning.

Jus12
+4  A: 

It is possible to create a version control system for cipher text. It would be ideal to use a stream cipher such as RC4-drop1024 or AES in OFB mode, as long as the same key+IV is used for each revision. This means that the same PRNG stream will be generated each time and then XOR'ed with the code. If any individual byte of the plain text is different, then you have a mismatch at that byte only, and the cipher text itself will be merged normally.
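A minimal sketch of that property, assuming a toy keystream built from SHA-256 in place of RC4-drop1024 or AES-OFB (the `keystream` and `xor_encrypt` helpers, the key, and the IV are all made up for illustration and are not meant for real use):

```python
import hashlib

def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key + iv + counter. Stands in for a real
    # stream cipher only to show the XOR behaviour.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    ks = keystream(key, iv, len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, ks))

key, iv = b"demo key", b"fixed iv"  # same key+IV reused for every revision
rev1 = b"int total = 0;\nreturn total;\n"
rev2 = b"int total = 1;\nreturn total;\n"  # one byte changed in place

c1 = xor_encrypt(key, iv, rev1)
c2 = xor_encrypt(key, iv, rev2)

# Positions where the two cipher texts differ: only the changed byte.
changed = [i for i, (a, b) in enumerate(zip(c1, c2)) if a != b]
print(changed)
```

Note that this only holds for in-place changes. Inserting or deleting a byte shifts everything after it against the keystream, so the rest of the cipher text changes as well.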

A block cipher in ECB mode could also be used, where the smallest mismatch would be 1 block in size, so it would be ideal to use small blocks. CBC mode on the other hand can produce widely different cipher text for each revision and thus is undesirable.

I recognize that this isn't very secure; OFB and ECB modes shouldn't normally be used as they are weaker than CBC mode. The sacrifice of the IV is also undesirable. Furthermore, it isn't clear what attack is being defended against, whereas using something like SVN+HTTPS is very common and also secure. My post is merely stating that it is possible to do this efficiently.

Rook
Good answer.
Joshua
@Joshua thank you.
Rook
Nice idea. However: Reusing a stream cipher (like RC4-drop1024) with the same key+iv doesn't just make it weak - it makes it basically worthless: Just add any two of the resulting cipher texts bitwise, and the encryption stream is 100% canceled out.
Chris Lercher
@chris_l Correct, and the same is also true for a block cipher in OFB mode, but not ECB mode. However, the attacker would still have to know some plain text, which is possible given the predictable nature of code, such as heavy use of semicolons and parens. The bigger issue at play is that it isn't clear who the attacker is; if it's someone sniffing the line, then you need transport layer protection.
Rook
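The keystream-reuse problem raised in these comments can be sketched as follows. This assumes a toy SHA-256-based keystream standing in for a real stream cipher; `toy_encrypt`, `xor`, the key, and the sample revisions are hypothetical.

```python
import hashlib

def toy_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # Toy stream cipher: XOR the plain text with a SHA-256-derived keystream.
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, iv = b"demo key", b"fixed iv"  # reused for both revisions
rev1 = b"limit = 1000;"
rev2 = b"limit = 2500;"

c1 = toy_encrypt(key, iv, rev1)
c2 = toy_encrypt(key, iv, rev2)

# XORing the two cipher texts cancels the keystream entirely,
# leaving the XOR of the two plain texts.
leak = xor(c1, c2)
print(leak == xor(rev1, rev2))

# An attacker who knows (or guesses) one revision recovers the other
# without ever learning the key.
recovered = xor(leak, rev1)
print(recovered == rev2)
```

Both checks hold for any key and IV, because the keystream drops out of the XOR regardless of how it was generated.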
+3  A: 

Use rsyncrypto to encrypt files from your plaintext directory to your encrypted directory, and to decrypt files from your encrypted directory to your plaintext directory, using keys that you keep locally.

Use your favorite version control system (or any other version control system -- svn, git, mercurial, whatever) to synchronize between your encrypted directory and the remote host.

The rsyncrypto implementation you can download now from SourceForge handles not only changes in bytes, but also insertions and deletions. It implements an approach very similar to the one that "The Rook" mentions.

Single-byte insertions, deletions, and changes in a plaintext file typically cause rsyncrypto to make a few K of completely different encrypted text around the corresponding point in the encrypted version of that file.

Chris Thornton points out that ssh is one good solution; rsyncrypto is a very different solution. The tradeoff is:

  • using rsyncrypto requires transferring a few K for each "trivial" change rather than the half-a-K it would take using ssh (or on an unencrypted system). So ssh is slightly faster and requires slightly less "diff" storage than rsyncrypto.
  • using ssh and a standard VCS requires the server to (at least temporarily) have the keys and decrypt the files. With rsyncrypto, the encryption keys never leave the local computer. So rsyncrypto is slightly more secure.
David Cary
Interesting idea. Clients must all use the same keys I assume. Probably doesn't cope well with any compromise of clients' keys.
Craig McQueen
@Craig McQueen, is there any alternative that does cope well with the compromise of a client's key to the VCS archive? Using one key for each VCS archive shared among all users sounds bad -- the bad guys only need to compromise any one client of their choosing, and then they can read all the data in the VCS archive. But giving each client a unique key, as in the ssh approach, is no better -- the bad guys only need to compromise any one client of their choosing, and then they can read all the data in the VCS archive.
David Cary
Well, any key compromise gives your data to the bad guys. I was thinking more about the cost of revoking a key. If only the data channel is encrypted, then you can assign individual keys. If you no longer trust one user, you can just revoke their key, with no impact on other users. But if there is a single key, a new key must be rolled out to all other users. If the data itself is encrypted with a key that (by necessity) all the clients must know, then the only way to revoke the key is to re-encrypt the entire repository and roll the new key out to all clients.
Craig McQueen
I gave you a better solution to the server compromise problem. I was hoping you would give me a better solution to the client compromise problem, but perhaps there isn't one. Yes, there is a tradeoff. In the case where a client is compromised, the ssh approach is better. In the case where a server is compromised, the rsyncrypto approach is much better.
David Cary
+1  A: 

Variant A

Use a distributed VCS and transport changes directly between different clients over encrypted links. In this scenario there is no central server to attack.

For example, with Mercurial you can use the internal web server and connect the clients via VPN. Or you can bundle the change sets and distribute them by encrypted email.

Variant B

You can export an encrypted hard drive partition over the network, mount it on the client side, and run the VCS on the client side. But this approach has lots of problems, like:

  • possible data loss when two different clients try to access the VCS at the same time
  • the link itself must be secured against fraudulent write access (when the partition is shared via NFS it is very likely to end with a configuration where anyone can write to the shared partition, so even when there is no way for others to read the content, there is easily a hole to destroy the content)

There might also be other problems with variant B, so forget variant B.

Variant C

Like @Commodore Jaeger wrote, use a VCS on top of an encryption-aware network file system like AFS.

Rudi
A: 

Have you thought of using Duplicity? It's like rdiff-backup (delta backups) but encrypted. Not really version control, but maybe you want all the cool diff stuff and don't need to work with anyone else. Or just push/pull from a local TrueCrypt archive and rsync it to a remote location. rsync.net has a nice description of both - http://www.rsync.net/resources/howto/duplicity.html http://www.rsync.net/products/encrypted.html - apparently TrueCrypt containers still rsync well.