views: 4057

answers: 6

I want to keep my websites in version control (Subversion specifically) and use svn co to update them when there are stable versions to deploy. I'm concerned about the security of doing so, though: all the .svn folders will be public, and they include all sorts of private data, not least the complete source code of my website!

Is there anything I can do to prevent this?

+3  A: 

This can be achieved server-wide (recommended), on a single virtual-host basis, or even inside .htaccess files if your server is somewhat permissive with what is allowed in them. The specific configuration you need is:

RewriteEngine On
RewriteRule /\.svn /some-non-existant-404-causing-page

<IfModule autoindex_module>
    IndexIgnore .svn
</IfModule>

The first section requires mod_rewrite. It forces any request with "/.svn" in it (i.e. any request for the directory, or anything inside the directory) to be internally redirected to a non-existent page on your website. This is completely transparent to the end user and undetectable. It also forces a 404 error, as if your .svn folders had just disappeared.

The second section is purely cosmetic, and will hide the .svn folders from the autoindex module if it is activated. This is a good idea too, just to keep curious souls from getting any ideas.

Matthew Scharley
+18  A: 

Two things:

  1. Do not use IfModule for functionality you need to be present. It's okay for autoindex, because it might not be present and is not crucial to the scheme. But you are counting on rewrite being present to protect your content, so it's better to leave out the IfModule directive and let Apache tell you when rewrite is missing, so that you can enable it (or at least know that you won't be 'protected' and consciously comment out the lines).

  2. There is no need to use rewrite if you have access to the main configuration files. Much easier would be one of the following:

    <DirectoryMatch \.svn>
       Order allow,deny
       Deny from all
    </DirectoryMatch>
    

which will generate a 403 Forbidden (better from an HTTP compliance point of view), or, if you want to take the security-by-obscurity route, use AliasMatch:

    AliasMatch \.svn /non-existant-page

If you don't have access to main configuration files you're left with hoping mod_rewrite is enabled for usage in .htaccess.

Vinko Vrsalovic
I understand the point about 403. Personally (in this sort of situation, where the files should never be served), I'd much rather they didn't know the directories even existed though. I'll try out the AliasMatch too. This was what I managed to glean from Google, I'm always happy for better solutions
Matthew Scharley
note that this doesn't work in .htaccess files if you're trying to do this on a shared/virtual host. Use Monoxide's answer if you don't have access to the httpd.conf file. http://stackoverflow.com/questions/214886/#214887
nickf
True. mod_rewrite might not be enabled for use in .htaccess files either (that is what "if your server is somewhat permissive with what is allowed in them" hints at).
Vinko Vrsalovic
You can also use this solution with RedirectMatch, which produces much the same results and is available to .htaccess files. It's a slightly more convoluted solution, but it works if you don't have mod_rewrite access.
Matthew Scharley
+2  A: 

Hiding the directories as Vinko says should work. But it would probably be simpler to use svn export instead of svn co. This should not generate the .svn directories.
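
For what it's worth, a minimal sketch of the export step could look like this. The function name, the repository URL and the target path are placeholders made up for illustration, not anything from a real setup; --force is only there so that svn export will write into a directory that already exists.

```shell
# Hypothetical helper: export a clean copy of the site (no .svn directories).
# $1 = repository URL, $2 = target directory (both are placeholders).
export_site() {
    repo_url=$1
    target=$2
    # --force allows exporting over an existing directory; files that were
    # deleted from the repository are NOT removed from the target.
    svn export --force "$repo_url" "$target"
}

# Example call (URL and path are made up):
# export_site http://svn.example.com/site/trunk /var/www/site
```

Note that an export always transfers the whole tree, so this trades the incremental update for a metadata-free result.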

Eric Hogue
Simpler perhaps, but then you don't get the advantage of incremental updates. If I modify one image and two of my scripts, do I really need to pull down megabytes (and have the site in maintenance mode while doing it) instead of spending a few seconds on a checkout?
Matthew Scharley
See the discussion at http://www.techcrunch.com/2009/09/23/basic-flaw-reveals-source-code-to-3300-popular-websites/ (a comment there is pointing back to this post). With the caveat you mention (huge site), you can get the "svn export" result by "svn co" in another directory and rsync with excludes, just as CesarB suggests in another answer.
Olaf
+3  A: 

There is an interesting approach I use: the checkout (and update) is done in a completely separate directory (possibly on a completely separate machine), and then the code is copied with rsync to where the webserver will read it. An --exclude rule on the rsync command line makes it skip any .svn (and CVS) directories, while --delete-excluded makes sure they are removed even if they were copied before.

Since both svn update and rsync do incremental transfers, this is quite fast even for larger sites. It also allows you to have your repository behind a firewall. The only caveat is that you must move all directories containing files generated on the server (such as the files/ directory on Drupal) to a place outside the rsync target directory (rsync will overwrite everything when used this way), and create a symlink to that location in the rsync source directory. The rsync source directory can have other non-versioned files too (like machine-specific configuration files).

The full set of rsync parameters I use is

rsync -vv --rsh='ssh -l username' -rltzpy --exclude .svn/ --exclude CVS/ --exclude Attic/ --delete-after --delete-excluded --chmod=og-w,Fa-x
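
Put together, the whole cycle might look roughly like the sketch below. The function name and both paths are hypothetical; the rsync options are the ones listed above, minus -vv and the --rsh setting, which you would add back as needed.

```shell
# Hypothetical deploy helper: update a private checkout, then mirror it to
# the web root without any version-control metadata.
# $1 = working copy (outside the web root), $2 = rsync destination
deploy_site() {
    src=$1
    dest=$2
    svn update "$src" || return 1
    # The trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    rsync -rltzpy \
        --exclude .svn/ --exclude CVS/ --exclude Attic/ \
        --delete-after --delete-excluded --chmod=og-w,Fa-x \
        "${src%/}/" "$dest"
}

# Example call (paths are made up):
# deploy_site /home/me/checkouts/site username@webhost:/var/www/site/
```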

Even then, for redundancy, I still have a configuration rule to prevent .svn from being accessed, copied from a Debian default rule which prevents .ht* (.htaccess, .htpasswd) from being accessed.
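
In the same belt-and-braces spirit, you can also check the deployed tree itself for stray metadata directories. This is only a sketch with a made-up function name; it prints any .svn or CVS directory found under the given path and prints nothing if the tree is clean.

```shell
# Hypothetical check: list version-control metadata directories under $1.
find_vcs_dirs() {
    # -prune stops find from descending into a match, so each metadata
    # directory is reported once rather than with all of its subdirectories.
    find "$1" -type d \( -name .svn -o -name CVS \) -prune -print
}

# Example call (path is made up):
# find_vcs_dirs /var/www/site
```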

CesarB
I prefer the AliasMatch example over the built-in examples for blocking access to source control directories... It's not just a minor security breach (Seeing settings), it's a major one if anyone ever managed to get inside them, so 404 seems appropriate. (it's not here, go look elsewhere)
Matthew Scharley
+3  A: 

Consider deploying live code using your operating system's package management tools, rather than directly from your VCS. This will let you ensure your live packages don't contain metadata directories, or other potentially sensitive tools and data.

Jon Topper
+1 for a good idea, but I use this on development machines too, just to help them keep looking clean when/if I happen to be browsing around directory structures and such.
Matthew Scharley
A: 

In the same situation, I used RedirectMatch, for two reasons. Primarily, it was the only method I could find that was allowed in .htaccess on that server with a fairly restrictive config that I couldn't modify. Also I consider it cleanest, because it allows me to tell Apache that yes, there's a file there, but just pretend it's not when serving, so return 404 (as opposed to 403 which would expose things that website viewers shouldn't be aware of).

I now consider the following as a standard part of my .htaccess files:

## Completely hide some files and directories.
RedirectMatch 404 "(?:.*)/(?:[.#].*)$"
RedirectMatch 404 "(?:.*)~$"
RedirectMatch 404 "(?:.*)/(?:CVS|RCS|_darcs)(?:/.*)?$"
Gilles