What's the best practice for handling system-specific information under version control?

+6 Q:

I'm new to version control, so I apologize if there is a well-known solution to this. For this problem in particular, I'm using git, but I'm curious about how to deal with this for all version control systems.

I'm developing a web application on a development server. I have defined the absolute path name to the web application (not the document root) in two places. On the production server, this path is different. I'm confused about how to deal with this.

I could either:

1. Reconfigure the development server to share the same path as the production server.
2. Edit the two occurrences each time production is updated.

I don't like #1 because I'd rather keep the application flexible for any future changes. I don't like #2 because if I start developing on a second development server with a third path, I would have to change this for every commit and update.

What is the best way to handle this? I thought of:

Using custom keywords and variable expansion (such as setting the property $PATH$ in the version control properties and having it expanded in all the files). Git doesn't support this because it would be a huge performance hit.
Using post-update and pre-commit hooks. Probably the likeliest solution for git, but every time I looked at the status, it would report the two files as changed. Not really clean.
Pulling the path from a config file outside of version control. Then I would have to have the config file in the same location on all servers. Might as well just have the same path to begin with.

Is there an easy way to deal with this? Am I overthinking it?

I like the way Ruby on Rails deals with this sort of issue: environment-specific configuration files. Rails supports development, test, and production database connections, controlled by configuration in the database.yml file. Here is a blog post about creating other environment-specific configuration options. It is written for Rails, but it might give you some ideas about how to do something similar for your environment. http://usablewebapps.com/2008/09/yaml-and-custom-config-for-rails-projects/

cnk 2009-01-07 06:07:40

Sounds like your production code is a full-on git repository and to update production you do a git pull? You might want to try a separate build process that checks the code out of your repository and creates a clean build (no .git folder). You could have environment-specific config files, containing your paths, that are copied or created along with it.

timdisney 2009-01-07 06:38:53

+9 A:

Do not EVER hard-code configuration data like file system paths and force multiple deployments to match. That is the dark side, where there is much SUFFERING.

I find it useful and easy to build my systems to support multiple configurations easily, and I routinely commit configuration files into source control side-by-side, but production's is obfuscated (no real passwords) and development's is templated (so a checkout can't overwrite a developer's configuration). The code is always packaged in a configuration-neutral manner--the same binary can be deployed anywhere.

Unfortunately, most language/development platforms do not readily support this (unlike Ruby on Rails). Therefore, you have to build it yourself, to varying degrees.

In general, the basic principle is to incorporate indirection into your configuration: specify not the configuration, but how to find the configuration, in your code. And generally invoke several indirections: user-specific, application-specific, machine-specific, environment-specific. Each should be found in a well-defined place/manner, and there should be a very-well-defined precedence among them (usually user over machine over application over environment). You will generally find that every configurable setting has a natural home in one location, but don't hard-code that dependency into your applications.
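A minimal sketch of that precedence rule, assuming each layer has already been loaded into a plain dictionary. The layer names follow the order given above; the keys `app_root` and `log_level` and all the paths are made up for illustration.

```python
from typing import Any, Dict, Mapping

# Lowest precedence first, highest last: user beats machine beats
# application beats environment, as described above.
PRECEDENCE = ("environment", "application", "machine", "user")


def resolve_config(layers: Mapping[str, Mapping[str, Any]]) -> Dict[str, Any]:
    """Merge the layers so that higher-precedence layers override lower ones."""
    merged: Dict[str, Any] = {}
    for name in PRECEDENCE:
        # A layer that was not supplied simply contributes nothing.
        merged.update(layers.get(name, {}))
    return merged


# Hypothetical layers; in a real system each would come from its own
# well-defined location (a file, an environment variable, and so on).
layers = {
    "environment": {"app_root": "/srv/app", "log_level": "INFO"},
    "machine": {"app_root": "/var/www/myapp"},
    "user": {"log_level": "DEBUG"},
}
settings = resolve_config(layers)
```

Here the machine layer wins for `app_root` and the user layer wins for `log_level`, while the code that consumes `settings` never learns which layer a value came from.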

I find that it is VERY valuable to design applications to be able to report their configuration, and to verify it. In most cases, a missing or invalid configuration item should result in aborting the application. As much as possible, perform that verification (and abort) at startup = fail fast. Hard-code defaults only when they can reliably be used.
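As a rough illustration of verify-and-report at startup, here is a sketch that assumes the settings arrive as a flat dictionary of strings. The required keys, the `ConfigError` name, and the masking of secrets in the report are my own assumptions, not part of any particular framework.

```python
from typing import List, Mapping, Sequence

# Hypothetical settings the application cannot run without.
REQUIRED_KEYS = ("app_root", "db_url")


class ConfigError(Exception):
    """Raised at startup when the configuration is unusable."""


def validate_config(settings: Mapping[str, str]) -> None:
    """Collect every problem, then abort once with the full list."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        if not settings.get(key):
            problems.append(f"missing required setting: {key}")
    if problems:
        raise ConfigError("; ".join(problems))


def describe_config(settings: Mapping[str, str],
                    secret_keys: Sequence[str] = ("db_url",)) -> str:
    """Report the effective configuration, hiding values marked as secret."""
    lines = []
    for key in sorted(settings):
        value = "***" if key in secret_keys else settings[key]
        lines.append(f"{key} = {value}")
    return "\n".join(lines)
```

Calling `validate_config` as the first thing the application does, and logging `describe_config` right after it, gives you the fail-fast behaviour and a record of what the process actually ran with.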

Abstract the configuration access so that most of the application has no idea where it comes from or how it is processed. I prefer to create Config classes that expose configurable settings as individual properties (strongly typed when relevant), then I "inject" them into application classes via IOC. Do not make all your application classes directly invoke the raw configuration framework of your chosen platform; abstraction is your friend.
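A small sketch of that abstraction, assuming plain constructor injection rather than a full IOC container. `AppConfig`, `UploadService`, and both settings are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    """Typed view of the settings; the only place that touches raw values."""
    app_root: Path
    max_upload_mb: int

    @classmethod
    def from_mapping(cls, raw: Mapping[str, str]) -> "AppConfig":
        # Conversion happens once here, so a bad value fails early.
        return cls(app_root=Path(raw["app_root"]),
                   max_upload_mb=int(raw["max_upload_mb"]))


class UploadService:
    """Application class that receives its configuration; it never loads it."""

    def __init__(self, config: AppConfig) -> None:
        self._config = config

    def upload_dir(self) -> Path:
        return self._config.app_root / "uploads"

    def accepts(self, size_mb: int) -> bool:
        return size_mb <= self._config.max_upload_mb
```

Typical use would be `UploadService(AppConfig.from_mapping(raw_settings))`, with `raw_settings` coming from whatever configuration mechanism your platform provides. Swapping that mechanism later then touches only `from_mapping`.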

In most enterprise-class (Fortune 500) organizations, no one sees the production (or even test) environment configurations except the admin team for that environment. Configuration files are never deployed in a release; they are hand-edited by the admin team. The relevant configuration files certainly never get checked into source control side-by-side with the code. The admin team may use source control, but it is their own private repository. Sarbanes-Oxley and similar regulations also tend to strictly forbid developers from having general access to (near-)production systems or any sensitive configuration data. Be mindful as you design your approach.

Enjoy.

Rob Williams 2009-01-07 06:44:24

Even if I disagree with your comment on my answer, I find yours informative. +1

VonC 2009-01-07 21:24:33

+2 A:

Avoid absolute paths wherever possible.

Don't rely on your current version control to do something magic - you may change version control systems in the future.

The simplest approach works for me: keep a 'config.live' alongside a 'config' that is set up for development. During deployment, simply move config.live to config and all is fine. For more complex configurations, a sub-directory for each configuration may be required.
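The deployment step could be scripted roughly like this. It is only a sketch: the function name, the file contents, and the paths are invented, and it assumes both files sit in the same release directory.

```python
import os
import tempfile
from pathlib import Path


def activate_live_config(release_dir: Path) -> Path:
    """Move config.live over config inside a freshly deployed release."""
    live = release_dir / "config.live"
    target = release_dir / "config"
    if not live.is_file():
        raise FileNotFoundError(f"no config.live in {release_dir}")
    # os.replace overwrites an existing target in a single step.
    os.replace(live, target)
    return target


# Demonstration in a throwaway directory standing in for a release.
with tempfile.TemporaryDirectory() as tmp:
    release = Path(tmp)
    (release / "config").write_text("app_root=/home/dev/myapp\n")
    (release / "config.live").write_text("app_root=/srv/myapp\n")
    activated = activate_live_config(release)
    deployed_text = activated.read_text()
    live_still_exists = (release / "config.live").exists()
```

Run it against the deployed copy, never against your working tree, otherwise the development config is lost and version control will report the change.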

A set of deployment procedures is essential - as the configuration is only one area that will be different.

Anything more complex is almost certain to cause more problems than it solves.

Richard Harrison 2009-01-07 06:55:36

+2 A:

You should always separate history tracking (what source control is for) from deployment.

A deployment involves:

an identified set of data (for which a tag or label provided by the SCM comes in handy)
a process manipulating those data (at least copying them to the right place, but also expanding compressed files, and so on)

Among the various operations a deployment performs, you should include a de-variabilization phase.

A variable is a keyword representing anything likely to change depending on your deployment platform (which can be a PC for continuous integration, a Linux box for basic homologation, an old Solaris8 for pre-production homologation, and a full F15K Solaris10 with zones for production: in short, it can vary a lot). See Jonathan Leffler's answer for practical examples.

A variable can represent a path, a JVM version, some JVM settings and so on. What you put in an SCM should be data with variables in it, never hard-coded settings.
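One possible shape for the de-variabilization phase, sketched with Python's standard `string.Template`. The `${NAME}` placeholder syntax is that module's convention, and the variable names and values are invented for the example.

```python
from string import Template
from typing import Mapping


def devariabilize(template_text: str, variables: Mapping[str, str]) -> str:
    """Replace every ${NAME} placeholder with the platform's value.

    substitute() raises KeyError when a placeholder has no value, so an
    incomplete platform definition stops the deployment instead of
    shipping a half-expanded file.
    """
    return Template(template_text).substitute(variables)


# What lives in the SCM: data with variables in it.
TEMPLATE = "app_root = ${APP_ROOT}\njvm_opts = ${JVM_OPTS}\n"

# What the deployment process knows about one target platform (hypothetical).
PRODUCTION = {"APP_ROOT": "/srv/myapp", "JVM_OPTS": "-Xmx2g"}

rendered = devariabilize(TEMPLATE, PRODUCTION)
```

The template is versioned, the per-platform values belong to the deployment process, and the rendered file exists only on the target machine.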

The next step would be to include in your executable a way to detect any change in a settings file, in order to update some parameters while running (avoiding the whole "shutdown / change settings / restart" sequence).
That means there are two types of deployment variables:

static ones (which will never change),
dynamic ones (which should be ideally taken into account during the runtime session)
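For the dynamic kind, a simple way to notice a changed settings file is to compare its modification time before each read. The sketch below assumes a flat `key = value` file; the class name, the file name, and the `pool_size` setting are all hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Dict


def parse_settings(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


class ReloadingSettings:
    """Re-reads the settings file whenever its modification time changes."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._mtime_ns = -1
        self._values: Dict[str, str] = {}
        self.refresh()

    def refresh(self) -> bool:
        """Reload if the file changed; return True when a reload happened."""
        mtime_ns = self._path.stat().st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        self._values = parse_settings(self._path.read_text())
        self._mtime_ns = mtime_ns
        return True

    def get(self, key: str) -> str:
        self.refresh()
        return self._values[key]


# Demonstration: edit the file while the "application" keeps running.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.conf"
    path.write_text("pool_size = 5\n")
    settings = ReloadingSettings(path)
    before = settings.get("pool_size")
    path.write_text("pool_size = 20\n")
    # Push the mtime forward explicitly so the demo does not depend on
    # the filesystem's timestamp resolution.
    st = path.stat()
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    after = settings.get("pool_size")
```

Only settings that are safe to change mid-run should be read this way; static ones are better read once at startup.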

VonC 2009-01-07 07:27:58

See comments on Jonathan Leffler's answer. Changing configuration during execution is dangerous (interactions with other context), but might be necessary on RARE occasions.

Rob Williams 2009-01-07 21:16:44

I beg to differ. In your production environment, maybe. Not in ours. And the occasions are anything but rare when you have a big farm of servers to fine-tune. Please consider that you may not have seen every production scenario in your career. I know I have not.

VonC 2009-01-07 21:21:56

Rob Williams: "computer consultant with about thirty years experience across a wide variety of technologies, roles, and industries"... ok, on second thought, you may have seen it all, actually ;) And my answer may not be that well-written.

VonC 2009-01-07 21:29:18

@VonC: I agree: there are exceptions which don't seem exceptional to those of us that have roamed the prairie. I am trying to be sensitive to the fact that most of our readers will probably never see a massive server farm. Hopefully, when they do, they will know how to make the exception.

Rob Williams 2009-01-07 22:54:40

+1 A:

Use an SCM such as Git for version control and a deployment tool such as Capistrano for deployment. Although Capistrano was initially created for Ruby on Rails, it's perfectly fine to use it for other frameworks and languages.

The main thing is that a dedicated deployment tool will give you all the flexibility you need to automate things like paths on both ends.

allesklar 2009-01-07 08:17:16
