The company I work for seems to be constantly struggling with our customers’ server environments.
Specifically, we almost always run into problems because the testing servers and the production servers seem to be configured differently. When we test the applications we develop, the testing servers behave one way, so we tweak and configure our applications to fit that behavior. But when we install the same application on the production servers, we observe different behavior that is not consistent with the testing servers, which renders our tweaks and configurations useless. The most frustrating part is that this happens all the time and that no one seems to know what to do about it.
Of course, we have a general idea of why this happens. Every cloned environment starts out the same and works the same for the first couple of days, but sooner or later someone reconfigures something in only one of the server environments (be it a database update, an update of a component library, a web file update, or some other configuration change), which introduces a discrepancy. And as time goes by, more and more discrepancies build up. But the question is: what can we do about it?
I’ve tried searching the web but can’t find any good answers on what to do. I’ve also tried to figure out some solutions on my own, but most of my ideas seem to be problematic in some way. New routines, no matter how rigorous, can be circumvented. Regular cloning of the production servers to create testing servers is a tedious and often very slow process. Automatic replication is not always reliable or even possible. So what on Earth should we do about this problem? How can we guarantee that the experience when testing will match the experience when going live?
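To make the kind of discrepancy I mean concrete, here is a minimal sketch in Python of how the drift could at least be detected, even if it does not prevent it. It assumes that each server can export a flat snapshot of its settings (component versions, database schema version, file checksums). The function name, the keys, and the values below are all made up for illustration and do not come from any real system.

```python
def diff_snapshots(testing, production):
    """Return every setting whose value differs between two snapshots.

    Both arguments are plain dicts mapping a setting name to its value.
    A setting that exists on only one side is reported as "<missing>"
    on the other side.
    """
    drift = {}
    for key in sorted(set(testing) | set(production)):
        test_value = testing.get(key, "<missing>")
        prod_value = production.get(key, "<missing>")
        if test_value != prod_value:
            drift[key] = (test_value, prod_value)
    return drift


if __name__ == "__main__":
    # Hypothetical snapshots; real ones would be exported from the servers.
    testing = {
        "db_schema_version": "42",
        "component_library": "3.1.0",
        "web_config_checksum": "ab12",
    }
    production = {
        "db_schema_version": "42",
        "component_library": "3.0.7",
        "web_config_checksum": "ab12",
        "extra_hotfix": "applied",
    }
    for key, (test_value, prod_value) in diff_snapshots(testing, production).items():
        print(f"{key}: testing={test_value} production={prod_value}")
```

Running something like this on a schedule would only tell us that the environments have drifted apart and where, not bring them back in line, so it is at best a partial answer.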
I imagine that others have this very problem as well. Or do they? Maybe it's just my particular company that is incompetent? Have any of you encountered the problem? If so, what did you do about it?
Sincerely,
Linus, Swedish systems developer