views:

61

answers:

1

Here is the scenario:

I am currently working on a legacy reporting system for a company which collects data from sensors. The database (SQL-Server 2005) is fairly big (ca. 500 GB) with hundreds of MB’s being added every day. The system runs a series of nightly jobs in order to aggregate and transform the newly added data into a staging database from which the reports are issued. These jobs can take hours to complete and are written entirely in SQL, resulting in hundreds of stored procedures (some being up to 3000 lines long) containing tons of business logic. The system has been in ongoing development for almost three years, and has become a maintenance nightmare.

My dilemma:

We would love to rewrite parts of the system (most urgently the nightly jobs) on a platform better suited for expressing and maintaining complex business logic (aka C#). The problem is that we frequently need to debug the data collection process in order to detect bugs in the system and (more often than not) inconsistencies in the source data. The only place where we can detect these issues is the production environment. The big advantage of the current implementation in SQL is that we can just copy/paste parts of the scripts to analyze the outcome of each intermediate step. This would hardly be possible with a C# solution, since we cannot install visual studio on the production environment.

Finally, the question:

Do you have any good suggestion of how to maintain our flexible debug capabilities on the production environment AND find a way of expressing complex business logic in a sane way?

+2  A: 

In environments like the one you describe you have to mirror the production environment on a server that can be debugged until you get the code base refactored and supported with test cases.

You have no hope of ever having a maintainable system with business logic in stored procedures. Been there. Total nightmare.

These two books are helpful for refactoring large systems like you describe: Working Effectively with Legacy Code by Michael Feathers and xUnit Test Patterns: Refactoring Test Code by Gerard Meszaros.

Also, sounds like you're doing a lot of Extract-Transform-Load (ETL). There are a number of excellent environments that allow you to write ETL transformations in C#. ETL steps are highly testable.

Rob
+1 for "Been there. Total nightmare."
womp
@Jim - could you recommend a good C# ETL tool? I've been using SSIS for years now, and have found that most of the really useful transformations end up requiring script components anyway. Any thoughts?
Mark