I'm about to inherit a set of large, complex stored procedures that do monthly processing on very large sets of data.

We are in the process of debugging them so that they match the original process, which was written in VB6. They were rewritten in T-SQL because the VB process takes days and the new process takes hours.

All this is fine, but how can I make these now-massive chunks of T-SQL code (1.5k+ lines) even remotely readable and maintainable?

Any experience with making T-SQL less of a headache is very welcome.

+4  A: 

First, create a directory full of .sql files and maintain them there. Add this set of .sql files to a revision control system. SVN works well. Have a tool that loads these into your database, overwriting any existing ones.
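As a rough sketch of what one such file might look like (the file path and the procedure name `dbo.usp_MonthlyRollup` are made up for illustration), each file can drop and recreate its own object so that it is safe to run repeatedly:

```sql
-- File: procs/dbo.usp_MonthlyRollup.sql  (hypothetical path and name)
-- Re-runnable: drops the procedure if it already exists, then recreates it.
IF OBJECT_ID(N'dbo.usp_MonthlyRollup', N'P') IS NOT NULL
    DROP PROCEDURE dbo.usp_MonthlyRollup;
GO

CREATE PROCEDURE dbo.usp_MonthlyRollup
    @PeriodStart DATETIME,
    @PeriodEnd   DATETIME
AS
BEGIN
    SET NOCOUNT ON;

    -- Placeholder body; the real monthly processing would go here.
    SELECT @PeriodStart AS PeriodStart, @PeriodEnd AS PeriodEnd;
END
GO
```

The loading tool can then be as simple as a loop that runs every file through `sqlcmd -S <server> -d <database> -i <file>`. One caveat of the drop-and-recreate approach is that any permissions granted on the procedure are lost and have to be re-granted by the script.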

Have a testing database, and baseline reports showing what the output of the monthly processing should look like. Your tests should also be in the form of .sql files under version control.
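One possible shape for such a test, assuming the monthly run writes to a table `dbo.MonthlyResult` and the known-good output has been saved in `test.MonthlyResult_Baseline` with the same columns (both names are hypothetical):

```sql
-- Fails if either table contains a row the other does not.
-- Assumes both tables have identical column lists in the same order.
IF EXISTS (
        SELECT * FROM test.MonthlyResult_Baseline
        EXCEPT
        SELECT * FROM dbo.MonthlyResult
    )
    OR EXISTS (
        SELECT * FROM dbo.MonthlyResult
        EXCEPT
        SELECT * FROM test.MonthlyResult_Baseline
    )
    RAISERROR('Monthly output differs from baseline.', 16, 1);
ELSE
    PRINT 'Monthly output matches baseline.';
```

`EXCEPT` compares distinct rows, so this check will not notice a row that appears twice in one table and once in the other; comparing row counts as well would cover that case.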

You can now refactor your procs as much as you like, and run your tests afterward to confirm correct function.

Jeff Paulsen
A: 

One thing you can do is have an automated script store all changes in source control, so that you can review changes to the procedures (using a diff of the previous and current versions).
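The database side of such a script could be a query along these lines. This is only a sketch: the one-day window is an arbitrary assumption, and writing the definitions to files and committing them would be handled by whatever tool runs the query.

```sql
-- List code objects changed since a cutoff, with their current source text.
DECLARE @Since DATETIME;
SET @Since = DATEADD(DAY, -1, GETDATE());  -- assumed window: the last 24 hours

SELECT  s.name        AS schema_name,
        o.name        AS object_name,
        o.type_desc,
        o.modify_date,
        m.definition
FROM    sys.objects     AS o
JOIN    sys.schemas     AS s ON s.schema_id = o.schema_id
JOIN    sys.sql_modules AS m ON m.object_id = o.object_id
WHERE   o.modify_date >= @Since
ORDER BY o.modify_date DESC;
```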

hamishmcn
+1  A: 

For formatting/prettifying SQL, I've had success with http://www.sqlinform.com/ - there is a free online version you can try out, and a desktop version is available too.

SQLinForm is an automatic SQL code formatter for all major databases (Oracle, SQL Server, DB2/UDB, Sybase, Informix, PostgreSQL, MySQL, etc.) with many formatting options.

micahwittman
+2  A: 

Definitely start by reformatting the code, especially the indentation.

Then modularise the SQL. Pull out chunks into smaller, descriptively named procedures and functions, each in its own standalone file. I find that this alone does a lot to improve my understanding of large SQL files.
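For illustration, the top-level procedure might end up as little more than a table of contents. All procedure names below are invented, and each step would live in its own .sql file:

```sql
-- Orchestrating procedure: each EXEC is one extracted, separately testable step.
CREATE PROCEDURE dbo.usp_MonthlyProcess
    @PeriodStart DATETIME,
    @PeriodEnd   DATETIME
AS
BEGIN
    SET NOCOUNT ON;

    EXEC dbo.usp_Monthly_LoadStaging      @PeriodStart, @PeriodEnd;
    EXEC dbo.usp_Monthly_ApplyAdjustments @PeriodStart, @PeriodEnd;
    EXEC dbo.usp_Monthly_WriteSummary     @PeriodStart, @PeriodEnd;
END
```

Splitting this way also gives the tests something smaller to aim at, since each step can be run and checked on its own.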

jacko
+1  A: 

Try to modularise the SQL as much as possible, and have a set of tests that will let you maintain, refactor and add features when needed. I once had the pleasure of inheriting a stored proc that totalled 5000 lines, and I still have nightmares about it. Once the project was over, I printed out the stored proc for a laugh, destroying X trees in the process. During one of our company's weekly stand-up sessions I laid it out end to end, and it stretched the entire length of the building. I used this as an example of how not to write and maintain stored procedures.

A: 

Thanks for all the suggestions.

I'll be putting it into SVN, and I'll give that formatting tool a look for sure.

Hath
+1  A: 

ApexSQLScript is a great tool for scripting out an entire database - you can then check that into source control and manage changes.

I've also found that documenting the sprocs consistently lets you pull out information about them using the data about the source code in sys.sql_modules - you can use tags or whatever to help document subsystems.
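A small sketch of that idea, assuming each module carries a header comment containing a tag such as `@subsystem: billing` (the tag format is made up here; any consistent convention would do):

```sql
-- Find every procedure, function, view or trigger tagged for one subsystem.
SELECT  s.name AS schema_name,
        o.name AS object_name,
        o.type_desc
FROM    sys.sql_modules AS m
JOIN    sys.objects     AS o ON o.object_id = m.object_id
JOIN    sys.schemas     AS s ON s.schema_id = o.schema_id
WHERE   m.definition LIKE N'%@subsystem: billing%'
ORDER BY s.name, o.name;
```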

Also, use Schemas (or even multiple databases) - this will really help divide up your database into logical units and point out architectural issues.
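As a hedged example, with schema and table names chosen purely for illustration, splitting staging data from reporting objects could start like this:

```sql
-- CREATE SCHEMA must be the first statement in its batch, hence the GO separators.
CREATE SCHEMA staging AUTHORIZATION dbo;
GO
CREATE SCHEMA reporting AUTHORIZATION dbo;
GO

-- Move an existing table (hypothetical name) out of dbo into the new schema.
ALTER SCHEMA staging TRANSFER dbo.RawTransactions;
GO
```

Any code that referred to `dbo.RawTransactions` then has to be updated to `staging.RawTransactions`, which tends to surface exactly the cross-unit dependencies worth knowing about.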

As for large code, I've recently found the SQL Server 2005 CTE (common table expression) feature to be very useful in managing code with lots of nested queries (not even recursive ones). Instead of managing a bunch of nesting and indentation, CTEs can be declared and built up one at a time and then used in the final statement. This also helps in refactoring, as it seems a lot easier to remove redundant nested queries and columns.
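A minimal sketch of the pattern, using hypothetical `dbo.Orders` and `dbo.Customers` tables and an arbitrary threshold. Each CTE names one step that would otherwise be a nested derived table:

```sql
DECLARE @PeriodStart DATETIME, @PeriodEnd DATETIME;
SET @PeriodStart = '20080101';
SET @PeriodEnd   = '20080201';  -- the statement before WITH must end in a semicolon

WITH PeriodOrders AS (
    -- Step 1: restrict to the month being processed.
    SELECT  CustomerId, Amount
    FROM    dbo.Orders
    WHERE   OrderDate >= @PeriodStart
      AND   OrderDate <  @PeriodEnd
),
CustomerTotals AS (
    -- Step 2: aggregate per customer, building on step 1.
    SELECT  CustomerId, SUM(Amount) AS Total
    FROM    PeriodOrders
    GROUP BY CustomerId
)
-- Final statement reads top to bottom instead of inside out.
SELECT  c.CustomerId, c.Name, t.Total
FROM    dbo.Customers  AS c
JOIN    CustomerTotals AS t ON t.CustomerId = c.CustomerId
WHERE   t.Total > 1000;
```

A CTE is only visible to the single statement that follows it, so a step needed by several statements still has to go into a temp table, view or function.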

Stored Procs and UDFs are vital for managing a large code base and eliminating dark corners. I have not found views to be terribly helpful because they are not parameterizable (UDFs can be used in these cases if the result sets are small).
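For example, an inline table-valued function (one kind of UDF) can act roughly like a view that takes parameters. The function name and the `dbo.Orders` table are assumptions, and the table must already exist for the function to be created:

```sql
CREATE FUNCTION dbo.fn_OrdersForPeriod
(
    @PeriodStart DATETIME,
    @PeriodEnd   DATETIME
)
RETURNS TABLE
AS
RETURN
(
    SELECT  o.OrderId, o.CustomerId, o.Amount, o.OrderDate
    FROM    dbo.Orders AS o
    WHERE   o.OrderDate >= @PeriodStart
      AND   o.OrderDate <  @PeriodEnd
);
GO

-- Used in a FROM clause just like a table or view.
SELECT  CustomerId, SUM(Amount) AS Total
FROM    dbo.fn_OrdersForPeriod('20080101', '20080201')
GROUP BY CustomerId;
```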

Cade Roux
A: 

It's definitely not free, but for keeping your T-SQL formatted in a consistent way, Redgate Software's SQL Prompt is very handy. As long as your proc's syntax is correct, a couple of keystrokes (Ctrl+K,Y) will reformat it all instantly. The options give you a lot of control over how your SQL is formatted.

AR