views:

21

answers:

0

Currently I am writing SQL for a handcoded ETL(extract transform load) process. As I am trudging along I writing documentation for the system. It will not only be helpful in the future but it's really useful to me to document high level process flows and batch jobs so that I make sure the details of my implementation or supporting these.

I have been using Excel mostly to document the data dicitonary, the logical mapping of the source system to the destination system, and now I even have some state diagrams that show various scenarios and how the ETL process would handle a particular example set of rows.

As I document the process flow, I want to be able to link to *.sql files containing DDL or stored procedures, so that someone can click the link and it will open the file. The problem I am encountering is I feel it necessary to add alot of documentation in the source code itself so that someone maintaining it will understand how everything fits together. However I also don't want someone to have to dig through source code at random trying to understand how the system works, so I find myself duplicating alot of documentation in the Excel files.

I'd like the documentation to be separate from the source code. The documentation to contain all the high level overview and technical description of the stored procedures and how they fit together. I feel the only comments in the source code itself of the stored procedures should be very specific such as:

  WHERE R.SomeField LIKE '_%-%'  --a hyphen is anywhere but the beginning

Otherwise, comments about how the stored procedure interacts with other procedures and how it fits into the ETL process should not be in the source code comments, but should be in the documentation.

How can I associate the documentation with the source code? If I am talking about how sp_A interacts with sp_B to perform some function of the ETL process, I want to link to those source code files. Additionally I want to, in one or two short lines, reference the documentation from the source code. This would allow me to omit redundant high level documentation from the source code, since someone digging through source code can find relevant documentation, and someone digging through the documentation can find relevant source code.

I want to come up with something that would also work well for .NET source code and application documentation.

I have looked at documentation generators, and that kind of documentation really makes up a fraction of what the documentation will be. It also is redundant since most of the stuff, table definitions, DDL, etc. is simply things you can discover easily in SSMS.

The generators for .NET code aren't really interesting either because the documentation is structured the way the classes are structured. It only represents encapsulation and inheritance. This is great for a framework library like the .NET Framework. However it is pretty useless for a system or application, as it doesn't represent processes flows, use cases, and interactions between classes which I find to be more important when documenting an application.

My brainstormed solution: I really liked doing documentation in mediawiki in the past. I love being able to easily link articles together and create pages on the fly, or link to nonexisting articles so that I have a reminder to write that later. If I could also add links to source code that would be awesome. And then in the source code I can notate the URLs of relevant articles regarding how it fits into the overall system. I feel linking between articles is important because it allows you to turn a phrase into a link, and not go off on a tangent writing something, knowing if your reader doens't understand, then they can traverse that link and then come back. Otherwise you end up writing documentation that goes off on a hundred tangents before it can really begin to say what it needs to say.

The big downsides I see with this approach will be that mediawiki is PHP and that goes against the grain of this being a purely windows shop. So that is a political challenge if I choose that route as they might be opposed to putting the required components on a server to support mediawiki. Secondly it is difficult to convert an entire mediawiki site into a monumental document that can be printed and/or stand on it's own(without a web site serving it up). Where as an excel file can be printed and/or passed around as a stand alone document with no supporting server.

How can I reliably link to the source code? Our source code is currently not source controlled and just sits on a network drive gasp To rollback a change involves driving 15 minutes to a vault, pulling a tape backup, coming back, and restoring the file. So I could link to source code via a UNC path, but I'd hate for all my links to break on account of a server rename or migration of some sort.

However, I would like to hear "if the source code were source controlled" hypotheses, how would you link to source controlled source code? I am primarily thinking of TFS because it will be included in upcoming VS2010 MSDN subscriptions and it has alot of project management related features I hope to "sell" the management on.