I have a large software system with millions of SLOC, hundreds of modules, and thousands of interface dependencies. Building on an earlier question on Stack Overflow, I have been able to start discovering what these interface dependencies actually are.

The challenge now is to make all this information available in a useful format. The data is in a SQL database, so building a report is easy, but I need a way to model the data so that users can easily find what they are looking for.

I tried the standard solutions like UML, but there end up being so many dependency lines that the diagrams look like dense spiderwebs and are useless. Right now I just have a 40,000-line Excel spreadsheet, but that is not very practical.

Does anyone have any ideas or examples on how to manage this much specialized data? I've thought about trying to hack doxygen (I like javadoc-style output) but that seems like a lot of work.

A: 

If it is a well-factored system, then there should be clusters of interfaces which relate to each other within sub-systems, but only a few interfaces which run between sub-systems.
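One rough way to check this is to count how many dependencies stay inside a sub-system and how many cross between sub-systems. The sketch below assumes the dependencies can be exported as `(source, target)` pairs and that each interface can be mapped to a sub-system; the names `cross_subsystem_summary`, `edges`, and `subsystem_of`, and the sample data, are all hypothetical.

```python
from collections import Counter


def cross_subsystem_summary(edges, subsystem_of):
    """Count dependencies inside each sub-system and between sub-system pairs."""
    internal = Counter()   # sub-system -> number of internal dependencies
    crossing = Counter()   # (from sub-system, to sub-system) -> count
    for src, dst in edges:
        a, b = subsystem_of[src], subsystem_of[dst]
        if a == b:
            internal[a] += 1
        else:
            crossing[(a, b)] += 1
    return internal, crossing


# Hypothetical sample: the prefix before the dot stands in for the sub-system.
edges = [
    ("ui.View", "ui.Widget"),
    ("ui.View", "core.Model"),
    ("core.Model", "core.Store"),
    ("core.Store", "core.Model"),
]
subsystem_of = {name: name.split(".")[0] for edge in edges for name in edge}
internal, crossing = cross_subsystem_summary(edges, subsystem_of)
```

If the `crossing` counts are small compared with the `internal` ones, a diagram with one box per sub-system is likely to stay readable.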

If it isn't a well factored system, then it isn't going to look pretty in any representation, and representations which eliminate links which are there will misrepresent the situation.

One option is to prune interfaces which only have one dependent, which will be the leaves of the graph. Doing that repeatedly will erode the system to a skeleton which has the most strongly linked nodes.
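As a minimal sketch of that erosion, assuming the dependencies are available as `(source, target)` pairs: the code treats a "leaf" as any node with at most one neighbour, ignoring direction, which is only one possible reading of the pruning rule. The function name `prune_leaves` and the sample edges are made up for illustration.

```python
from collections import defaultdict


def prune_leaves(edges):
    """Repeatedly drop nodes with at most one neighbour; return surviving edges."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # self-dependencies do not count as neighbours
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    leaves = [n for n, adj in neighbours.items() if len(adj) <= 1]
    removed = set()
    while leaves:
        node = leaves.pop()
        if node in removed:
            continue
        removed.add(node)
        # Detach the leaf; a neighbour may become a leaf in turn.
        for other in neighbours.pop(node):
            neighbours[other].discard(node)
            if len(neighbours[other]) <= 1:
                leaves.append(other)
    return [(s, d) for s, d in edges if s not in removed and d not in removed]


# Hypothetical sample: a three-node cycle with a two-node tail hanging off C.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E")]
skeleton = prune_leaves(edges)
```

Here the tail `D`, `E` is eroded and only the `A`, `B`, `C` cycle remains. Note that a purely tree-shaped graph erodes to nothing under this rule, so you may prefer to stop after a fixed number of rounds.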

You might also want to perform a topological sort, which will expose any cycles (the nodes involved cannot be ordered) and tell you where the layers are.
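A sketch of the layering, using Kahn's algorithm and assuming each pair `(a, b)` means "a depends on b". The function name `layer_modules` and the sample module names are hypothetical.

```python
from collections import defaultdict


def layer_modules(edges):
    """Return (layers, unplaced).

    layers[0] holds nodes with no dependencies, layers[1] holds nodes that
    depend only on layers[0], and so on. unplaced holds nodes that are part
    of a cycle or depend on one, so they cannot be assigned a layer.
    """
    depends_on = defaultdict(set)
    dependents = defaultdict(set)
    nodes = set()
    for src, dst in edges:  # src depends on dst
        nodes.update((src, dst))
        depends_on[src].add(dst)
        dependents[dst].add(src)

    remaining = {n: len(depends_on[n]) for n in nodes}
    layers = []
    current = sorted(n for n, count in remaining.items() if count == 0)
    while current:
        layers.append(current)
        following = []
        for node in current:
            del remaining[node]
            for user in dependents[node]:
                remaining[user] -= 1
                if remaining[user] == 0:
                    following.append(user)
        current = sorted(following)
    return layers, sorted(remaining)


# Hypothetical sample: a clean three-layer chain plus a two-node cycle.
edges = [
    ("app", "service"),
    ("service", "db"),
    ("app", "db"),
    ("x", "y"),
    ("y", "x"),
]
layers, unplaced = layer_modules(edges)
```

For this sample, `layers` is `[["db"], ["service"], ["app"]]` and `unplaced` is `["x", "y"]`. Python 3.9 and later also ship `graphlib.TopologicalSorter`, which raises `CycleError` when a cycle is present.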

I don't favour JavaDoc for an overview of 40,000 interfaces - JavaDoc is good for looking things up in a hierarchically arranged library, but it doesn't show connections between things at all well.

Pete Kirkham
A: 

I think that there are some things to do before you tackle the technical question "in what technology do I create the documentation".

The true knowledge and understanding of the system is what lies beyond the actual interface relationships and module structures. It is the understanding of the whole system and how the individual parts in it contribute to the whole.

I would go in the following directions:

1) First, try to understand the system top-down. This means first understanding the structure of the modules and creating some representation of them, from the top down. During this process, you will probably find additional metadata on the modules that does not exist in your current Excel spreadsheet. Take the time to add it. It will be most useful when you later create the automatic documentation, because it will reflect the "non-obvious" knowledge of the system structure.

2) Write a simple program that generates a set of HTML files from the Excel spreadsheet. That will help you browse and navigate the information more easily as a starting point for further investigation. I would not go for a fully fledged javadoc format in the beginning. Start small and evolve your program/script in stages, as needs arise. During this process you will also discover where refactoring would make sense.

3) Use the generated HTML pages to research the structure of several modules and reach an understanding of the internal patterns of interfaces. Are there naming conventions? Repeated patterns? Anything that you can deduce and that is not already documented in the Excel spreadsheet?
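The "simple program" from step 2 can start out very small. Here is a minimal sketch, assuming the spreadsheet (or the SQL table behind it) can be exported as rows of `(module, interface, target_module)`, meaning the module uses that interface of the target module. The column layout, the function name `write_module_pages`, and the sample rows are all hypothetical, and module names are assumed to be safe to use as file names.

```python
import html
import tempfile
from collections import defaultdict
from pathlib import Path


def write_module_pages(rows, out_dir):
    """Write one HTML page per module, linking to what it uses and is used by."""
    uses = defaultdict(set)
    used_by = defaultdict(set)
    modules = set()
    for module, interface, target in rows:
        modules.update((module, target))
        uses[module].add((interface, target))
        used_by[target].add((interface, module))

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    def items(pairs):
        # One list entry per (interface, other module), linking to that module's page.
        return "".join(
            f'<li>{html.escape(iface)}: '
            f'<a href="{html.escape(other)}.html">{html.escape(other)}</a></li>'
            for iface, other in sorted(pairs)
        ) or "<li>none</li>"

    for module in sorted(modules):
        page = (
            f"<html><body><h1>{html.escape(module)}</h1>"
            f"<h2>Uses</h2><ul>{items(uses[module])}</ul>"
            f"<h2>Used by</h2><ul>{items(used_by[module])}</ul>"
            "</body></html>"
        )
        (out / f"{module}.html").write_text(page, encoding="utf-8")
    return sorted(modules)


# Hypothetical sample rows, written to a temporary directory for the demo.
rows = [
    ("ui", "IInvoiceView", "billing"),
    ("billing", "ILedger", "ledger"),
]
with tempfile.TemporaryDirectory() as demo_dir:
    pages = write_module_pages(rows, demo_dir)
    billing_page = (Path(demo_dir) / "billing.html").read_text(encoding="utf-8")
```

Because every page links both ways ("Uses" and "Used by"), you can click through the dependency graph one module at a time instead of scrolling through the spreadsheet. Extra metadata from step 1 can be added as further columns later.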

I would create some local UML diagrams, but not of a size that will get out of hand; perhaps several diagrams per module. Mark dependencies on external modules in a distinct way. (Again, automatically generated UML diagrams will not be as useful. It is the hand-picking of meaningful interfaces for each diagram that will make the most enlightening diagrams in the documentation.)

I think a set of HTML pages and UML diagrams will make a good final result.

Eye of the Storm
A: 

Now that VSTS 2010 beta 1 is out, it might be a good time to watch the video "Bottom-up" Design with Visual Studio Team System 2010 Architect.

You might even want to try some of this out with the beta. It ships as a VM, so there's no danger to your systems. Also, you can use the Architecture tools without committing to the platform, as you're just trying to visualize your code, not develop more of it.

John Saunders