views:

2248

answers:

7

(Note: I realize this is close to http://stackoverflow.com/questions/186392/how-do-you-document-your-database-structure , but I don't think it's identical.)

I've started work at a place with a database with literally hundreds of tables and views, all with cryptic names with very few vowels, and no documentation. They also don't allow gratuitous changes to the database schema, nor can I touch any database except the test one on my own machine (which gets blown away and recreated regularly), so I can't add comments that would help anybody.

I tried using "Toad" to create an ER diagram, but after leaving it running for 48 hours straight it still hadn't produced anything visible and I needed my computer back. I was talking to some other recent hires and we all suggested that whenever we've puzzled out what a particular table or what some of its columns means, we should update it in the developers wiki.

So what's a good way to do this? Just list tables/views and their columns and fill them in as we go? The basic tools I've got to hand are Toad, Oracle's "SQL Developer", MS Office, and Visio.

+1  A: 

Since you have the luxury of working with fellow developers that are in the same boat, I would suggest asking them what they feel would convey the needed information, most easily. My company has over 100 tables, and my boss gave me an ERD for a specific set tables that all connect. So also, you might want to try breaking 1 massive ERD into a bunch of smaller, manageable, ERDs.

+1  A: 

One thing to consider is the COMMENT facility built into the DBMS. If you put comments on all of the tables and all of the columns in the DBMS itself, then your documentation will be inside the database system.

Using the COMMENT facility does not make any changes to the schema itself, it only adds data to the USER_TAB_COMMENTS catalog table.

Steven Huwig
Like I said in the original question, I can't change the schema, and I can't change the scripts that maintain the schema. So I can't add comments to anything but the test database on my own computer, and that regularly gets blown away and recreated.
Paul Tomblin
You're not changing the schema when you add comments. But I appreciate the situation with stupid database bureaucracy.
Steven Huwig
+4  A: 

We use Enterprise Architect for our DB definitions. We include stored procedures, triggers, and all table definitions defined in UML. The three brilliant features of the program are:

  1. Import UML Diagrams from an ODBC Connection.
  2. Generate SQL Scripts (DDL) for the entire DB at once
  3. Generate Custom Templated Documentation of your DB.

You can edit your class / table definitions within the UML tool, and generate a fully descriptive with pictures included document. The autogenerated document can be in multiple formats including MSWord. We have just less than 100 tables in our schema, and it's quite managable.

I've never been more impressed with any other tool in my 10+ years as a developer. EA supports Oracle, MySQL, SQL Server (multiple versions), PostGreSQL, Interbase, DB2, and Access in one fell swoop. Any time I've had problems, their forums have answered my problems promptly. Highly recommended!!

When DB changes come in, we make then in EA, generate the SQL, and check it into our version control (svn). We use Hudson for building, and it auto-builds the database from scripts when it sees you've modified the checked-in sql.

(Mostly stolen from another answer of mine)

Kieveli
+1  A: 

A wiki solution supports hyperlinks and collaborative editing, but a wiki is only as good as the people who keep it organized and up to date. You need someone to take ownership of the document project, regardless of what tool you use. That person may involve other knowledgeable people to fill in the details, but one person should be responsible for organizing the information.

If you can't use a tool to generate an ERD by reverse engineering, you'll have to design one by hand using TOAD or VISIO.

Any ERD with hundreds of objects is probably useless as a guide for developers, because it'll be unreadable with so many boxes and lines. In a database with so many objects, it's likely that there are "sub-systems" of a few dozen tables and views each. So you should make custom diagrams of these sub-systems, instead of expecting a tool to do it for you.

You can also design a pseudo-ERD, where groups of tables are represented by a single object in one diagram, and that group is expanded in another diagram.

A single ERD or set of ERD's are not sufficient to document a system of this complexity, any more than a class diagram would be adequate to document an OO system. You'll have to write a document, using the ERD's as illustrations. You need text descriptions of the meaning and use of each table, each column, and the relationships between tables (especially where such relationships are implicit instead of represented by referential integrity constraints).

All of this is a lot of work, but it will be worth it. If there's a clear and up-to-date place where the schema is documented, the whole team will benefit from it.

Bill Karwin
Bill - have you ever used Visiomodeler (Object Role Modeling), et. al.?
le dorfier
No, I've never used that. But it now seems to be old and unsupported.
Bill Karwin
It is; but there's no plug-compatible replacement for it (yet - there's at least one SourceForge project in the early stages for a VS plug-in, interestingly. I'm the type that by nature would ignore my own evangelism for it, but it is in fact a big step beyond ERD.
le dorfier
I had used Object Role Modeling. It has nothing to do with documenting already existing or newly created databases
vgv8
+2  A: 

This answer extends Kieveli's above, which I upvoted. If your version of EA supports Object Role Modeling (conceptual design, vs. logical design = ERD), reverse engineer to that and then fill out the model with the expressive richness it gives you.

The cheap and lighter-weight option is to download Visiomodeler for free from MS, and do the same with that.

The ORM (call it ORMDB) is the only tool I've ever found that supports and encourages database design conversations with non-IS stakeholders about BL objects and relationships.

Reality check - on the way to generating your DDL, it passes through a full-stop ERD phase where you can satisfy your questions about whether it does anything screwy. It doesn't. It will probably show you weaknesses in the ERD you designed yourself.

ORMDB is a classic case of the principle that the more conceptual the tool, the smaller the market. Girls just want to have fun, and programmers just want to code.

le dorfier
ORM or Object Role Modeling is explained in detail by Terry Halpin in http://tinyurl.com/8h296m
Ruben
What has ObjectRoleModeling to do with documenting existing RDBMS databases?
vgv8
One option is to use the reverse-engineering feature to extract your schema and load it - it works fine in my experience.
le dorfier
+5  A: 

In my experience, ER (or UML) diagrams aren't the most useful artifact - with a large number of tables, diagrams (especially reverse engineered ones) are often a big convoluted mess that nobody learns anything from.

For my money, some good human-readable documentation (perhaps supplemented with diagrams of smaller portions of the system) will give you the most mileage. This will include, for each table:

  • Descriptions of what the table means and how it's functionally used (in the UI, etc.)
  • Descriptions of what each attribute means, if it isn't obvious
  • Explanations of the relationships (foreign keys) from this table to others, and vice-versa
  • Explanations of additional constraints and / or triggers
  • Additional explanation of major views & procs that touch the table, if they're not well documented already

With all of the above, don't document for the sake of documenting - documentation that restates the obvious just gets in people's way. Instead, focus on the stuff that confused you at first, and spend a few minutes writing really clear, concise explanations. That'll help you think it through, and it'll massively help other developers who run into these tables for the first time.

As others have mentioned, there are a wide variety of tools to help you manage this, like Enterprise Architect, Red Gate SQL Doc, and the built-in tools from various vendors. But while tool support is helpful (and even critical, in bigger databases), doing the hard work of understanding and explaining the conceptual model of the database is the real win. From that perspective, you can even do it in a text file (though doing it in Wiki form would allow several people to collaborate on adding to that documentation incrementally - so, every time someone figures out something, they can add it to the growing body of documentation instantly).

Ian Varley
I agree with the human-readable docs, assuming there's somebody capable of writing it; my experience has been the required knowledge has left the company, which makes the case for documenation all the more apparent.
SqlACID
A: 

Well, a picture tells a thousand words so I would recommend creating ER diagrams where you can view the relationship between tables at a glance, something that is hard to do with a text-only description.

You don't have to do the whole database in one diagram, break it up into sections. We use Visual Paradigm at work but EA is a good alternative as is ERWIN, and no doubt there are lots of others that are just as good.

If you have the patience, then using html to document the tables and columns makes your documentation easier to access.

James Piggot