Database design help with varying schemas

views:

159

answers:

+1 Q:

Database design help with varying schemas

I work for a billing service that uses some complicated mainframe-based billing software for it's core services. We have all kinds of codes we set up that are used for tracking things: payment codes, provider codes, write-off codes, etc... Each type of code has a completely different set of data items that control what the code does and how it behaves.

I am tasked with building a new system for tracking changes made to these codes. We want to know who requested what code, who/when it was reviewed, approved, and implemented, and what the exact setup looked like for that code. The current process only tracks two of the different types of code. This project will add immediate support for a third, with the goal of also making it easy to add additional code types into the same process at a later date. My design conundrum is that each code type has a different set of data that needs to be configured with it, of varying complexity. So I have a few choices available:

I could give each code type it's own table(s) and build them independently. Considering we only have three codes I'm concerned about at the moment, this would be simplest. However, this concept has already failed or I wouldn't be building a new system in the first place. It's also weak in that the code involved in writing generic source code at the presentation level to display request data for any code type (even those not yet implemented) is not trivial.
Build a db schema capable of storing the data points associated with each code type: not only values, but what type they are and how they should be displayed (dropdown list from an enum of some kind). I have a decent db schema for this started, but it just feels wrong: overly complicated to query and maintain, and it ultimately requires a custom query to view full data in nice tabular for for each code type anyway.
Storing the data points for each code request as xml. This greatly simplifies the database design and will hopefully make it easier to build the interface: just set up a schema for each code type. Then have code that validates requests to their schema, transforms a schema into display widgets and maps an actual request item onto the display. What this item lacks is how to handle changes to the schema.

My questions are: how would you do it? Am I missing any big design options? Any other pros/cons to those choices?

My current inclination is to go with the xml option. Given the schema updates are expected but extremely infrequent (probably less than one per code type per 18 months), should I just build it to assume the schema never changes, but so that I can easily add support for a changing schema later? What would that look like in SQL Server 2000 (we're moving to SQL Server 2005, but that won't be ready until after this project is supposed to be completed)?

[Update]:
One reason I'm thinking xml is that some of the data will be complex: nested/conditional data, enumerated drop down lists, etc. But I really don't need to query any of it. So I was thinking it would be easier to define this data in xml schemas.

However, le dorfier's point about introducing a whole new technology hit very close to home. We currently use very little xml anywhere. That's slowly changing, but at the moment this would look a little out of place.

I'm also not entirely sure how to build an input form from a schema, and then merge a record that matches that schema into the form in an elegant way. It will be very common to only store a partially-completed record and so I don't want to build the form from the record itself. That's a topic for a different question, though.

Based on all the comments so far Xml is still the leading candidate. Separate tables may be as good or better, but I have the feeling that my manager would see that as not different or generic enough compared to what we're currently doing.

Do a web search on "generalized specialized relational modeling". You'll find articles on how to set up tables that store the attributes of each kind of code, and the attributes common to all codes.

If you’re interested in object modeling, just search on “generalized specialized object modeling”.

Walter Mitty 2009-04-13 16:48:16

I did the search, found a whole lot of nothing :(

Joel Coehoorn 2009-04-13 18:39:24

I did the search, and found this very question on StackOverflow!

Bill Karwin 2009-04-13 20:10:30

Bill, yeah. I've been giving this same answer to about a dozen questions over the last few months. I need to learn how to point people to a previously asked question in Stackoverflow, the way some other people do. It can't be that hard.

Walter Mitty 2009-04-15 13:17:56

+2 A:

Why do you say "this concept has already failed or I wouldn't be building a new system in the first place"? Is it because you suspect there must be a scheme for handling them in common?

Else I'd say to continue the existing philosophy, and establish additional tables. At least it would be sharing an existing pattern and maintaining some consistency in that respect.

le dorfier 2009-04-13 16:54:37

I should clarify: it fails in that I can't easily write generic code at the presentation level to show the data for any type of request.

Joel Coehoorn 2009-04-13 16:58:10

I think more projects fail trying to write the one great generic function...

DJ 2009-04-13 17:00:57

Right - smells like overgneralization a bit. Is this homegrown softare or a commercial package? If the second, even less reason to mess with the architecture. Also, the XML option would I suspect be introducing yet another technology into this particular codebase.

le dorfier 2009-04-13 17:08:27

Okay, I'm convinced I need to look again at what we have. Right now this is home-grown, with both existing codes tracked in completely different ways/places. Management wants to consolidate this in a way that will be very easy to expand to track other kinds of code as well.

Joel Coehoorn 2009-04-13 17:16:23

Then maybe it's a good match for a framework that provides simple scaffolding - RoR and its derivates, ASP.NET MVC, etc. They'll give you autoCRUD with little effort. Anything more and you're putting more dev time into it than it will see in user time for maybe years.

le dorfier 2009-04-13 18:04:52

+4 A:

There is no simple, generic solution to a complex, meticulous problem. You can't have both simple storage and simple app logic at the same time. Either the database structure must be complex, or else your app must be complex as it interprets the data.

I outline five solution to this general problem in "product table, many kind of product, each product have many parameters."

For your situation, I would lean toward Concrete Table Inheritance or Serialized LOB (the XML solution).

The reason that XML might be a good solution is that:

You don't need to use SQL to pick out individual fields; you're always going to display the whole form.
Your XML can annotate fields for data type, user interface control, etc.

But of course you need to add code to parse and validate the XML. You should use an XML schema to help with this. In which case you're just replacing one technology for enforcing data organization (RDBMS) with another (XML schema).

You could also use an RDF solution instead of an RDBMS. In RDF, metadata is queriable and extensible, and you can model entities with "facts" about them. For example:

Payment code XYZ contains attribute TradeCredit (Net-30, Net-60, etc.)
Attribute TradeCredit is of type CalendarInterval
Type CalendarInterval is displayed as a drop-down
.. and so on

Re your comments: Yeah, I am wary of any solution that uses XML. To paraphrase Jamie Zawinski:

Some people, when confronted with a problem, think "I know, I'll use XML." Now they have two problems.

Another solution would be to invent a little Domain-Specific Language to describe your forms. Use that to generate the user-interface. Then use the database only to store the values for form data instances.

Bill Karwin 2009-04-13 18:09:49

One reason I'm thinking xml is that some of the data will be complex: nested/conditional data, enumerated drop down lists, etc. But since I don't need to query any of it I was thinking it would be easier to define in xml schemas.

Joel Coehoorn 2009-04-13 18:26:44

However, le dorfier's point about introducing a whole new technology hit very close to home. While it's slowly changing, we currently use very little xml anywhere.

Joel Coehoorn 2009-04-13 18:29:09

I'm also not entirely sure how to build an input form from a schema, and then merge a record that matches that schema into the form. It will be very common to only store a partially completed record and so I don't want to build the form from the record itself.

Joel Coehoorn 2009-04-13 18:30:27

Hmm: I should edit my question and move these comments there...

Joel Coehoorn 2009-04-13 18:30:59

I don't think an RDF is an option.

Joel Coehoorn 2009-04-13 18:35:41

ansaurus

tags:

views:

answers:

Database design help with varying schemas

related questions