views:

266

answers:

7

In the medical domain, a patient has a ton of exams (HbA1C, Lipid, Renal, etc..) and they all basically follow the format of ExamX(ID, ID_Patient, Date, Value)

But you can go from many tables to one by using SuperExam(ID, ID_Patient, ExamType, Date, Value)

One note, value would not be discrete in the second table. Some exams have multiple values, so they would need to be delimited. Ex value= .9,105,108,.4

I've typically used the first format, but I have seen the second structure several times too. I guess with the second version, you would have a lot more flexibility with your data without having to change the db schema, but it seems like reporting would be a nightmare.

Which is the better design?

A: 

I'd say #2. you can convert it to #1 simply by adding a "where ExamType=X" condition, and it's way easier to make queries that span multiple exams (for example, "show me all exams for Patient Bob") that would otherwise require a bunch of unions (and you already mentioned the flexibility)

EDIT: oops I misread the question. Don't do things like comma-separating values. ("I don't have to create 2 tables, because JOINS are expensive" is not a valid reason.)

Jimmy
A: 

The second form is actually more flexible because you don't have to create a new table for a new type of exam; instead you just add another row to the Exams table.

That said, you still have the problem of multiple values from one exam. Rather than using delimited values, you can split your SuperExam table into two: one for linking a patient with an exam on a particular date, and the second for providing all the values for each event, one per row. This is called "normalization". For ad-hoc querying, you use JOIN clauses to put it back together.

staticsan
A: 

May I suggest somewhat of a #3... It takes #2 and expands it...

take your SuperExamTable and get ride of ExamType and Value. Create a new table called ExamDetail which has the following values: ExamDetailId, SuperExamId, ExamType, and Value

That will allow a patient to have an exam on a specific date that can have multiple exam types and values.

J.13.L
ok... well have fun parsing your comma seperated values.
J.13.L
+4  A: 

Note that packing a single field with multiple values (eg. a comma separated list) is a violation of the first normal form rule (specifically it is not free of repeating groups).

Any database design that isn't in 1st normal form is pretty questionable. Generally the rule of thumb is design to normal forms, optimize performance, and occassionally convinience, by denormalizing.

Have a common table Examinations (ExamId, ExamType, PatientId, OccurredOn) and then specific related tables RenalExamination (ExamId, Rate, Hue), LipidExamination (ExamId, LCount, YCount), etc, might be a better choice.

It really depends on what you're doing. If this is the core of a medical labratory application then much more design work is required. Certainly I'd aim for a normalized database and then "work backwards".

+3  A: 

Ouch. Another new from-scratch medical exam database design.

From experience, there's a lot more subtlety and complexity than you describe.

Example: Just blood pressure has two measurements (at least), systolic and diastolic.

Example: Many tests are commonly given simultaneously; and their meaning comes from the interrelationships among the measurements (along with context info and textual interpretation).

Please do some research and find an existing design you can at least start from.

le dorfier
Solid advice. Any specific URL's you'd care to pass on?
I would love an example of existing designs with subtlety, but what I have are existing designs without. If you have any recommendations of places I can look with good existing designs, I would appreciate it.
Aaron
A: 

Your intuitions are right. The "SuperExam table" is called an Entity-Attribute-Value database, which is often used for clinical records, where sparse data is tied to a single entity.

An E-A-V table is easy to search. The problem isn't finding rows, it's finding related rows.

Having different tables for different entities provides domain modeling, but they also provide a weak form of metadata. In E-A-V there are no such abstractions. (The Java analogy to E-A-V would be declaring that all functions' formal arguments were of type Object -- so you'd get no type-checking.)

We can easily look up the property keys, but nothing groups these property keys. There's no way to tie all blood tests together, or to distinguish a blood test from a weight.

One compromise would be to use EAV, and make examtype a foreign key to another table that provided more data about, and/or grouping of, examtypes.

Wikipedia has a very good article on E-A-V, but read it now -- it's mostly the work of one author, and is slated for "improvement".

tpdi
+1  A: 

In all likelihood you're going to have to take the EAV approach (as described in another answer) and create a data dictionary. You would then not only be storing the specific data for an exam, but you would also have a table structure for storing data that defines the data in an exam (exam templates, relationships between values, etc.). You're basically using the relational aspect of your RDBMS to create your own relationship system.

This sounds complex--and it is, to an extent, though not as much as it might seem--but it's worth it in the long run if you intend for this medical system to be scalable and maintainable.

Do not store multiple values in one column. Ever. For any reason.

Here's a quick example...

exam:
    exam id,
    date,
    patient id,
    ...other miscellaneous scalar data thats 1-1 with an exam

exam value:
    exam value id,
    exam id,
    value id,
    value

This is your BASIC structure for storing exam data. Then you have...

value:
    value id,
    description

group:
    group id,
    description

group value:
    group value id,
    group id,
    value id,
    sort order

Again, very basic and rudimentary, but it hopefully gives you a glimpse into what I'm talking about. You may want to take it a step further and define specific exam types that have particular groups, then assign an exam type to an exam, but I'll leave that up to you.

Adam Robinson
That was really helpful - Thanks!
Aaron
@Adam: I've put a model of this together, it's pretty cool. When looking around at other stuff though, I found this link: http://stackoverflow.com/questions/408779/linq-to-sql-order-by-value-in-related-table/408806#408806 Any idea why he says EAV is not a good thing in general?
Aaron
Whether or not it's a good thing generally comes down to the application. You have to balance flexibility toward change (advantage EAV) and ability to accommodate sparse data (advantage EAV) with ease of querying and the possibility of it growing out of control (disadvantage EAV).
Adam Robinson