views:

176

answers:

3

I have a data structure something like this:

typedef struct tagSUB_DATA
{
    double measuredValue;
    double standardDeviation;
    double calculatedValue;
    double weightedError;

} SUB_DATA;

typedef struct tagALL_THE_DATA
{
    int aNumber;
    double aDouble;
    SUB_DATA measurements1;
    SUB_DATA measurements2;

} ALL_THE_DATA;

which I need to store in a relational database.

My query relates to the two fields, measurements1 and measurements2. Obviously, they are the same type, so my first thought was "let's create a SUB_DATA table and create a Foreign Key link between them".

Table: ALL_THE_DATA
Field: ID (int, Primary Key)
Field: aNumber (int)
Field: aDouble (double)
Field: measurements1 (int, Foreign Key referencing SUB_DATA)
Field: measurements2 (int, Foreign Key referincing SUB_DATA)

Table: SUB_DATA
Field: ID (int, Primary Key)
Field: measuredValue (double)
Field: standardDeviation (double)
Field: calculatedValue (double)
Field: weightedError (double)

However, the actual context of the data is such that measurements1 and measurements2 are measurements of different things (let's say, apples and oranges rockets), which both happen to need a measured value, standard deviation, etc. Is it still appropriate, then, to store the data for measured apples and measured rockets in the same table, even though they use the same data, or would it be more prudent to design it so that rockets and apples have their own (identically-designed) tables?

Table: ALL_THE_DATA
Field: ID (int, Primary Key)
Field: aNumber (int)
Field: aDouble (double)
Field: appleMeasurements (int, Foreign Key referencing APPLE_MEASUREMENTS)
Field: rocketMeasurements (int, Foreign Key referencing ROCKET_MEASUREMENTS)

Table: APPLE_MEASUREMENTS
Field: ID (int, Primary Key)
Field: measuredValue (double)
Field: standardDeviation (double)
Field: calculatedValue (double)
Field: weightedError (double)

Table: ROCKET_MEASUREMENTS
Field: ID (int, Primary Key)
Field: measuredValue (double)
Field: standardDeviation (double)
Field: calculatedValue (double)
Field: weightedError (double)

Which of these two solutions is best, do you think? The first seems less redundant, but may have greater potential for there being inconsistent data. Or perhaps is there a better way to solve this problem than I've thought of?

Cheers!

(Please excuse my Apples/Rockets pseudodata - I can't really post actual code here)

Extra Information:
In this case, we can be certain that rockets and apples will not change their fields later, so I'm not too worried about the case of "what if the fields in either rocket or apple change later".

A: 

Why not create the two tables, but have them inherit from a common measurement table?

CookieOfFortune
How do you inherit tables in an RDBMS?
Smashery
I guess it depends on the database you're using, but POSTGRES supports INHERIT.
CookieOfFortune
You can simulate inheritance, it's not that difficult, but it's clunky, which is why you use an ORM to do that for you.
Uri
+1  A: 

If your SUB_DATAs really are Apples and Rockets (and always Apples and Rockets), then both of them being instances of the same data structure may be a design problem and may make your code riskier to use. You may want to use subclasses (even if the structure isn't different) just so you differentiate the type.

If you mix both into the same table, you run the risk that wanting to change the structure of one of them, but not the other, just as you would if you used the same class. If that is a possibility for the future, then obviously don't share them in the same table. Or, if they share something in common and may have differences in the future, you could use multiple tables.

That may sound like a lot of work, but a decent object-relational mapping (e.g., Hibernate) can actually handle class hierarchies for you fairly nicely and split what needs splitting.

But whatever you do, don't start your database before you're 100% sure what is going to stay in common and what may change.

Uri
Hmm. Wrong advice. Start the database now, and change it till you like it. And once you learn something new, change it again.
Stephan Eggermont
I disagree in this case. If you have no idea whether or not two critical entities are the same entity or a distinct entity, it's better to think more carefully first than to refactor later. I don't consider figuring out identities and types to be premature optimizations.
Uri
In this case, we can be certain that rockets and apples will not change their fields later.
Smashery
A: 

Do you always have exactly two measurements? If so, you do not need to break the table at all (just have one table with two sets of columns for the two measurements).

Table: ALL_THE_DATA
Field: ID (int, Primary Key)
Field: aNumber (int)
Field: aDouble (double)
Field: measuredValue1 (double)
Field: standardDeviation1 (double)
Field: calculatedValue1 (double)
Field: weightedError1 (double)
Field: measuredValue2 (double)
Field: standardDeviation2 (double)
Field: calculatedValue2 (double)
Field: weightedError2 (double)

For a non-optional 1:1 relationship between structures, embedding the child structure into the parent table is acceptable designwise and efficient.

This makes it more difficult to add a third measurement type, though.

Thilo