views:

123

answers:

2

I am designing a new laboratory database. I want all the raw results (for all tests) in one table (RESULTS). However, the numerical values vary greatly in quantity and precision. For example, some results are simply a whole number count, while other results come from highly sensitive equipment.

Which case below is ideal (and why)?

Option 1: Store each result in a raw_result column as a NUMBER with the highest precision allowed.

OR

Option 2: Create the RESULTS table and several child tables each representing results of a similar data type and a raw_result column as a NUMBER with a specific precision for that type. (ie: integer from 1 to 1000, decimal number to 6 decimal places, etc)

Also, in a related post General Oracle Data Collection Storage, I asked about the ideal way to store raw signals in this database.

+3  A: 

Use option 1. A NUMBER is a NUMBER internally to Oracle, regardless of precision and scale.

Option 2 appears to be a horrible maintenance nightmare from the get-go, and what do you do if you decide to change the precision of one of your values?

EDIT: to illustrate, consider:

    SQL> CREATE TABLE testing (n1 NUMBER(38,12), n2 NUMBER(5));

    Table created
    SQL> INSERT INTO testing VALUES (1, 1);

    1 row inserted
    SQL> INSERT INTO testing VALUES (990, 9900);

    1 row inserted
    SQL> INSERT INTO testing VALUES (1999999.999999, 1);

    1 row inserted
    SQL> SELECT n1, DUMP(n1) n1d, n2, DUMP(n2) n2d FROM testing;

                   N1 N1D                                N2 N2D
--------------------- ------------------------------ ------ --------------------
       1.000000000000 Typ=2 Len=2: 193,2                  1 Typ=2 Len=2: 193,2
     990.000000000000 Typ=2 Len=3: 194,10,91           9900 Typ=2 Len=2: 194,100
 1999999.999999000000 Typ=2 Len=8: 196,2,100,100,100      1 Typ=2 Len=2: 193,2
                      ,100,100,100                          
    SQL>

No difference in storage. The length stored is data dependent, not the precision/scale of the column definition.

This link gives a pretty good explanation of how the NUMBER type is stored. It's not as simple as comparing the values to get an idea of how much storage it will take for a given number. For example, it takes 3 bytes to store the number 990, but only 2 bytes to store 9900.

DCookie
Correct, with one note, you can only have a certain number of bytes, so different precision and scales may be needed for different numbers.
Tom Hubbard
I'm not sure what your point is here. If you're talking about prohibiting too much precision, that's one thing - Oracle can enforce that with NUMBER(p,s) definitions. But it all takes the same amount of storage.
DCookie
My fault, I was thinking if you set a scale of say 5 it would always take the 5 bytes even if you gave it an integer. Now that I think about it, I don't think that is the case.
Tom Hubbard
So if I store the number 150 into a NUMBER(38,127) column, Oracle is smart enough to know it's a small integer with no decimal places and uses only a single byte to store it?
Steven
See my edit, and try it out yourself to see.
DCookie
I'm amazed! I always suspected number data types were statically sized as in most programming languages.
Steven
It's counterintuitive in sense that the size of a NUMBER is more closely aligned with the number of significant digits stored than the magnitude of the number. Think of a NUMBER as being stored in scientific notation and it helps.
DCookie
I know that a 'single' or 'double' in C/C++ is scientific notation, but completely base-2 (not base-10). Surely, Oracle had a good reason for keeping numbers more in decimal notation.
Steven
I would have to guess that it has to do with avoiding precision issues that come with binary representations of floats.
DCookie
+1  A: 

I would say it depends how you're going to use the data later on. Option 1 will be easier to set up initially, but if you're then going to always be trying to separate the data for processing/display etc later on, then Option 2 might be easier in the long run.

DCookie has done a good job in explaining that Option 2 has no technical benefits for storage. It therefore boils down to whatever processing you're going to be doing with it.

I want all the raw results (for all tests) in one table (RESULTS).

If your feeling is that it makes most sense to put it all in one table, then do that.

P72endragon