views:

60

answers:

2

What is the "best" (correct, standard, etc) way to maintain a log of data acquired at a set rate (every minute, every 5 seconds, every 10ms, etc) in an Oracle database?

It seems inefficient to store the 7 byte DATE value for every datapoint (especially as the frequency increases). However, packing the data into some type of raw format makes statistics and other calculations on the data more difficult.

I guess this question is general enough to apply to any RDBMS, but in this case I'm using Oracle.

A: 

Since each row of data collected has to stand on its own, you have to use the space to record the complete DATE value - unless you choose to use something like a Unix timestamp (integer seconds since 1970-01-01 00:00:00Z, or some other suitable epoch or reference point). That fits into 4 bytes to give a 68-year period on either side of the epoch (assuming signed 32-bit integers). It may not be quite as convenient, but it is relatively compact.
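As a sketch of what this suggests (the table name `sensor_log` and column `log_ts` are invented for illustration, and time-zone handling is ignored - SYSDATE is session time, the Unix epoch is UTC), the conversions in Oracle could look like:

```sql
-- Hypothetical: store epoch seconds in a NUMBER column instead of a DATE.
-- DATE -> epoch seconds (Oracle date subtraction yields days, so scale up):
SELECT (SYSDATE - DATE '1970-01-01') * 86400 AS epoch_seconds
  FROM dual;

-- epoch seconds -> DATE, for reporting or ad-hoc queries:
SELECT DATE '1970-01-01' + log_ts / 86400 AS log_date
  FROM sensor_log;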

Jonathan Leffler
Why bother? It is a tiny amount of space savings. "not be quite as convenient" seems like an understatement to me. Using any of the SQL date functions now requires manual conversions, and requires special indexing strategies.
RussellH
I wouldn't bother either, but I'm not the person asking the question, who does seem to be concerned about space. Storing a shorter quantity saves disk space; even with terabytes of disk, if you've enough data coming in, 3 bytes overhead on a small enough row size can add up. If the logged information is big enough that the 3 byte saving is negligible, then yes, use the full DATE type.
Jonathan Leffler
The querying need not be intolerably difficult with appropriate functions in place. It depends on what Oracle provides anyway, and I'm not sufficiently au fait with the ins and outs of Oracle's date/time functions to give a detailed answer.
Jonathan Leffler
+2  A: 

How much does it cost for a terabyte of disk, and is compacting those 7 bytes really worth the effort? If you want to calculate stats and reports on the logs based on time, it's going to be very painful to unpack the date for use in SQL queries.

With Oracle, just log the data to a table - try not to log too much, or to have too many indexes on the log table. Make sure the log table is partitioned from day 1 into manageable sizes - that could be a partition per day, week or month, depending on how much data you are generating. Design your housekeeping policy from day 1 too.
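As a sketch only (table and column names are invented, and INTERVAL partitioning requires Oracle 11g or later - on older versions you would pre-create range partitions from a housekeeping job), a daily-partitioned log table might be declared as:

```sql
-- Hypothetical daily-partitioned log table using interval partitioning:
-- Oracle creates a new partition automatically for each day of data.
CREATE TABLE sensor_log (
  log_time  DATE   NOT NULL,
  reading   NUMBER NOT NULL
)
PARTITION BY RANGE (log_time)
INTERVAL (NUMTODSINTERVAL(1, 'DAY'))
(
  PARTITION p_start VALUES LESS THAN (DATE '2009-01-01')
);
```

Partitioning per day/week/month also makes the housekeeping policy cheap to implement: dropping or truncating an old partition is far faster than deleting rows.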

Once a partition has stopped receiving new data - i.e. the current 'period' has rolled over into a new partition - you could consider using ALTER TABLE ... MOVE PARTITION ... COMPRESS to compress that partition, keeping the data online in less space.
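Assuming the invented names above, compressing a partition once it has gone cold might look like this (note that MOVE rewrites the segment, so any local index partitions need rebuilding afterwards):

```sql
-- Compress a partition that is no longer receiving inserts.
ALTER TABLE sensor_log MOVE PARTITION p_start COMPRESS;

-- MOVE marks local index partitions UNUSABLE; rebuild them.
-- (sensor_log_ix is an assumed index name.)
ALTER INDEX sensor_log_ix REBUILD PARTITION p_start;
```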

There are a lot of options; you just need to think through your requirements to find the best solution. Depending on what you are doing, logging to a file could be an option too - but beware of accumulating thousands and thousands of files in a single directory, which can cause trouble of its own.

Stephen ODonnell