views:

1421

answers:

3

I need to do something like:

SELECT value_column1 
FROM table1 
WHERE datetime_column1 >= '2009-01-01 00:00:00' 
ORDER BY datetime_column1;

Except in addition to value_column1, I also need to retrieve a moving average of the previous 20 values of value_column1.

Standard SQL is preferred, but I will use MySQL extensions if necessary.

+3  A: 

This is just off the top of my head, and I'm on the way out the door, so it's untested. I also can't imagine that it would perform very well on any kind of large data set. I did confirm that it at least runs without an error though. :)

SELECT
     value_column1,
     (
     SELECT
          AVG(value_column1) AS moving_average
     FROM
          Table1 T2
     WHERE
          (
               SELECT
                    COUNT(*)
               FROM
                    Table1 T3
               WHERE
                    date_column1 BETWEEN T2.date_column1 AND T1.date_column1
          ) BETWEEN 1 AND 20
     )
FROM
     Table1 T1
Tom H.
A: 

When I had a similar problem, I ended up using temp tables for a variety of reasons, but it made this a lot easier! What I did looks very similar to what you're doing, as far as the schema goes.

Make the schema something like ID identity, start_date, end_date, value. When you select, do a subselect avg of the previous 20 based on the identity ID.

Only do this if you find yourself already using temp tables for other reasons though (I hit the same rows over and over for different metrics, so it was helpful to have the small dataset).

overslacked
I'm not sure where the temporary tables come in, I can use your solution without them. Although, it has the issue of reliance on the identity column being contiguous.
Travis Beale
The identity column being contiguous is kind of the whole point of the temp table..... In my case, I have years and years worth of data, but each month of data is processed on its own. I extract data to temp tables and perform many metrics on them. Using temp tables (or table-valued functions) made many aspects of the processing easier, in my case.
overslacked
+1  A: 

Tom H's approach will work. You can simplify it like this if you have an identity column:

SELECT T1.id, T1.value_column1, avg(T2.value_column1)
FROM table1 T1
INNER JOIN table1 T2 ON T2.Id BETWEEN T1.Id-19 AND T1.Id
Andomar
I don't know about MySQL, but in MS SQL Server that will not work. IDENTITY columns are not guaranteed to be sequential or contiguous.
Tom H.
They would be, if you don't use SET IDENTITY_INSERT ON, or delete prices? In this case, you could move the data to a temporary table with an identity column ordered by date.
Andomar
I agree with Tom. An IDENTITY (or in MySQL parlance, an auto_increment primary key) may not be sequential or contiguous. What if you delete some rows from the middle of the table? You would have gaps in the key.
Travis Beale