I have this MySQL query:

SELECT DAYOFYEAR(`date`)  AS d, COUNT(*) 
FROM  `orders` 
WHERE  `hasPaid` > 0
GROUP  BY d
ORDER  BY d

Which returns something like this:

d  | COUNT(*) |
20 |  5       |
21 |  7       |
22 | 12       |
23 |  4       |

What I'd really like is another column on the end to show the running total:

d  | COUNT(*) | ??? |
20 |  5       |   5 |
21 |  7       |  12 |
22 | 12       |  24 |
23 |  4       |  28 |

Is this possible?

+5  A: 

I would say that this is impossible, since every resulting row should be independent of the others. Use a programming language to compute these values.
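As a minimal sketch of what summing in the calling code could look like, assuming the client is Python and the grouped rows have already been fetched as (d, count) tuples. The sample rows are copied from the question; the helper name `with_running_total` is made up for illustration.

```python
from itertools import accumulate

# Rows as the grouped query would return them: (day_of_year, count).
# Values copied from the question's example output.
rows = [(20, 5), (21, 7), (22, 12), (23, 4)]


def with_running_total(rows):
    """Return (d, count, running_total) for each (d, count) row, in order."""
    totals = accumulate(count for _, count in rows)
    return [(d, count, total) for (d, count), total in zip(rows, totals)]


for d, count, total in with_running_total(rows):
    print(f"{d:>3} | {count:>3} | {total:>3}")
```

This is a single pass over the result set, so the cost grows linearly with the number of rows returned.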

Sergej Andrejev
Given the nature of relational math, and the fact that you're using group by, even if mysql has some hack to make this possible, it would be less convoluted to just do it in a programming language as Sergej suggests.
Jarret Hardie
I would disagree. Splitting the processing tasks between the database and the application layer is problematic from a reuse and maintenance perspective. If you want to use this data in different places, maybe on a report and on a screen, you'd have to duplicate the running totals logic.
cdonner
+1 you're right: this would be easier and better overall in the programming logic - I was trying to see if there was some magic awesome function to do it.
nickf
When you have a considerable amount of data, you have to compromise some purity. Also, in this case it really doesn't look like true "logic" to me. It could be seen as just a "visual aid"; there is no real business logic in accumulating values.
Sam
Agree with Sam. A report and a screen are both view code. The re-usable "logic" should be encapsulated in the view layer... application design notwithstanding.
Jarret Hardie
+2  A: 
SELECT
   DAYOFYEAR(O.`date`) AS d,
   COUNT(*),
   (SELECT COUNT(*) FROM `orders`
       WHERE DAYOFYEAR(`date`) <= d AND `hasPaid` > 0) AS running_total
FROM
  `orders` AS O
WHERE
  O.`hasPaid` > 0
GROUP BY d
ORDER BY d

This will require some syntactical tuning (I don't have MySQL to test it), but it shows you the idea. The subquery goes back and counts, from scratch, everything the outer query has already included up to that day, and it has to do that for every row.

Take a look at this question for how to use joins to accomplish the same.

To address concerns about performance degradation with growing data: since there are at most 366 days in a year, and I assume that you are not running this query against multiple years, the subquery will be evaluated at most 366 times. With proper indices on the date and the hasPaid flag, you'll be OK.
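For anyone who wants to try the idea without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It is a variant, not the exact query above: SQLite has no DAYOFYEAR, so strftime('%j', ...) stands in for it, and the grouped counts are moved into a derived table so the subquery can refer to t.d. The table contents are made up to match the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (date TEXT, hasPaid INTEGER)")

# Made-up sample data: paid orders per day chosen to match the question's
# table, plus one unpaid order that the query must ignore.
paid_per_day = {"2009-01-20": 5, "2009-01-21": 7, "2009-01-22": 12, "2009-01-23": 4}
sample = [(day, 1) for day, n in paid_per_day.items() for _ in range(n)]
sample.append(("2009-01-21", 0))
conn.executemany("INSERT INTO orders VALUES (?, ?)", sample)

# The correlated subquery re-counts every paid order up to and including
# day t.d, once per output row.
QUERY = """
SELECT t.d, t.c,
       (SELECT COUNT(*) FROM orders AS i
         WHERE i.hasPaid > 0
           AND CAST(strftime('%j', i.date) AS INTEGER) <= t.d) AS running_total
FROM (SELECT CAST(strftime('%j', date) AS INTEGER) AS d, COUNT(*) AS c
        FROM orders
       WHERE hasPaid > 0
       GROUP BY d) AS t
ORDER BY t.d
"""

result = conn.execute(QUERY).fetchall()
for d, c, running_total in result:
    print(d, c, running_total)
```

Note that the subquery runs once per output row, which is exactly the cost discussed in this thread.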

cdonner
thanks - this works perfectly as is.
nickf
Be aware that this will be extremely slow on large, medium, and even some small databases, because it runs one additional query for every row in the result.
Sergej Andrejev
Agree. I +1'd this answer because it is clever, and we've all used solutions like this when needed, but we are also all aware there is a cost. Depends on where you need the running count. For the business logic? Then maybe do this in the DB. For the view? Do it in code.
Jarret Hardie
+1  A: 

Unless you have no other option but to do this in SQL, I'd sum the results in the programming language that is making the query. A nested query like this will become very slow as the table grows.

Sam
A: 

You can hack this with a cross join or some self joins, but it will get slow with any large data set, so it is probably best done in a post-query step: either a cursor or client code.

TFD
A: 

This is one of the few places where cursors are faster than set-based queries. If performance is critical, I would either:

  • Do this outside of MySQL, or
  • Use MySQL 5 cursors
Sam Saffron
+3  A: 
Autosoft
that works brilliantly! Looking at the `EXPLAIN` on this shows it to be much more efficient than the previously accepted answer
nickf