views:

1257

answers:

2

I have a cube in SSAS 2005 that calculates running sums over millions of records. The current implementation is using a PeriodsToDate sum function in MDX and it is horribly slow, taking 15 minutes or more to return results in some cases. There are a huge number of solutions to this problem so I am hoping someone else has already tested a few of these out and can save me the time of implementing and testing each one.

Is it preferrable to calculate running sums and sliding window type calculations in T-SQL using a cursor (please no cross join solutions from Celko, I know they look nice on paper but the cross join solutions are horribly slow as your data grows), using a script task in SSIS (or is there a better way to do this in integration services) or using MDX in Analysis Services?

Any other ideas?

A: 

The MDX I am currently using is very slow, here it is.


Sum
(
PeriodsToDate([Date of Transaction].[Date].[(All)], [Date of Transaction].[Date].CurrentMember), [Measures].[Amount - Response]
)
ShaneD
+2  A: 

Took me little while to figure this out where it would work with and without a hierarchy on the time dimension. Based on my data and my cube structure, here's what I put together:

with member [Measures].[RunningTotal] as 
'sum({[Deferred Revenue Date].[Standard Calendar].CurrentMember.Level.Item(0):[Deferred Revenue Date].[Standard Calendar].CurrentMember}, [Measures].[Deferred Revenue Amount])'
select
    {[Measures].[Deferred Revenue Amount]
    ,[Measures].[RunningTotal]
    } on columns,
    nonempty([Deferred Revenue Date].[Standard Calendar].[Date], [Measures].[Deferred Revenue Amount]) on rows
from
DW;

One trick is the nonempty() clause which will only return values in the calculated measure column if there are intersections with the dimension and measure. Without it, you'll get a bunch of NULLs in the measure column and a bunch of repeated values in the calculated measure column. This is actually a pretty handy query.

The calculated member will work regardless of the level you specify in the SELECT statement, be it date, month, or year. In my case, I tried date to come as close as I could to what you were trying to do. I wrote a query similar to yours for my cube and it ran in 43 seconds. This new one runs in about 2 seconds and we have quite a bit of data, so it should run much faster for you as well, assuming you have aggregations built, etc.

The problem with MDX is there's like eighteen ways to do something and only one or two will run well.

Good luck.

Mitch Schroeter
Thanks. This still doesn't tell me if this is where I should be doing the calculation for best performance, but does tell me how to improve my performance in the SSAS side of things. After further review I believe my initial design may be a bit off and am going to ask about design elsewhere.
ShaneD
I think this works just fine in MDX. This is the type of query MDX excels at. I don't think you will get very good results trying to do this relationally in the SQL database.
Mitch Schroeter