views:

76

answers:

3

Imagine this case, but with a lot more component buckets and a lot more intermediates and outputs. Many of the intermediates are calculated at the detail level, but a few things are calculated at the aggregate level:

DECLARE @Profitability AS TABLE
    (
     Cust INT NOT NULL
    ,Category VARCHAR(10) NOT NULL
    ,Income DECIMAL(10, 2) NOT NULL
    ,Expense DECIMAL(10, 2) NOT NULL
    ,Liability DECIMAL(10, 2) NOT NULL
    ,AllocatedCapital DECIMAL(10, 2) NOT NULL
    ) ;

INSERT  INTO @Profitability
VALUES  ( 1, 'Software', 100, 50, 0, 0 ) ; 
INSERT  INTO @Profitability
VALUES  ( 2, 'Software', 100, 20, 0, 0 ) ; 
INSERT  INTO @Profitability
VALUES  ( 3, 'Software', 100, 60, 0, 0 ) ; 
INSERT  INTO @Profitability
VALUES  ( 4, 'Software', 500, 400, 0, 0 ) ; 
INSERT  INTO @Profitability
VALUES  (
         5
        ,'Hardware'
        ,1000
        ,550
        ,0
        ,0 
        ) ; 
INSERT  INTO @Profitability
VALUES  (
         6
        ,'Hardware'
        ,1000
        ,250
        ,500
        ,200 
        ) ; 
INSERT  INTO @Profitability
VALUES  (
         7
        ,'Hardware'
        ,1000
        ,700
        ,500
        ,600 
        ) ; 
INSERT  INTO @Profitability
VALUES  (
         8
        ,'Hardware'
        ,5000
        ,4500
        ,2500
        ,800 
        ) ; 

WITH    ProfitView
          AS ( SELECT   Cust
                       ,Category
                       ,Income
                       ,Expense
                       ,Profit = Income - Expense
                       ,NetProfit = Income - Expense
                        - CASE WHEN Liability - AllocatedCapital > 0
                               THEN Liability - AllocatedCapital
                               ELSE 0
                          END
               FROM     @Profitability
             )
    SELECT  Cust
           ,Category
           ,Income
           ,Expense
           ,Profit
           ,NetProfit
           ,Margin = Profit / Income
           ,NetMargin = NetProfit / Income
    FROM    ProfitView ; -- NOTE I've left off the AFTER grouping formulas on this one.

WITH    ProfitView
          AS ( SELECT   Cust
                       ,Category
                       ,Income
                       ,Expense
                       ,Profit = Income - Expense
                       ,NetProfit = Income - Expense
                        - CASE WHEN Liability - AllocatedCapital > 0
                               THEN Liability - AllocatedCapital
                               ELSE 0
                          END
               FROM     @Profitability
             ),
        GROUP1
          AS ( SELECT   Category
                       ,SUM(Profit) AS Profit
                       ,SUM(NetProfit) AS NetProfit
                       ,SUM(Income) AS Income
                       ,SUM(Profit) / SUM(Income) AS Margin
                       ,SUM(NetProfit) / SUM(Income) AS NetMargin
               FROM     ProfitView
               GROUP BY Category
             ),
        GROUP2
          AS ( SELECT   GROUP1.*
                       ,NetProfit - Profit AS Exposure
               FROM     GROUP1
             )
    SELECT  *
           ,Exposure / Income AS ExposureRatio
    FROM    GROUP2 ;

WITH    ProfitView
          AS ( SELECT   Cust
                       ,Category
                       ,Income
                       ,Expense
                       ,Profit = Income - Expense
                       ,NetProfit = Income - Expense
                        - CASE WHEN Liability - AllocatedCapital > 0
                               THEN Liability - AllocatedCapital
                               ELSE 0
                          END
               FROM     @Profitability
             ),
        GROUP1
          AS ( SELECT   SUM(Profit) AS Profit
                       ,SUM(NetProfit) AS NetProfit
                       ,SUM(Income) AS Income
                       ,SUM(Profit) / SUM(Income) AS Margin
                       ,SUM(NetProfit) / SUM(Income) AS NetMargin
               FROM     ProfitView
             ),
        GROUP2
          AS ( SELECT   GROUP1.*
                       ,NetProfit - Profit AS Exposure
               FROM     GROUP1
             )
    SELECT  *
           ,Exposure / Income AS ExposureRatio
    FROM    GROUP2 ;

Notice how the same formulae have to be used at the different aggregation levels. This results in code duplication.

I have thought of using UDFs (either scalar or table valued with an OUTER APPLY, since many of the final results may share intermediates which have to be calculated at the aggregate level), but in my experience the scalar and multi-statement table-valued UDFs perform very poorly.

Also thought about using more dynamic SQL and applying the formulas by name, basically.

Any other tricks, techniques or tactics to keeping these kinds of formulae which need to be applied at different levels in sync and/or organized?

+1  A: 

Notice how the same formulae have to be used at the different aggregation levels. This results in code duplication.

If you functions were more complex, you could benefit from creating a custom CLR aggregate.

However, for such a simple function, a built-in SUM is the best.

Unlike PostgreSQL, SQL Server does not allow creating custom aggregates in a built-in language.

Quassnoi
+1  A: 

For your simplified example, I would refactor the calculations by returning the raw data (SUM(Income) and SUM(Expense)) separately in each resultset and calculate Profit and Margin in the business layer.

If that's not possible in the real case, could you make your simple example a little bit more complex so I can see what you're getting at?

I have worked on a project recently that needed complex business analysis calculations done inside queries. It turned out to be not possible to do them outside the data queries, so we eventually resorted to converting everything to dynamic SQL. This allowed us to construct macro functions to build the various parts of each query. By doing this we sacrificed readability but gained maintainability. We did not sacrifice testability because we wrote unit tests that exercised every possible code path through the macro functions, and logged each query as it was generated.

Christian Hayter
Thanks, it's good to know the things I've thought of doing are already being used as valid approaches.
Cade Roux
+1  A: 

You could separate part of the complexity in a view:

create view dbo.vw_Profit
as
SELECT  
    Cust
,   Income,
,   Expense
,   Income - Expense as Profit
FROM dbo.Profitability

That allows for slightly simpler queries:

SELECT cust, SUM(profit), SUM(Income) / SUM(Expense)
FROM dbo.vw_Profit
GROUP BY cust

The example query is hardly complex enough to warrant simplification with views. But views can be a great help with really complex queries.

Andomar
I'm sorry, I should have made clear that SUM(margin) has no meaning. If you run the code, you'll see that margin cannot simply be summed. Thus the need for it to be applied AFTER aggregation (While SUM(Profit) is effectively fine).
Cade Roux
@Cade Roux: Right! answer edited, though the view solves even less complexity now :)
Andomar
@Andomar - Right now I have a view made up of a dozen or so CTEs at the detail level, producing dozens of intermediate column results, but every different query that has to aggregate it has 3 CTEs because of the layered conditional logic in the aggregate formulae.
Cade Roux
@Cade Roux: We try to avoid CTE's because of the added complexity, but we have layers of views named `vw_0207_Profit` for the 7th subcalculation of the 2nd main calculation.
Andomar
@Andomar I find the CTEs are fairly easy to maintain - fewer database objects and you can add/drop columns along the way without a lot of pain.
Cade Roux