views:

404

answers:

2

Are there any Linear Regression Function in SQL Server 2005/2008, similar to the the Linear Regression functions in Oracle ?

+3  A: 

To the best of my knowledge, there is none. Writing one is pretty straightforward, though. The following gives you the constant alpha and slope beta for y = alpha + beta * x + epsilon:

-- test data
WITH some_table(SomeID, x, y) AS
( SELECT 1, 1.0, 2.0
  UNION ALL
  SELECT 1, 2.0, 2
  UNION ALL
  SELECT 1, 3, 4),
-- linear regression query
/*WITH*/ average_estimate AS
(   SELECT SomeID
          ,avg(x) as xbar
          ,avg(y) as ybar
    FROM some_table
    GROUP BY SomeID
),
beta_estimate as
(   SELECT      pa.SomeID
               ,CASE COALESCE(SUM((x-xbar)*(x-xbar)),0)
                    WHEN 0 THEN NULL
                    ELSE SUM((x-xbar)*(y-ybar)) / SUM((x-xbar)*(x-xbar))
                END AS Beta
    FROM        some_table pd
    INNER JOIN  average_estimate pa
            ON  pa.SomeID = pd.SomeID
    GROUP BY    pa.SomeID
)
SELECT pb.SomeID
      ,ybar - xbar * pb.Beta AS Alpha
      ,pb.Beta
FROM beta_estimate pb
INNER JOIN average_estimate pa
        ON pa.SomeID = pb.SomeID
stephan
Thanks!! had to use this to solve my problem. Problem, in a broader perspective, was to get a trend line in SSRS (2005) report. This was the only way.
rao
@pavanrao: you are welcome. Added estimate for constant alpha to the query
stephan
A: 

This is an alternate method, including the equations behind it. This guy explains the derivation of the equations: http://knol.google.com/k/linus-nilsson/linear-regression-in-t-sql. His T-SQL suggestion uses cursors though. This site then has simpler T-SQL select statement: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=21948. Here's a prettified version that I used:

table
-----
X (numeric)
Y (numeric)

/* y = mx + b   
m = (nSxy - SxSy) / (nSxx - SxSx)
b = Ay - (Ax * m) 
N.B. S = Sum, A = Avg */
@n = Count(*) FROM [dbo].[table]
Select 
((@n * Sum(X*Y)) - (Sum(X) * Sum(Y)))/
((@n * Sum(X*X)) - (Sum(X) * Sum(X))) AS M,

Avg(Y) - Avg(X) *
((@n * Sum(X*Y)) - (Sum(X) * Sum(Y)))/
((@n * Sum(X*X)) - (Sum(X) * Sum(X))) AS B,
FROM [dbo].[table]  
icc97