tags:

views:

98

answers:

1

I would like to know a way of performing ANCOVA(analysis of covariance) using Python with scipy. It is basically a statistical comparison of regression lines. I know Python can do ANOVA and it can also do regression line fitting with Scipy.stats. I'm not sure how to put those together to get an effective ANCOVA though, if it is possible.

Regards,

Shax

+2  A: 

ANCOVA can be done with regression an using dummy variables in the design matrix for the effects that depend on the categorical variable.

A simple example is at http://groups.google.com/group/pystatsmodels/browse_thread/thread/aaa31b08f3df1a69?hl=en using the OLS class from scikits.statsmodels

Relevant part of construction of design matrix xg includes group numbers/labels, x1 is continuous explanatory variable

>>> dummy = (xg[:,None] == np.unique(xg)).astype(float)
>>> X = np.c_[x1, dummy[:,1:], np.ones(nsample)]

Estimate the model

>>> res2 = sm.OLS(y, X).fit()
>>> print res2.params
[ 1.00901524  3.08466166 -2.84716135  9.94655423]
>>> print res2.bse
[ 0.07499873  0.71217506  1.16037215  0.38826843]
>>> prstd, iv_l, iv_u = wls_prediction_std(res2)

"Test hypothesis that all groups have same intercept"

>>> R = [[0, 1, 0, 0],
...      [0, 0, 1, 0]]

>>> print res2.f_test(R)
<F test: F=array([[ 91.69986847]]), p=[[  8.90826383e-17]],
df_denom=46, df_num=2>

strongly rejected because differences in intercept are very large