tags:

views:

178

answers:

1

I ran the same probit regression in SAS and R and while my coefficient estimates are (essentially) equivalent, the reported test statistics are different. Specifically, SAS reports test statistics as t-statistics whereas R reports test statistics as z-statistics.

I checked my econometrics text and found (with little elaboration) that it reports probit results in terms of t statistics.

Which statistic is appropriate? And why does R differ from SAS?

Here's my SAS code:

proc qlim data=DavesData;
 model y = x1 x2 x3/ discrete(d=probit);
run;
quit;

And here's my R code:

> model.1 <- glm(y ~ x1 + x2 + x3, family=binomial(link="probit"))
> summary(model.1)
+5  A: 

Just to answer a little bit - it's seriously off topic, question should be closed in fact - but neither the t-statistic nor the z-statistic are meaningful. They're both related though, as Z is just the standard normal distribution and T is an adapted "close-to-normal" distribution that takes into account the fact that your sample is limited to n cases.

Now, both the z and the t statistic provide the significance for the null hypothesis that the respective coefficient is equal to zero. The standard error on the coefficients, used for that test, is based on the residual error. Using the link function, you practically transform your response in such a way that the residuals become normal again, whereas in fact the residuals represent the difference between the observed and the estimated proportion. Due to this transformation, calculation of the degrees of freedom for the T-statistic isn't useful anymore and hence R assumes the standard normal distribution for the test statistic.

Both results are completely equivalent, R will just give slightly sharper p-values. It's a matter of debate, but if you look at proportion difference tests, they're also always done using the standard normal approximation (Z-test).

Which brings me back to the point that neither of these values actually has any meaning. If you want to know whether or not a variable has a significant contribution with a p-value that actually says something, you use a Chi-squared test like the Likelihood Ratio test (LR), Score test or Wald test. R just gives you the standard likelihood ratio, SAS also gives you the other two. But all three tests are essentially equivalent, if they differ seriously it's time to look again at your data.

eg in R :

anova(model.1,test="Chisq")

For SAS : see the examples here for use of contrasts, getting the LR, Score or Wald test

Joris Meys