# Other diagnostics

The expression for the variance of the least squares estimator in equation (12.2), for the regression model with two explanatory variables, can be extended to the multiple regression context, for all coefficients except the intercept, as

$$
\operatorname{var}(b_j) = \frac{\sigma^2}{\sum_{t=1}^{T} (x_{tj} - \bar{x}_j)^2} \cdot \frac{1}{1 - R_j^2}, \qquad (12.9)
$$

where $R_j^2$ is the $R^2$ goodness-of-fit measure from the auxiliary regression of $x_{tj}$ on all other regressors. The second factor in (12.9) is called the variance-inflation factor (VIF), as it shows the effect of linear associations among the explanatory variables upon the variances of the least squares estimators, as measured by $R_j^2$. Stewart (1987) proposes collinearity indexes that are the square roots of the VIFs. Fox and Monette (1992) generalize VIFs to measure variance inflation in a subset of coefficients. Auxiliary regressions and VIFs have the same strengths and weaknesses as collinearity diagnostics. Their strength is their intuitive appeal: if $R_j^2 \to 1$, then $x_j$ is in some collinear relationship. The weaknesses are, apart from the pro-centering versus anti-centering debate, that we have no measure of how close $R_j^2$ must be to 1 to imply a collinearity problem, and that these measures cannot determine the number of near linear dependencies in the data (Belsley, 1991, p. 30). Kennedy (1998, p. 90) suggests the rule of thumb that a VIF > 10, for scaled data, indicates severe collinearity. The same critiques apply to the strategy of looking at the matrix $R_c$ of correlations among the regressors as a diagnostic tool.
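As an illustration of how the VIFs can be obtained from the auxiliary regressions, the following is a minimal sketch in Python with NumPy. The function name `vif`, the simulated regressors, and the noise scale are assumptions made for the example and do not come from the text; each auxiliary regression includes an intercept, which corresponds to working with centered data.

```python
import numpy as np

def vif(X):
    """Variance-inflation factors for the columns of X.

    X holds the explanatory variables only (no intercept column).
    Each column is regressed on an intercept plus all other columns,
    and the VIF is 1 / (1 - R_j^2) from that auxiliary regression.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        tss = np.sum((y - y.mean()) ** 2)
        r2 = 1.0 - (resid @ resid) / tss
        out[j] = 1.0 / (1.0 - r2)
    return out

# Hypothetical data: x3 is nearly an exact linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + 0.05 * rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

vifs = vif(X)
collinearity_indexes = np.sqrt(vifs)  # square roots of the VIFs (Stewart, 1987)
print(vifs)
print(collinearity_indexes)
```

In this simulated design all three VIFs are large, because every variable takes part in the single near dependency. The VIFs alone do not reveal that there is only one such dependency, which is the weakness noted above.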

Another diagnostic that is often mentioned, though recognized as deficient, is the determinant of $X'X$ (or $X_c'X_c = R_c$). One or more near exact collinearities among the columns of $X$ drive this determinant toward zero. The problems with measuring collinearity this way include deciding how small the determinant must be before collinearity is judged a problem, the fact that this measure determines neither the number nor the form of the collinearities, and the ever-present centering debate. Soofi (1990) offers an information theory-based approach for diagnosing collinearity in which the log determinant plays an important role. Unfortunately, his measures reduce the diagnosis of collinearity to the examination of a single index, which has the same flaws as the determinant.
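The behavior of the determinant can be seen in a small numerical sketch. The helper `corr_determinant`, the simulated designs, and the noise scale below are assumptions chosen for illustration; the log determinant is computed with `numpy.linalg.slogdet` for numerical stability and is not meant to reproduce Soofi's specific measures.

```python
import numpy as np

def corr_determinant(X):
    """Return (det, log|det|) of the correlation matrix R_c of the columns of X."""
    R = np.corrcoef(X, rowvar=False)
    sign, logdet = np.linalg.slogdet(R)
    return sign * np.exp(logdet), logdet

# Hypothetical designs: one with unrelated regressors, one with a near dependency.
rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=(n, 3))
X_indep = z
X_near = np.column_stack([z[:, 0], z[:, 1], z[:, 0] + z[:, 1] + 0.01 * z[:, 2]])

det_indep, logdet_indep = corr_determinant(X_indep)
det_near, logdet_near = corr_determinant(X_near)
print(det_indep, logdet_indep)  # determinant close to 1
print(det_near, logdet_near)    # determinant close to 0
```

The determinant of a correlation matrix lies between 0 and 1, so the near-collinear design gives a value close to zero. The single number says nothing about which columns are involved or how many dependencies exist, which is the deficiency described above.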
