# Artificial orthogonalization

Consider a regression model containing two explanatory variables,

y = β1 + β2 x2 + β3 x3 + e, (12.17)

where the regressors x2 and x3 are highly correlated. To "purge" the model of collinearity, regress x3 on x2 and compute the residuals x3* = x3 − x̂3, where x̂3 denotes the fitted values from that regression. It is argued that x3* contains the information in the variable x3 after the effects of collinearity are removed. Because least squares residuals are orthogonal to the regressors, x3* and x2 are uncorrelated, and thus collinearity is eliminated! Substituting x3* into the model we obtain

y = β1 + β2 x2 + β3* x3* + e*, (12.18)
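The orthogonalization step can be sketched in a short simulation. The data-generating process, the sample size, and the choice to regress x3 on x2 without an intercept are illustrative assumptions, not part of the original argument:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x3 = 0.9 * x2 + 0.1 * rng.normal(size=n)  # x3 highly correlated with x2

# "Purge" the collinearity: regress x3 on x2 (no intercept, for simplicity)
# and keep the residuals x3_star = x3 - fitted values
g1 = (x2 @ x3) / (x2 @ x2)
x3_star = x3 - g1 * x2

# The original regressors are highly correlated...
corr = np.corrcoef(x2, x3)[0, 1]
print("corr(x2, x3) > 0.98:", corr > 0.98)

# ...but the least squares residuals are orthogonal to x2 by construction
print("x3_star orthogonal to x2:", abs(x3_star @ x2) < 1e-10)
```

Orthogonality here is an algebraic property of least squares residuals, not evidence that any collinearity "problem" has been solved, which is the point of the discussion that follows.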

which is then estimated by OLS. Buse (1994) shows that the least squares estimates of β1 and β3 are unaffected by this substitution, as are their standard errors and t-statistics, and the residuals from (12.18) are identical to those from (12.17); hence statistics such as R2, the Durbin-Watson d, and σ̂2 are unaffected by the substitution. But what about the estimator of β2? Kennedy (1982) first pointed out the problems with this procedure, and Buse (1994) worked out the details. Buse shows that the estimator of β2 from (12.18) is biased. Furthermore, instead of gaining a variance reduction in return for this bias, Buse shows that the variance of b2* can be larger than the OLS variance of b2, and he gives several examples. Thus artificial orthogonalization is not a cure for collinearity.
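Both halves of the result, the invariance of the estimates of β1 and β3 and the bias in the estimator of β2, can be illustrated with a small Monte Carlo sketch. The true parameter values, error distribution, and no-intercept auxiliary regression are assumptions made for this illustration; Buse (1994) derives the result analytically:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 500
beta1, beta2, beta3 = 1.0, 2.0, 3.0

b2_ols, b2_star = [], []
for _ in range(reps):
    x2 = rng.normal(size=n)
    x3 = 0.9 * x2 + 0.1 * rng.normal(size=n)  # collinear pair
    y = beta1 + beta2 * x2 + beta3 * x3 + rng.normal(size=n)

    # (12.17): regress y on [1, x2, x3]
    X = np.column_stack([np.ones(n), x2, x3])
    b = np.linalg.lstsq(X, y, rcond=None)[0]

    # (12.18): replace x3 with the residuals from regressing x3 on x2
    g1 = (x2 @ x3) / (x2 @ x2)
    x3_star = x3 - g1 * x2
    Xs = np.column_stack([np.ones(n), x2, x3_star])
    bs = np.linalg.lstsq(Xs, y, rcond=None)[0]

    # The intercept, the coefficient on x3 (now x3_star), and the
    # residuals are identical across the two fits
    assert np.allclose(b[[0, 2]], bs[[0, 2]])
    assert np.allclose(y - X @ b, y - Xs @ bs)

    b2_ols.append(b[1])
    b2_star.append(bs[1])

# b2 from (12.17) averages near the true beta2 = 2; b2* from (12.18)
# averages near beta2 + beta3 * 0.9, a severe bias
print("mean b2  (12.17):", round(np.mean(b2_ols), 2))
print("mean b2* (12.18):", round(np.mean(b2_star), 2))
```

The intuition for the bias is visible in the algebra: writing x3 = g1 x2 + x3*, the coefficient on x2 in (12.18) estimates β2 + β3 g1, not β2, because the part of x3 that was projected onto x2 is absorbed into the x2 coefficient.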