There are three specification tests you will find useful with instrumental variables estimation. By default, Gretl computes each of these whenever you estimate a model using two-stage least squares. Below I’ll walk you through doing it manually and we’ll compare the manual results to the automatically generated ones.
The first test determines whether the independent variable(s) in your model is (are) in fact uncorrelated with the model's errors. If so, then least squares is more efficient than the IV estimator. If not, least squares is inconsistent and you should use the less efficient, but consistent, instrumental variable estimator. The null and alternative hypotheses are $H_0: \mathrm{Cov}(x_i, e_i) = 0$ against $H_a: \mathrm{Cov}(x_i, e_i) \neq 0$. The first step is to use least squares to estimate the first stage of TSLS
$$x_i = \gamma_1 + \theta_1 z_{i1} + \theta_2 z_{i2} + v_i \qquad (10.3)$$
and to save the residuals, $\hat{v}_i$. Then, add the residuals to the original model
$$y_i = \beta_1 + \beta_2 x_i + \delta \hat{v}_i + e_i \qquad (10.4)$$
Estimate this equation using least squares and use the t-ratio on the coefficient $\delta$ to test the hypothesis. If it is significantly different from zero, then the regressor $x_i$ is not exogenous or predetermined with respect to $e_i$, and you should use the IV estimator (TSLS) to estimate $\beta_1$ and $\beta_2$. If it is not significant, then use the more efficient estimator, OLS.
The gretl script for the Hausman test applied to the wage equation is:
open "c:\Program Files\gretl\data\poe\mroz.gdt"
logs wage
list x = const educ exper sq_exper
list z2 = const exper sq_exper mothereduc fathereduc
ols educ z2 --quiet
series ehat2 = $uhat
ols l_wage x ehat2
Notice that the equation is overidentified. There are two additional instruments, mothereduc and fathereduc, that are being used for a lone endogenous regressor, educ. Overidentification means that you have more instruments than are necessary to estimate the model. Lines 5 and 6 of the script (the ols educ z2 regression and the series ehat2 assignment) obtain the residuals from least squares estimation of the first stage regression, and the last line adds these to the wage model, which is estimated by least squares. The t-ratio on ehat2 is 1.671, which is not significant at the 5% level. We therefore cannot reject the null hypothesis that educ is exogenous (uncorrelated with the model's errors).
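As a minimal sketch, the manual test can be extended to report the t-ratio on ehat2 and its two-sided p-value directly, rather than reading them off the regression output. The scalar names t_ehat and p_ehat are illustrative choices of mine, not part of the original script. The sketch also assumes that sq_exper already exists in the dataset, as the script above does.

```gretl
# Regression-based Hausman test, with the t-ratio on the first-stage
# residuals pulled out explicitly.
# Assumption: sq_exper is already in the dataset (if it is not,
# create it first with: square exper).
open "c:\Program Files\gretl\data\poe\mroz.gdt"
logs wage
list x = const educ exper sq_exper
list z2 = const exper sq_exper mothereduc fathereduc

# First stage: regress the suspect regressor on all instruments
ols educ z2 --quiet
series ehat2 = $uhat

# Augmented regression: original model plus first-stage residuals
ols l_wage x ehat2 --quiet

# t-ratio on ehat2 and two-sided p-value (names are illustrative)
scalar t_ehat = $coeff(ehat2) / $stderr(ehat2)
scalar p_ehat = 2 * pvalue(t, $df, abs(t_ehat))
printf "t-ratio on ehat2 = %.3f, two-sided p-value = %.4f\n", t_ehat, p_ehat
```

The pvalue function returns a right-tail probability, so it is doubled here to match the two-sided alternative stated above.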
You may have noticed that whenever you use two-stage least squares in gretl, the program automatically produces the test statistic for the Hausman test. There are several different ways of computing this statistic, so don't be surprised if it differs from the one you compute manually using the above script.
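To see the automatically generated statistic next to the manual one, a sketch along the following lines should work. It estimates the same wage equation with gretl's tsls command and then reads the stored test through the $hausman accessor. The matrix name h is my own, and the sketch again assumes sq_exper is already in the dataset.

```gretl
# Same model estimated by two-stage least squares; gretl prints the
# Hausman test along with the other specification tests.
# Assumption: sq_exper is already in the dataset.
open "c:\Program Files\gretl\data\poe\mroz.gdt"
logs wage
list x = const educ exper sq_exper
list z2 = const exper sq_exper mothereduc fathereduc

# Regressors before the semicolon, full instrument list after it
tsls l_wage x ; z2

# $hausman holds the statistic, its degrees of freedom and the p-value
matrix h = $hausman
printf "Hausman statistic = %.4f, df = %g, p-value = %.4f\n", h[1], h[2], h[3]
```

The printed statistic need not equal the square of the manual t-ratio, since gretl may use a different variant of the test, but the two should lead to the same qualitative conclusion here.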