Two-Stage Least Squares

To perform Two-Stage Least Squares (TSLS) or Instrumental Variables (IV) estimation you need instruments that are correlated with the endogenous regressors in your model, but not correlated with the model’s errors. In the wage model, we will need some variables that are correlated with education, but not with the model’s errors. We propose that mother’s education (mothereduc) is suitable. The mother’s education is unlikely to enter the daughter’s wage equation directly, but it is reasonable to believe that daughters of more highly educated mothers tend to get more education themselves. These propositions can and will be tested later.

In the meantime, estimating the wage equation using the instrumental variable estimator is carried out in the following example. First, load the mroz.gdt data into gretl. Then, to open the basic gretl dialog box that computes the IV estimator, choose Model>Instrumental Variables>Two-Stage Least Squares from the pull-down menu as shown below in Figure 10.1. This opens the dialog box shown in Figure 10.2.

Figure 10.1: Two-stage least squares estimator from the pull-down menus

In this example we choose l_wage as the dependent variable, put all of the desired instruments into the Instruments box, and put all of the independent variables, including the one(s) measured with error, into the Independent Variables box. If some of the right-hand side variables for the model are exogenous, they should be referenced in both lists. That’s why the const, exper, and sq_exper variables appear in both places. Press the OK button and the results are found in Table 10.1. Notice that gretl ignores the sound advice offered by the authors of your textbook and computes an R². Keep in mind, though, that gretl computes this as the squared correlation between observed and fitted values of the dependent variable, and you should resist the temptation to interpret R² as the proportion of variation in l_wage accounted for by the model.
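The statement about how R² is computed can be checked from a script. What follows is only a sketch under stated assumptions, not part of the original example: it assumes gretl's open command can find mroz.gdt, and that l_wage and sq_exper may need to be generated with logs and square (those two lines are harmless if the series already exist). The names yhat_iv and r2_check are made up for illustration. It relies on the tsls command, whose syntax is described later in this section.

```hansl
# Assumption: mroz.gdt is on gretl's data search path
open mroz.gdt
# Keep only the observations with positive wages (the 428 used by tsls)
smpl wage>0 --restrict
# Assumption: generate l_wage and sq_exper in case they are not in the file
logs wage
square exper

list x = const educ exper sq_exper
list z = const exper sq_exper mothereduc

# IV estimation; the printed output includes gretl's R-squared
tsls l_wage x ; z

# Squared correlation between observed and fitted l_wage
series yhat_iv = $yhat
scalar r2_check = corr(l_wage, yhat_iv)^2
printf "Squared correlation of l_wage and fitted values: %.6f\n", r2_check
```

If gretl computes R² as described above, the printed value should agree with the R² shown in the tsls output.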

If you prefer to use a script, the syntax is very simple.

TSLS, using observations 1-428
Dependent variable: l_wage
Instrumented: educ
Instruments: const mothereduc exper sq_exper

              Coefficient     Std. Error       z        p-value
  const        0.198186       0.472877        0.4191    0.6751
  educ         0.0492630      0.0374360       1.3159    0.1882
  exper        0.0448558      0.0135768       3.3039    0.0010
  sq_exper    -0.000922076    0.000406381    -2.2690    0.0233

  Mean dependent var     1.190173    S.D. dependent var    0.723198
  Sum squared resid      195.8291    S.E. of regression    0.679604
  R²                     0.135417    Adjusted R²           0.129300
  F(3, 424)              7.347957    P-value(F)            0.000082
  Log-likelihood        -3127.203    Akaike criterion      6262.407
  Schwarz criterion      6278.643    Hannan-Quinn          6268.819

Table 10.1: Results from two-stage least squares estimation of the wage equation.

tsls

  Arguments:  depvar indepvars ; instruments
  Options:    --vcv (print covariance matrix)
              --robust (robust standard errors)
              --liml (use Limited Information Maximum Likelihood)
              --gmm (use the Generalized Method of Moments)
  Example:    tsls y1 0 y2 y3 x1 x2 ; 0 x1 x2 x3 x4 x5 x6

The basic syntax is this: tsls y x ; z, where y is the dependent variable, x are the regressors, and z the instruments. Thus, the gretl command tsls calls for the IV estimator to be used and it is followed by the linear model you wish to estimate.

The script for the example above is

    list x = const educ exper sq_exper
    list z = const exper sq_exper mothereduc
    tsls l_wage x ; z

In the script, the regressors for the wage equation are collected into a list called x. The instruments, which should include all exogenous variables in the model including the constant, are placed in the list called z. Notice that z includes all of the exogenous variables in x. Here the dependent variable, y, is replaced with its actual name from the example, l_wage.
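The options listed in the syntax summary are appended to the same command. The sketch below illustrates two of them on the wage equation. It assumes gretl's open command can find mroz.gdt and that l_wage and sq_exper may need to be generated first; neither step is part of the original script.

```hansl
# Assumption: mroz.gdt is on gretl's data search path
open mroz.gdt
smpl wage>0 --restrict
# Assumption: generate l_wage and sq_exper in case they are not in the file
logs wage
square exper

list x = const educ exper sq_exper
list z = const exper sq_exper mothereduc

# Same IV estimator, with robust standard errors
tsls l_wage x ; z --robust

# Limited Information Maximum Likelihood instead of TSLS
tsls l_wage x ; z --liml
```

Because this model has one endogenous regressor and one excluded instrument, it is exactly identified, so the LIML point estimates should coincide with the TSLS ones. The --robust option changes the standard errors but not the coefficients.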

It is certainly possible to compute two-stage least squares in two steps, but in practice it is not a good idea to do so. The estimates of the slopes and intercept will be the same as you get using the regular tsls IV estimator. The standard errors will not be computed correctly, though, because the second-stage regression bases its error variance estimate on residuals computed with the fitted values of educ rather than with educ itself. To demonstrate, we will do the estimation in two steps and compare results. The gretl code to do two-step estimation is

    smpl wage>0 --restrict
    ols educ z
    series educ_hat = $yhat
    ols l_wage const educ_hat exper sq_exper

Notice that the sample had to be restricted to those wages greater than zero using the --restrict option. If you fail to do this, the first stage regression will be estimated with all 753 observations instead of the 428 used in tsls. TSLS is implicitly limiting the first stage estimation to the nonmissing values of l_wage. Running the two steps shows that the coefficient estimates are the same as those in Table 10.1, but the standard errors are not.
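The comparison can be put side by side in one script. This is a sketch under stated assumptions, not the original author's code: it assumes gretl's open command can find mroz.gdt and that l_wage and sq_exper may need to be generated, and the matrix and scalar names (b_iv, se_2s, se_fix, and so on) are made up for illustration. The last column rescales the two-step standard errors by the ratio of the two regression standard errors, which is the correction implied by the fact that both estimators share the same (X̂'X̂)⁻¹ matrix and differ only in the estimated error variance.

```hansl
# Assumption: mroz.gdt is on gretl's data search path
open mroz.gdt
smpl wage>0 --restrict
# Assumption: generate l_wage and sq_exper in case they are not in the file
logs wage
square exper

list x = const educ exper sq_exper
list z = const exper sq_exper mothereduc

# One-step IV estimation
tsls l_wage x ; z --quiet
matrix b_iv = $coeff
matrix se_iv = $stderr
scalar sig_iv = $sigma

# Manual two-step estimation
ols educ z --quiet
series educ_hat = $yhat
ols l_wage const educ_hat exper sq_exper --quiet
matrix b_2s = $coeff
matrix se_2s = $stderr
scalar sig_2s = $sigma

# Rescale the two-step standard errors with the IV residual variance
matrix se_fix = se_2s * (sig_iv / sig_2s)

# Columns: tsls coeff, two-step coeff, tsls s.e., two-step s.e., rescaled s.e.
matrix cmp = b_iv ~ b_2s ~ se_iv ~ se_2s ~ se_fix
print cmp
```

The first two columns should be identical, the third and fourth should differ, and the fifth should reproduce the third.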
