Alternatives to TSLS

There are several alternatives to the standard IV/TSLS estimator. Among them is the limited information maximum likelihood (LIML) estimator, which was first derived by Anderson and Rubin (1949). There is renewed interest in LIML because evidence indicates that it performs better than TSLS when instruments are weak. Several modifications of LIML have been suggested by Fuller (1977) and others. These estimators are unified in a common framework, along with TSLS, using the idea of a k-class of estimators. LIML suffers less from test size aberrations than the TSLS estimator, and the Fuller modification suffers less from bias. Each of these alternatives will be considered below.

In a system of M simultaneous equations, let the endogenous variables be $y_1, y_2, \ldots, y_M$ and let there be K exogenous variables $x_1, x_2, \ldots, x_K$. The first structural equation within this system is

$y_1 = \alpha_2 y_2 + \beta_1 x_1 + \beta_2 x_2 + e_1$ (11.7)

The endogenous variable $y_2$ has reduced form

$y_2 = \pi_{12}x_1 + \pi_{22}x_2 + \cdots + \pi_{K2}x_K + v_2 = E(y_2) + v_2$,

which is consistently estimated by least squares. The predictions from the reduced form are

$\hat{E}(y_2) = \hat{\pi}_{12}x_1 + \hat{\pi}_{22}x_2 + \cdots + \hat{\pi}_{K2}x_K$ (11.8)

and the residuals are $\hat{v}_2 = y_2 - \hat{E}(y_2)$.

The two-stage least squares estimator is an IV estimator that uses $\hat{E}(y_2)$ as an instrument. A k-class estimator is an IV estimator that uses the instrumental variable $y_2 - k\hat{v}_2$, so TSLS is the k-class estimator with $k = 1$. The LIML estimator uses $k = \hat{\ell}$, where $\hat{\ell}$ is the minimum ratio of the sums of squared residuals from two regressions; the explanation is given on pages 468-469 of POE4. Fuller (1977) suggested a modification that uses the k-class value

$k = \hat{\ell} - \dfrac{a}{N - K}$ (11.9)

where K is the total number of instrumental variables (included and excluded exogenous variables) and N is the sample size. The value of a is a constant, usually 1 or 4. When a model is just identified, the LIML and TSLS estimates are identical; the two diverge only in overidentified models. There is some evidence that LIML is indeed superior to TSLS when instruments are weak and models are substantially overidentified.
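The k-class idea can also be written in matrix form, which is how the estimator is computed in the matrix script later in this section. The following is a sketch under assumed notation (not taken from POE4): $y$ is the dependent variable, $X$ holds all right-hand-side variables, $W$ holds all instruments (included and excluded exogenous variables), and $M_W$ is the matrix that produces residuals from a regression on $W$.

```latex
\[
M_W = I - W(W'W)^{-1}W', \qquad
\hat{\beta}(k) = \left[X'(I - kM_W)X\right]^{-1} X'(I - kM_W)\,y
\]
\[
k = 0 \;\Rightarrow\; \text{OLS}, \qquad
k = 1 \;\Rightarrow\; \text{TSLS}, \qquad
k = \hat{\ell} \;\Rightarrow\; \text{LIML}
\]
```

For an exogenous column of $X$, which is also a column of $W$, $M_W x = 0$ and the instrument is the variable itself; for the endogenous $y_2$, $(I - kM_W)y_2 = y_2 - k\hat{v}_2$, which is the instrument described in the text. Fuller's estimator uses the same formula with the value of k given in (11.9).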

With the Mroz data we estimate the hours supply equation

$hours = \beta_1 + \beta_2\,mtr + \beta_3\,educ + \beta_4\,kidsl6 + \beta_5\,nwifeinc + e$ (11.10)

A script can be used to estimate the model via LIML. The following one is used to replicate the results in Table 11B.3 of POE4.

1 open "@gretldir\data\poe\mroz.gdt"
2 square exper
3 series nwifeinc = (faminc-wage*hours)/1000
4 smpl hours>0 --restrict
5 list x = mtr educ kidsl6 nwifeinc const
6 list z1 = educ kidsl6 nwifeinc const exper
7 list z2 = educ kidsl6 nwifeinc const exper sq_exper largecity
8 list z3 = kidsl6 nwifeinc const mothereduc fathereduc
9 list z4 = kidsl6 nwifeinc const mothereduc fathereduc exper
10
11 tsls hours x; z1 --liml
12 tsls hours x; z2 --liml
13 tsls hours x; z3 --liml
14 tsls hours x; z4 --liml

LIML estimation uses the tsls command with the --liml option. The results from LIML estimation of the hours equation (11.10), the fourth model in line 14, are given below. The variables mtr and educ are endogenous, and the external instruments are mothereduc, fathereduc, and exper; with two endogenous variables and three external instruments, the model is overidentified in this specification.

LIML, using observations 1-428
Dependent variable: hours
Instrumented: mtr educ

Smallest eigenvalue = 1.00288
LR over-identification test: Chi-square(1) = 1.232 [0.2670]

The LIML results are easy to replicate using matrix commands. Doing so reveals some of hansl’s power.

# endogenous variables, dependent variable first
matrix y1 = { hours, mtr, educ }
# all instruments: included and excluded exogenous variables
matrix w = { kidsl6, nwifeinc, const, exper, mothereduc, fathereduc }
# included exogenous variables only
matrix z = { kidsl6, nwifeinc, const }
# residual-maker matrices for z and w
matrix Mz = I($nobs)-z*invpd(z'*z)*z'
matrix Mw = I($nobs)-w*invpd(w'*w)*w'
matrix Ez = Mz*y1
matrix W0 = Ez'*Ez
matrix Ew = Mw*y1
matrix W1 = Ew'*Ew
matrix G = inv(W1)*W0
# the LIML value of k is the smallest eigenvalue of G
matrix l = eigengen(G, null)
scalar minl = min(l)
printf "\nThe minimum eigenvalue is %.8f \n",minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
# k-class estimator evaluated at k = minl
matrix kM = (I($nobs)-(minl*Mw))
matrix b = invpd(X'*kM*X)*X'*kM*y
a = rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates are \n %.6f \n", b

The equations that make this magic are found in Davidson and MacKinnon (2004, pp. 537-538).
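For readers without that reference at hand, the quantities the script computes can be written out as follows. This is a transcription of the code into algebra, with assumed symbols: $Y_1$ is the matrix y1 (hours, mtr, educ), $Z$ is the matrix z of included exogenous variables, and $W$ is the matrix w of all instruments.

```latex
\[
M_Z = I - Z(Z'Z)^{-1}Z', \qquad M_W = I - W(W'W)^{-1}W'
\]
\[
W_0 = Y_1' M_Z Y_1, \qquad W_1 = Y_1' M_W Y_1
\]
\[
\hat{\ell} = \lambda_{\min}\!\left(W_1^{-1} W_0\right)
\]
```

Here $\lambda_{\min}$ denotes the smallest eigenvalue, stored in the script as minl; the script then builds kM as $I - \hat{\ell} M_W$ and uses it to compute the coefficient vector b.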

Hansl’s straightforward syntax makes translating the algebra into a computation quite easy. The result from the script is:

The liml estimates are
mtr       -19196.516697
educ        -197.259108
kidsl6       207.553130
nwifeinc    -104.941545
const      18587.905980

which match those produced by gretl’s tsls command with the --liml option.

Fuller’s modification relies on a user-chosen constant and makes a small change to the k of the k-class estimator. In the script that ends the chapter, the value of a is set to 1 and the model is reestimated using Fuller’s method. The modification is simple to make, and the chapter-ending script shows the details.
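To give a sense of how small the change is, here is a rough worked calculation for the fourth model with a = 1. It assumes N = 428 (the number of observations reported in the LIML output), K = 6 (the number of columns in the instrument matrix w), and the rounded eigenvalue 1.00288 from the output, so the last digits are approximate.

```latex
\[
k = \hat{\ell} - \frac{a}{N - K}
  \approx 1.00288 - \frac{1}{428 - 6}
  \approx 1.00288 - 0.00237
  = 1.00051
\]
```

The Fuller value of k therefore lies between the TSLS value of 1 and the LIML value of about 1.00288.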

11.4 Script

set echo off
open "@gretldir\data\poe\truffles.gdt"
# reduced form estimation
list x = const ps di pf
ols q x
ols p x

# demand and supply of truffles
open "@gretldir\data\poe\truffles.gdt"
list x = const ps di pf
tsls q const p ps di; x
tsls q const p pf; x

# Hausman test
ols p x
series v = $uhat
ols q const p pf v
omit v

# supply estimation by OLS
ols q const p pf

# Fulton Fish
open "@gretldir\data\poe\fultonfish.gdt"
# Estimate the reduced form equations
list days = mon tue wed thu
list z = const stormy days
ols lquan z
ols lprice z
omit days --quiet

tsls lquan const lprice days ; z

# LIML
open "@gretldir\data\poe\mroz.gdt"
square exper
series nwifeinc = (faminc-wage*hours)/1000
smpl hours>0 --restrict
list x = mtr educ kidsl6 nwifeinc const
list z1 = educ kidsl6 nwifeinc const exper
list z2 = educ kidsl6 nwifeinc const exper sq_exper largecity
list z3 = kidsl6 nwifeinc const mothereduc fathereduc
list z4 = kidsl6 nwifeinc const mothereduc fathereduc exper

# LIML using tsls
tsls hours x; z1 --liml
tsls hours x; z2 --liml
tsls hours x; z3 --liml
tsls hours x; z4 --liml
tsls hours x; z4

# LIML using matrices
matrix y1 = { hours, mtr, educ }
matrix w = { kidsl6, nwifeinc, const, exper, mothereduc, fathereduc }
matrix z = { kidsl6, nwifeinc, const }
matrix Mz = I($nobs)-z*invpd(z'*z)*z'
matrix Mw = I($nobs)-w*invpd(w'*w)*w'
matrix Ez = Mz*y1
matrix W0 = Ez'*Ez
matrix Ew = Mw*y1
matrix W1 = Ew'*Ew
matrix G = inv(W1)*W0
matrix l = eigengen(G, null)
scalar minl = min(l)
printf "\nThe minimum eigenvalue is %.8f \n",minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
matrix kM = (I($nobs)-(minl*Mw))
matrix b = invpd(X'*kM*X)*X'*kM*y
a = rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates are \n %.6f \n", b

# Fuller's Modified LIML a=1
scalar fuller_l = minl-(1/($nobs-cols(w)))
printf "\nThe minimum eigenvalue is %.8f \n",minl
matrix X = { mtr, educ, kidsl6, nwifeinc, const }
matrix y = { hours }
matrix kM = (I($nobs)-(fuller_l*Mw))
matrix b = invpd(X'*kM*X)*X'*kM*y
a = rownames(b, " mtr educ kidsl6 nwifeinc const ")
printf "\nThe liml estimates using Fuller a=1 \n %.6f \n", b
tsls hours mtr educ kidsl6 nwifeinc const ; z4 --liml