Derivation of Equation (4.6.8)

Rewrite equation (4.6.7) as follows

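$$Y_{ij} = \mu^{*} + \pi_0 \tau_i + (\pi_0 + \pi_1)\bar{S}_j + \nu_i,$$

where $\tau_i \equiv S_i - \bar{S}_j$. Since $\tau_i$ and $\bar{S}_j$ are uncorrelated by construction, we have:

$$\rho_1 = \frac{C(\bar{S}_j, Y_{ij})}{V(\bar{S}_j)} = \pi_0 + \pi_1,$$

$$\pi_0 = \frac{C(\tau_i, Y_{ij})}{V(\tau_i)}.$$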

Simplifying the second line,

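$$\begin{aligned}
\pi_0 &= \frac{C[(S_i - \bar{S}_j), Y_{ij}]}{V(S_i) - V(\bar{S}_j)} \\
&= \frac{C(S_i, Y_{ij})}{V(S_i)}\left[\frac{V(S_i)}{V(S_i) - V(\bar{S}_j)}\right] - \frac{C(\bar{S}_j, Y_{ij})}{V(\bar{S}_j)}\left[\frac{V(\bar{S}_j)}{V(S_i) - V(\bar{S}_j)}\right] \\
&= \rho_0\phi + \rho_1(1 - \phi) = \rho_1 + \phi(\rho_0 - \rho_1),
\end{aligned}$$

where $\phi \equiv \frac{V(S_i)}{V(S_i) - V(\bar{S}_j)}$. Solving for $\pi_1$, we have

$$\pi_1 = \rho_1 - \pi_0 = \phi(\rho_1 - \rho_0).$$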
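The two formulas can be checked numerically. The sketch below is only an illustration on simulated data: the group structure, sample sizes, and coefficient values are assumptions, not taken from the text. Because the sample group mean is orthogonal to the within-group deviation in any sample, the identities should hold up to floating-point error when sample moments are used throughout.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical design: 200 groups of 25 individuals each.
n_groups, group_size = 200, 25
group = np.repeat(np.arange(n_groups), group_size)

# Individual schooling S_i with a group component, and its group mean S_bar_j.
S = rng.normal(size=n_groups)[group] + rng.normal(size=group.size)
S_bar = (np.bincount(group, weights=S) / np.bincount(group))[group]

# Outcome generated with illustrative coefficients (assumed values).
Y = 1.0 + 0.5 * S + 0.3 * S_bar + rng.normal(size=group.size)


def cov(a, b):
    """Sample covariance using the same 1/n normalisation everywhere."""
    return np.mean((a - a.mean()) * (b - b.mean()))


# Bivariate regression coefficients and the variance ratio phi.
rho0 = cov(S, Y) / cov(S, S)
rho1 = cov(S_bar, Y) / cov(S_bar, S_bar)
phi = cov(S, S) / (cov(S, S) - cov(S_bar, S_bar))

# Multivariate regression of Y on a constant, S_i and S_bar_j.
X = np.column_stack([np.ones_like(S), S, S_bar])
_, pi0, pi1 = np.linalg.lstsq(X, Y, rcond=None)[0]

print(pi0, rho1 + phi * (rho0 - rho1))
print(pi1, phi * (rho1 - rho0))
```

Each printed pair should agree, which is just the algebra of the derivation applied to sample moments.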

Figure 4.6.2: Distribution of the OLS, 2SLS, and LIML estimators with 20 instruments


Figure 4.6.3: Distribution of the OLS, 2SLS, and LIML estimators with 20 worthless instruments



Derivation of the approximate bias of 2SLS

Start from the last equality in (4.6.20):

$$E[\hat{\beta}_{2SLS} - \beta] \approx [E(\pi'Z'Z\pi) + E(\xi'P_Z\xi)]^{-1} E(\xi'P_Z\eta).$$

The magic of linear algebra helps us simplify this expression: the term $\xi'P_Z\xi$ is a scalar and therefore equal to its trace; the trace is a linear operator, which passes through expectations and is invariant to cyclic permutations; finally, the trace of $P_Z$, an idempotent matrix, is equal to its rank, $Q$. Using these facts, we have

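$$\begin{aligned}
E(\xi'P_Z\xi) &= E[\mathrm{tr}(\xi'P_Z\xi)] = E[\mathrm{tr}(P_Z\xi\xi')] = \mathrm{tr}(P_Z E[\xi\xi']) \\
&= \mathrm{tr}(P_Z \sigma_\xi^2 I) \\
&= \sigma_\xi^2\,\mathrm{tr}(P_Z) \\
&= \sigma_\xi^2 Q,
\end{aligned}$$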

where we have assumed that $\xi_i$ is homoskedastic. Similarly, applying the trace trick to $\xi'P_Z\eta$ shows that this term is equal to $\sigma_{\eta\xi}Q$. Therefore,

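$$\begin{aligned}
E[\hat{\beta}_{2SLS} - \beta] &\approx [E(\pi'Z'Z\pi) + \sigma_\xi^2 Q]^{-1} E[\mathrm{tr}(\xi'P_Z\eta)] \\
&= \sigma_{\eta\xi} Q\,[E(\pi'Z'Z\pi) + \sigma_\xi^2 Q]^{-1} \\
&= \frac{\sigma_{\eta\xi}}{\sigma_\xi^2}\left[\frac{E(\pi'Z'Z\pi)/Q}{\sigma_\xi^2} + 1\right]^{-1}.
\end{aligned}$$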
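A small Monte Carlo can illustrate the trace results. This is a sketch under assumed values: the sample size, the number of instruments, the error variances, and the first-stage coefficients are all hypothetical, and $Z$ is held fixed across draws. It checks that $\mathrm{tr}(P_Z) = Q$, that the simulated means of $\xi'P_Z\xi$ and $\xi'P_Z\eta$ are close to $\sigma_\xi^2 Q$ and $\sigma_{\eta\xi}Q$, and that the bias expression can be rewritten as $(\sigma_{\eta\xi}/\sigma_\xi^2)/(F+1)$, where $F \equiv [\pi'Z'Z\pi/Q]/\sigma_\xi^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed design: n observations, q instruments, homoskedastic errors.
n, q = 200, 5
sigma_xi, sigma_eta, corr = 2.0, 1.5, 0.6
sigma_eta_xi = corr * sigma_xi * sigma_eta  # covariance of eta and xi

# Fixed instrument matrix and its projection matrix P_Z.
Z = rng.normal(size=(n, q))
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
trace_P = np.trace(P)  # equals the rank of P_Z, i.e. q

# Draw (xi, eta) pairs: array of shape (draws, n, 2).
draws = 5000
cov = [[sigma_xi**2, sigma_eta_xi], [sigma_eta_xi, sigma_eta**2]]
shocks = rng.multivariate_normal([0.0, 0.0], cov, size=(draws, n))
xi, eta = shocks[..., 0], shocks[..., 1]

# Quadratic forms xi' P_Z xi and xi' P_Z eta, one per draw.
xi_P = xi @ P
quad_xi = np.sum(xi_P * xi, axis=1)
cross = np.sum(xi_P * eta, axis=1)

print(trace_P, q)
print(quad_xi.mean(), sigma_xi**2 * q)
print(cross.mean(), sigma_eta_xi * q)

# Approximate bias written two equivalent ways (pi is a hypothetical first stage).
pi = np.full(q, 0.3)
signal = (Z @ pi) @ (Z @ pi)  # pi' Z' Z pi with Z fixed
bias_a = sigma_eta_xi * q / (signal + sigma_xi**2 * q)
F_pop = (signal / q) / sigma_xi**2
bias_b = (sigma_eta_xi / sigma_xi**2) / (F_pop + 1.0)
print(bias_a, bias_b)
```

The simulated means only approximate their targets, while the last two numbers agree exactly because they are the same expression rearranged.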

Multivariate first-stage F-statistics

Assume any exogenous covariates have been partialled out of the instrument list and that there are two endogenous variables, $x_1$ and $x_2$, with coefficients $\delta_1$ and $\delta_2$. We are interested in the bias of the 2SLS estimator of $\delta_2$ when $x_1$ is also treated as endogenous. The second stage equation is

$$y = P_Z x_1\delta_1 + P_Z x_2\delta_2 + [\eta + (x_1 - P_Z x_1)\delta_1 + (x_2 - P_Z x_2)\delta_2], \qquad (4.7.1)$$

where $P_Z x_1$ and $P_Z x_2$ are the first-stage fitted values from regressions of $x_1$ and $x_2$ on $Z$. By the usual anatomy formula for multivariate regression, $\delta_2$ in (4.7.1) is the bivariate regression of $y$ on the residual from a regression of $P_Z x_2$ on $P_Z x_1$. This residual is

$$[I - P_Z x_1 (x_1'P_Z x_1)^{-1} x_1'P_Z]\,P_Z x_2 = M_{1z}P_Z x_2,$$

where $M_{1z} = [I - P_Z x_1 (x_1'P_Z x_1)^{-1} x_1'P_Z]$ is the relevant residual-maker matrix. In addition, note that $M_{1z}P_Z x_2 = P_Z[M_{1z}x_2]$.
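The properties of $M_{1z}$ used here are easy to confirm numerically. The following is a minimal sketch on simulated data (dimensions and coefficients are arbitrary assumptions): it builds $P_Z$ and $M_{1z}$ explicitly, then checks that $M_{1z}$ is idempotent, that it commutes with $P_Z$, and that $M_{1z}P_Z x_2$ matches the residual from regressing $P_Z x_2$ on $P_Z x_1$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary small design, chosen so the n x n matrices are cheap to form.
n, q = 50, 3
Z = rng.normal(size=(n, q))
x1 = Z @ rng.normal(size=q) + rng.normal(size=n)
x2 = Z @ rng.normal(size=q) + rng.normal(size=n)

# Projection onto the column space of Z and first-stage fitted values.
P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
x1_hat, x2_hat = P @ x1, P @ x2

# Residual maker M_1z = I - P x1 (x1' P x1)^{-1} x1' P.
# Note x1' P x1 equals x1_hat' x1_hat because P is symmetric and idempotent.
M1z = np.eye(n) - np.outer(x1_hat, x1_hat) / (x1_hat @ x1_hat)

# Residual from a bivariate regression of P x2 on P x1 (no constant).
slope = (x1_hat @ x2_hat) / (x1_hat @ x1_hat)
resid_reg = x2_hat - slope * x1_hat

print(np.allclose(M1z @ M1z, M1z))           # idempotent
print(np.allclose(M1z @ P, P @ M1z))         # commutes with P_Z
print(np.allclose(M1z @ x2_hat, resid_reg))  # same residual either way
```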

From here we conclude that the 2SLS estimator of $\delta_2$ is the OLS regression on $P_Z[M_{1z}x_2]$, in other words, OLS on the fitted values from a regression of $M_{1z}x_2$ on $Z$. This is the same as 2SLS using $P_Z$ to instrument $M_{1z}x_2$. So the 2SLS estimator of $\delta_2$ can be written

$$[x_2'M_{1z}P_Z M_{1z}x_2]^{-1} x_2'M_{1z}P_Z\,y = \delta_2 + [x_2'M_{1z}P_Z M_{1z}x_2]^{-1} x_2'M_{1z}P_Z\,\eta.$$

The explained sum of squares (numerator of the F-statistic) that determines the bias of the 2SLS estimator of $\delta_2$ is therefore the expectation of $[x_2'M_{1z}P_Z M_{1z}x_2]$, while the bias comes from the fact that the expectation $E[\xi'M_{1z}P_Z\eta]$ is non-zero when $\eta$ and $\xi$ are correlated.

Here's how to compute this F-statistic in practice: (a) Regress the regressor of interest, $x_2$, on the other first-stage fitted values and any exogenous covariates. Save the residuals from this step, which correspond to $M_{1z}x_2$; (b) Construct the F-statistic for excluded instruments in a first-stage regression of the residuals from (a) on the excluded instruments. Residualizing the fitted values $P_Z x_2$ in step (a) instead would leave residuals that lie entirely in the column space of $Z$, so the regression in (b) would fit perfectly and the F-statistic would be undefined. Note that you should get the 2SLS coefficient of interest in a 2SLS procedure where the residuals from (a) are instrumented using $Z$, with no other covariates or endogenous variables. Use this fact to check your calculation.
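The procedure can be sketched in a few lines of NumPy. Everything below is an illustrative assumption: the simulated data, the coefficient values, and the degrees-of-freedom convention ($Q$ and $n - Q$) for the F-statistic, which varies across software. Step (a) is implemented by residualizing $x_2$ itself on the fitted values of $x_1$, so that the residual is the $M_{1z}x_2$ of the derivation and the first stage in step (b) is not a perfect fit.

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 2000, 4

# Simulated instruments and correlated errors (all values hypothetical).
Z = rng.normal(size=(n, q))
err_cov = [[1.0, 0.3, 0.5], [0.3, 1.0, 0.4], [0.5, 0.4, 1.0]]
xi1, xi2, eta = rng.multivariate_normal(np.zeros(3), err_cov, size=n).T
x1 = Z @ np.array([1.0, 0.5, 0.0, -0.5]) + xi1
x2 = Z @ np.array([0.0, 0.5, 1.0, 0.5]) + xi2
y = 1.0 * x1 + 2.0 * x2 + eta


def fitted(A, b):
    """Fitted values from a least-squares regression of b on the columns of A."""
    return A @ np.linalg.lstsq(A, b, rcond=None)[0]


# Full 2SLS: regress y on both sets of first-stage fitted values.
x1_hat, x2_hat = fitted(Z, x1), fitted(Z, x2)
coefs = np.linalg.lstsq(np.column_stack([x1_hat, x2_hat]), y, rcond=None)[0]
delta2_2sls = coefs[1]

# Step (a): residual of x2 after removing its projection on x1_hat (M_1z x2).
resid = x2 - x1_hat * (x1_hat @ x2) / (x1_hat @ x1_hat)

# Step (b): first-stage regression of the residual on the instruments.
resid_hat = fitted(Z, resid)
ess = resid_hat @ resid_hat
rss = (resid - resid_hat) @ (resid - resid_hat)
f_stat = (ess / q) / (rss / (n - q))  # assumed df convention: q and n - q

# Check: IV of y on the residual, instrumented by Z, reproduces delta_2.
delta2_check = (resid_hat @ y) / (resid_hat @ resid)

print(delta2_2sls, delta2_check)
print(f_stat)
```

The two coefficient estimates should coincide up to rounding, which is the check suggested in the text; the F-statistic itself depends on the simulated first-stage strength.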

