Appendix: More on fixed effects and lagged dependent variables

To simplify, we ignore covariates and year effects and assume there are only two periods, with treatment equal to zero for everyone in the first period (the punch line is the same in a more general setup). The causal effect of interest, $\beta$, is positive. Suppose first that treatment is correlated with an unobserved individual effect, $\alpha_i$, and that outcomes can be described by

$$Y_{it} = \alpha_i + \beta D_{it} + \varepsilon_{it}, \tag{5.4.1}$$

where $\varepsilon_{it}$ is serially uncorrelated, and uncorrelated with $\alpha_i$ and $D_{it}$. We also have

$$Y_{it-1} = \alpha_i + \varepsilon_{it-1},$$

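where $\alpha_i$ and $\varepsilon_{it-1}$ are uncorrelated. You mistakenly estimate the effect of $D_{it}$ in a model that controls for $Y_{it-1}$ but ignores fixed effects. The resulting estimator has probability limit $\operatorname{Cov}(Y_{it}, \tilde{D}_{it}) / V(\tilde{D}_{it})$, where $\tilde{D}_{it} = D_{it} - \gamma Y_{it-1}$ is the residual from a regression of $D_{it}$ on $Y_{it-1}$.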

Now substitute $\alpha_i = Y_{it-1} - \varepsilon_{it-1}$ in (5.4.1) to get

$$Y_{it} = Y_{it-1} + \beta D_{it} + \varepsilon_{it} - \varepsilon_{it-1}.$$

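From here, we get

$$\frac{\operatorname{Cov}(Y_{it}, \tilde{D}_{it})}{V(\tilde{D}_{it})} = \beta - \frac{\operatorname{Cov}(\varepsilon_{it-1}, \tilde{D}_{it})}{V(\tilde{D}_{it})} = \beta - \frac{\operatorname{Cov}(\varepsilon_{it-1}, D_{it} - \gamma Y_{it-1})}{V(\tilde{D}_{it})} = \beta + \frac{\gamma \sigma_{\varepsilon}^{2}}{V(\tilde{D}_{it})},$$

where $\sigma_{\varepsilon}^{2}$ is the variance of $\varepsilon_{it-1}$. The last equality uses $\operatorname{Cov}(\varepsilon_{it-1}, Y_{it-1}) = \sigma_{\varepsilon}^{2}$ and takes $\varepsilon_{it-1}$ to be uncorrelated with $D_{it}$, as the setup implies when treatment is related to $\alpha_i$ alone. Since trainees have low $Y_{it-1}$, $\gamma < 0$ and the resulting estimate of $\beta$ is too small.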
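A small simulation can make this downward bias concrete. The sketch below is illustrative only: the sample size, the normal distributions, the value of the causal effect, and the rule that assigns treatment to people with a low individual effect are all assumptions chosen for the example, not part of the text. It generates data from the fixed-effects model, runs the misspecified regression that controls for the lagged outcome, and compares the estimate with the probability-limit formula.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 1.0          # true causal effect (illustrative value)
sigma2_eps = 1.0    # variance of eps_{it-1} (illustrative value)

alpha = rng.normal(size=n)                                # unobserved individual effect
eps_prev = rng.normal(scale=np.sqrt(sigma2_eps), size=n)  # eps_{it-1}
eps = rng.normal(size=n)                                  # eps_{it}

# Assumed assignment rule: treatment depends on alpha only,
# so people with a low individual effect are more likely to be trained.
d = (alpha + rng.normal(size=n) < 0).astype(float)

y_prev = alpha + eps_prev        # period t-1: nobody is treated
y = alpha + beta * d + eps       # the fixed-effects outcome equation

# Misspecified regression: Y_t on a constant, D_t and Y_{t-1}, no fixed effects.
X = np.column_stack([np.ones(n), d, y_prev])
b_ldv = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Probability-limit formula: beta + gamma * sigma2_eps / V(D_tilde).
c = np.cov(d, y_prev)
gamma = c[0, 1] / c[1, 1]        # slope from regressing D_t on Y_{t-1}
d_tilde = d - gamma * y_prev     # residual, up to a constant
predicted = beta + gamma * sigma2_eps / d_tilde.var(ddof=1)

print(f"gamma = {gamma:.3f}")
print(f"lagged-outcome estimate = {b_ldv:.3f}, formula = {predicted:.3f}, true beta = {beta}")
```

Under these assumptions the estimated $\gamma$ is negative, and the coefficient on treatment falls below the true effect by roughly the amount the formula predicts.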

Suppose instead that treatment is determined by low $Y_{it-1}$. The correct specification is a simplified version of (5.3.3), say

$$Y_{it} = \alpha + \theta Y_{it-1} + \beta D_{it} + \varepsilon_{it}, \tag{5.4.2}$$

where $\varepsilon_{it}$ is serially uncorrelated. You mistakenly estimate a first-differenced equation in an effort to kill fixed effects. This ignores lagged dependent variables. In this simple example, where $D_{it-1} = 0$ for everyone, the first-differenced estimator has probability limit

$$\frac{\operatorname{Cov}(Y_{it} - Y_{it-1}, D_{it} - D_{it-1})}{V(D_{it} - D_{it-1})} = \frac{\operatorname{Cov}(Y_{it} - Y_{it-1}, D_{it})}{V(D_{it})}. \tag{5.4.3}$$

Subtracting $Y_{it-1}$ from both sides of (5.4.2), we have

$$Y_{it} - Y_{it-1} = \alpha + (\theta - 1) Y_{it-1} + \beta D_{it} + \varepsilon_{it}.$$

Substituting this in (5.4.3), the inappropriately differenced model yields

$$\frac{\operatorname{Cov}(Y_{it} - Y_{it-1}, D_{it})}{V(D_{it})} = \beta + (\theta - 1) \frac{\operatorname{Cov}(Y_{it-1}, D_{it})}{V(D_{it})}.$$

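In general, we think $\theta$ is a number between zero and one, otherwise $Y_{it}$ is non-stationary (i.e., an explosive time series process). Therefore, since trainees have low $Y_{it-1}$, both $(\theta - 1)$ and $\operatorname{Cov}(Y_{it-1}, D_{it})$ are negative, their product is positive, and the estimate of $\beta$ in first differences is too big.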
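The upward bias from differencing can be checked in the same way. The sketch below is again illustrative: the values of the intercept, the lag coefficient, and the causal effect, the normal distributions, and the rule that assigns treatment to people with a low lagged outcome are assumptions made for the example. It generates data from the lagged-dependent-variable model, computes the first-differenced estimator, and compares it with the expression derived above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha, theta, beta = 0.0, 0.5, 1.0   # illustrative values, with 0 < theta < 1

y_prev = rng.normal(size=n)          # lagged outcome Y_{it-1}

# Assumed assignment rule: people with low Y_{it-1} are more likely to be trained.
d = (y_prev + rng.normal(size=n) < 0).astype(float)

eps = rng.normal(size=n)
y = alpha + theta * y_prev + beta * d + eps   # the lagged-dependent-variable model

# First-differenced estimator; D_{it-1} = 0 for everyone, so the change in D is D_it.
dy = y - y_prev
c_fd = np.cov(dy, d)
b_fd = c_fd[0, 1] / c_fd[1, 1]

# Derived expression: beta + (theta - 1) * Cov(Y_{it-1}, D_it) / V(D_it).
c_lag = np.cov(y_prev, d)
predicted = beta + (theta - 1.0) * c_lag[0, 1] / c_lag[1, 1]

print(f"Cov(Y_prev, D) = {c_lag[0, 1]:.3f}")
print(f"first-difference estimate = {b_fd:.3f}, formula = {predicted:.3f}, true beta = {beta}")
```

Under these assumptions the covariance between the lagged outcome and treatment is negative, so the first-differenced estimate lands above the true effect, in line with the formula.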

