# Appendix: Bias from Measurement Error

You’ve dreamed of running the regression

$$Y_i = \alpha + \beta S_i^* + e_i, \tag{6.6}$$

but data on $S_i^*$, the regressor of your dreams, are unavailable. You see only a mismeasured version, $S_i$. Write the relationship between observed and desired regressors as

$$S_i = S_i^* + m_i, \tag{6.7}$$

where $m_i$ is the measurement error in $S_i$. To simplify, assume errors average to zero and are uncorrelated with $S_i^*$ and the residual, $e_i$. Then we have

$$E[m_i] = 0,$$

$$C(S_i^*, m_i) = C(e_i, m_i) = 0.$$

These assumptions describe classical measurement error (jazzier forms of measurement error may rock your regression coefficients even more).

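The regression coefficient you’re after, $\beta$ in equation (6.6), is given by

$$\beta = \frac{C(Y_i, S_i^*)}{V(S_i^*)}.$$

Using the mismeasured regressor, $S_i$, instead of $S_i^*$, you get

$$\beta_b = \frac{C(Y_i, S_i)}{V(S_i)}, \tag{6.8}$$

where $\beta_b$ has a subscript “b” as a reminder that this coefficient is biased.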

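To see why $\beta_b$ is a biased version of the coefficient you’re after, use equations (6.6) and (6.7) to substitute for $Y_i$ and $S_i$ in the numerator of equation (6.8):

$$\begin{aligned}
\beta_b &= \frac{C(Y_i, S_i)}{V(S_i)} \\
&= \frac{C(\alpha + \beta S_i^* + e_i,\; S_i^* + m_i)}{V(S_i)} \\
&= \frac{C(\alpha + \beta S_i^* + e_i,\; S_i^*)}{V(S_i)} = \beta \frac{V(S_i^*)}{V(S_i)}.
\end{aligned}$$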

The next-to-last equals sign here uses the assumption that measurement error, $m_i$, is uncorrelated with $S_i^*$ and $e_i$; the last equals sign uses the fact that $S_i^*$ is uncorrelated with a constant and with $e_i$, since the latter is a residual from a regression on $S_i^*$. We’ve also used the fact that the covariance of $S_i^*$ with itself is its variance (see the appendix to Chapter 2 for an explanation of these and related properties of variance and covariance).

We’ve assumed that $m_i$ is uncorrelated with $S_i^*$. Because the variance of the sum of uncorrelated variables is the sum of their variances, this implies

$$V(S_i) = V(S_i^*) + V(m_i),$$

which means we can write

$$\beta_b = r\beta, \tag{6.9}$$

where

$$r = \frac{V(S_i^*)}{V(S_i)} = \frac{V(S_i^*)}{V(S_i^*) + V(m_i)}$$

is a number between zero and one.

The fraction $r$ describes the proportion of variation in $S_i$ that is unrelated to mistakes and is called the reliability of $S_i$. Reliability determines the extent to which measurement error attenuates $\beta_b$. The attenuation bias in $\beta_b$ is

$$\beta_b - \beta = -(1 - r)\beta,$$

so that $\beta_b$ is smaller than (a positive) $\beta$ unless $r = 1$, and there’s no measurement error after all.
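The attenuation result is easy to check by simulation. The sketch below is an illustration only, not part of the original argument: the parameter values (a true coefficient of 0.10, a standard deviation of 2 for accurately measured schooling, and a standard deviation of 1 for the measurement error) and the function names are assumptions chosen for convenience. With these values, reliability is 4/(4 + 1) = 0.8, so the coefficient from the mismeasured regression should land near 0.08.

```python
import random


def cov(x, y):
    """Covariance of two equal-length lists (dividing by n)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def simulate_attenuation(beta=0.10, sd_true=2.0, sd_err=1.0, n=100_000, seed=0):
    """Simulate classical measurement error and return (beta_b, r).

    Every parameter value here is an illustrative assumption.
    """
    rng = random.Random(seed)
    s_star = [rng.gauss(12.0, sd_true) for _ in range(n)]  # true regressor S*
    m = [rng.gauss(0.0, sd_err) for _ in range(n)]         # measurement error m
    e = [rng.gauss(0.0, 0.5) for _ in range(n)]            # residual e
    s = [a + b for a, b in zip(s_star, m)]                 # observed S = S* + m
    y = [1.0 + beta * a + c for a, c in zip(s_star, e)]    # Y = alpha + beta*S* + e

    beta_b = cov(y, s) / cov(s, s)        # regression of Y on mismeasured S
    r = cov(s_star, s_star) / cov(s, s)   # reliability: V(S*) / V(S)
    return beta_b, r


beta_b, r = simulate_attenuation()
print(f"reliability r          = {r:.3f}")
print(f"coefficient on S       = {beta_b:.4f}")
print(f"r times the true beta  = {r * 0.10:.4f}")
```

Raising `sd_err` lowers reliability and pulls the estimated coefficient further toward zero, while setting it to zero restores the true coefficient.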

In Section 6.1, we noted that the addition of covariates to a model with mismeasured regressors tends to exacerbate attenuation bias. The Twinsburg story told in Section 6.2 is a special case of this, where the covariates are dummies for families in samples of twins. To see why covariates increase attenuation bias, suppose the regression of interest is

$$Y_i = \alpha + \beta S_i^* + \gamma X_i + e_i, \tag{6.10}$$

where $X_i$ is a control variable, perhaps IQ or another test score. We know from regression anatomy that the coefficient on $S_i^*$ in this model is given by

$$\beta = \frac{C(Y_i, \tilde{S}_i^*)}{V(\tilde{S}_i^*)},$$

where $\tilde{S}_i^*$ is the residual from a regression of $S_i^*$ on $X_i$. Likewise, replacing $S_i^*$ with $S_i$, the coefficient on $S_i$ becomes

$$\beta_b = \frac{C(Y_i, \tilde{S}_i)}{V(\tilde{S}_i)},$$

where $\tilde{S}_i$ is the residual from a regression of $S_i$ on $X_i$.

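Add the (classical) assumption that measurement error, $m_i$, is uncorrelated with the covariate, $X_i$. Then the coefficient from a regression of mismeasured $S_i$ on $X_i$ is the same as the coefficient from a regression of $S_i^*$ on $X_i$ (use the properties of covariance and the definition of a regression coefficient to see this). This in turn implies that

$$\tilde{S}_i = \tilde{S}_i^* + m_i,$$

where $m_i$ and $\tilde{S}_i^*$ are uncorrelated, so that

$$V(\tilde{S}_i) = V(\tilde{S}_i^*) + V(m_i).$$

Applying the logic used to establish equation (6.9), we get

$$\beta_b = \frac{C(Y_i, \tilde{S}_i)}{V(\tilde{S}_i)} = \frac{V(\tilde{S}_i^*)}{V(\tilde{S}_i^*) + V(m_i)}\,\beta = \tilde{r}\beta, \tag{6.11}$$

where

$$\tilde{r} = \frac{V(\tilde{S}_i^*)}{V(\tilde{S}_i^*) + V(m_i)}.$$

Like $r$, this lies between zero and one.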

What’s new here? The variance of $\tilde{S}_i^*$ is necessarily reduced relative to that of $S_i^*$, because the variance of $\tilde{S}_i^*$ is the variance of a residual from a regression model in which $S_i^*$ is the dependent variable. Since $V(\tilde{S}_i^*) < V(S_i^*)$, we also have

$$\tilde{r} = \frac{V(\tilde{S}_i^*)}{V(\tilde{S}_i^*) + V(m_i)} < \frac{V(S_i^*)}{V(S_i^*) + V(m_i)} = r.$$

This explains why adding covariates to a model with mismeasured schooling aggravates attenuation bias in estimates of the returns to schooling. Intuitively, this aggravation is a consequence of the fact that covariates are correlated with accurately measured schooling while being unrelated to mistakes. The regression-anatomy operation that removes the influence of covariates therefore reduces the information content of a mismeasured regressor while leaving the noise component, the mistakes, unchanged (test your understanding of the formal argument here by deriving equation (6.11)). This argument carries over to the differencing operation used to purge ability from equation (6.4): differencing across twins removes some of the signal in schooling, while leaving the variance of the noise unchanged.
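A small simulation makes the aggravation visible. The sketch below is illustrative only: the data-generating values (a covariate with unit variance, accurately measured schooling that loads on the covariate with a coefficient of 1.5 plus unit-variance noise, unit-variance measurement error) and the helper names `cov` and `residualize` are assumptions, not anything taken from the chapter. Under these assumptions, reliability without the covariate is 3.25/4.25, about 0.76, while reliability after partialing out the covariate is 1/(1 + 1) = 0.5, so the coefficient on mismeasured schooling in the model with the control should be about half the true value of 0.10.

```python
import random


def cov(x, y):
    """Covariance of two equal-length lists (dividing by n)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def residualize(v, x):
    """Residuals from a regression of v on x and a constant."""
    n = len(v)
    mean_v = sum(v) / n
    mean_x = sum(x) / n
    slope = cov(v, x) / cov(x, x)
    return [(a - mean_v) - slope * (b - mean_x) for a, b in zip(v, x)]


def simulate_with_covariate(beta=0.10, gamma=0.05, n=100_000, seed=1):
    """Return (r, r_tilde, beta_b) under assumed, illustrative parameters."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]                   # covariate X
    s_star = [12.0 + 1.5 * xi + rng.gauss(0.0, 1.0) for xi in x]  # S*, correlated with X
    m = [rng.gauss(0.0, 1.0) for _ in range(n)]                   # error, unrelated to X
    s = [a + b for a, b in zip(s_star, m)]                        # observed S = S* + m
    y = [1.0 + beta * a + gamma * xi + rng.gauss(0.0, 0.5)
         for a, xi in zip(s_star, x)]                             # equation (6.10)

    s_tilde = residualize(s, x)            # mismeasured schooling net of X
    s_star_tilde = residualize(s_star, x)  # true schooling net of X

    r = cov(s_star, s_star) / cov(s, s)
    r_tilde = cov(s_star_tilde, s_star_tilde) / cov(s_tilde, s_tilde)
    beta_b = cov(y, s_tilde) / cov(s_tilde, s_tilde)  # regression anatomy
    return r, r_tilde, beta_b


r, r_tilde, beta_b = simulate_with_covariate()
print(f"reliability without covariate = {r:.3f}")
print(f"reliability with covariate    = {r_tilde:.3f}")
print(f"coefficient on S given X      = {beta_b:.4f}")
```

The stronger the link between the covariate and accurately measured schooling, the more signal the residualizing step removes and the smaller the second reliability becomes.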

## IV Clears Our Path

Without covariates, the IV formula for the coefficient on $S_i$ in a bivariate regression is

$$\beta_{IV} = \frac{C(Y_i, Z_i)}{C(S_i, Z_i)}, \tag{6.12}$$

where $Z_i$ is the instrument. In Section 6.2, for example, we used cross-sibling reports to instrument for possibly mismeasured self-reported schooling. Provided the instrument is uncorrelated with the measurement error and the residual, $e_i$, in equations like (6.6), IV eliminates the bias due to mismeasured $S_i$.

To see why IV works in this context, use equations (6.6) and (6.7) to substitute for $Y_i$ and $S_i$ in equation (6.12):

$$\begin{aligned}
\beta_{IV} &= \frac{C(Y_i, Z_i)}{C(S_i, Z_i)} = \frac{C(\alpha + \beta S_i^* + e_i,\; Z_i)}{C(S_i^* + m_i,\; Z_i)} \\
&= \frac{\beta C(S_i^*, Z_i) + C(e_i, Z_i)}{C(S_i^*, Z_i) + C(m_i, Z_i)}.
\end{aligned}$$

Our discussion of the mistakes in Karen and Sharon’s reports of one another’s schooling assumes that $C(e_i, Z_i) = C(m_i, Z_i) = 0$. This in turn implies that

$$\beta_{IV} = \beta \frac{C(S_i^*, Z_i)}{C(S_i^*, Z_i)} = \beta.$$

This happy conclusion comes from our assumption that the only reason $Z_i$ is correlated with wages is because it’s correlated with $S_i^*$. Since $S_i = S_i^* + m_i$, and $m_i$ is unrelated to $Z_i$, the usual IV miracle goes through.
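The IV fix can also be seen in a simulation. The sketch below is a simplified illustration under assumed values, not a reconstruction of the twins analysis: the instrument is modeled as a second report of the same true schooling with its own independent error, and every parameter and function name is hypothetical. With a true coefficient of 0.10 and reliability of 0.8, OLS on the mismeasured regressor should come out near 0.08, while the IV ratio should come out near 0.10.

```python
import random


def cov(x, y):
    """Covariance of two equal-length lists (dividing by n)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def simulate_iv(beta=0.10, n=100_000, seed=2):
    """Return (beta_ols, beta_iv) under assumed, illustrative parameters."""
    rng = random.Random(seed)
    s_star = [rng.gauss(12.0, 2.0) for _ in range(n)]           # true schooling S*
    s = [a + rng.gauss(0.0, 1.0) for a in s_star]               # mismeasured report S
    z = [a + rng.gauss(0.0, 1.0) for a in s_star]               # second report Z, independent error
    y = [1.0 + beta * a + rng.gauss(0.0, 0.5) for a in s_star]  # Y = alpha + beta*S* + e

    beta_ols = cov(y, s) / cov(s, s)  # attenuated by measurement error
    beta_iv = cov(y, z) / cov(s, z)   # ratio of covariances with the instrument
    return beta_ols, beta_iv


beta_ols, beta_iv = simulate_iv()
print(f"OLS on mismeasured S = {beta_ols:.4f}")
print(f"IV using Z           = {beta_iv:.4f}")
```

The IV ratio works here because the error in the second report is drawn independently of both the error in the first report and the residual; if the two reports shared a common mistake, the covariance of the measurement error with the instrument would no longer be zero and the bias would return.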

PO: That is severely cool.

Kung Fu Panda 2

1 See “‘I’m Just a Late Bloomer’: Britain’s Oldest Student Graduates with a Degree in Military Intelligence Aged 91,” The Daily Mail, May 21, 2012.

– Mincer’s work appears in his landmark book, Schooling, Experience, and Earnings, Columbia University Press and the National Bureau of Economic Research, 1974.

– The relationship between experience and earnings described by these estimates reflects a gradual decline in earnings growth with age. To see this, suppose we increase $X_i$ from a value $x$ to $x + 1$. The term $X_i$ increases by 1, while $X_i^2$ increases by

$$(x + 1)^2 - x^2 = 2x + 1.$$

The net effect of a 1-year experience increase is therefore

$$(.081 \times 1) - (.0012 \times (2x + 1)) = .0798 - .0024x.$$

The first year of experience is therefore estimated to boost earnings by almost 8% while the tenth year of experience increases earnings by only about 5.6%. In fact, the experience profile, as this relationship is called, flattens out completely after about 30 years of experience.

i Zvi Griliches, “Estimating the Returns to Schooling: Some Econometric Problems,” Econometrica, vol. 45, no. 1, January 1977, pages 1-22.

– Attentive readers will notice that potential experience, itself a downstream consequence of schooling, also falls under the category of bad control. In principle, the bias here can be removed by using age and its square to instrument potential experience and its square. As in the studies referenced in the rest of this chapter, we might also simply replace the experience control with age, thereby targeting a net schooling effect that does not adjust for differences in potential experience.

– Orley Ashenfelter and Alan B. Krueger, “Estimates of the Economic Returns to Schooling from a New Sample of Twins,” American Economic Review, vol. 84, no. 5, December 1994, pages 1157-1173, and Orley Ashenfelter and Cecilia Rouse, “Income, Schooling, and Ability: Evidence from a New Sample of Identical Twins,” Quarterly Journal of Economics, vol. 113, no. 1, February 1998, pages 253-284.

1 Estimates of this differenced model can also be obtained by adding a dummy for each family to an undifferenced model fit in a sample that includes both twins. Family dummies are like selectivity-group dummies in equation (2.2) in Chapter 2 and state dummies in equation (5.5) in Section 5.2. With only two observations per family, models estimated after differencing across twins within families to produce a single observation per family generate estimates of the returns to schooling identical to those generated by “dummying out” each family in a pooled sample that includes both twins.

– Daron Acemoglu and Joshua D. Angrist, “How Large Are Human-Capital Externalities? Evidence from Compulsory-Schooling Laws,” in Ben S. Bernanke and Kenneth Rogoff (editors), NBER Macroeconomics Annual 2000, vol. 15, MIT Press, 2001, pages 9-59.

– Joshua D. Angrist and Alan B. Krueger, “Does Compulsory School Attendance Affect Schooling and Earnings?” Quarterly Journal of Economics, vol. 106, no. 4, November 1991, pages 979-1014.

– Kasey Buckles and Daniel M. Hungerman, “Season of Birth and Later Outcomes: Old Questions, New Answers,” NBER Working Paper 14573, National Bureau of Economic Research, December 2008. See also John Bound, David A. Jaeger, and Regina M. Baker, who were the first to caution that IV estimates using QOB instruments might not have a causal interpretation in “Problems with Instrumental Variables Estimation When the Correlation between the Instruments and the Endogeneous Explanatory Variable Is Weak,” Journal of the American Statistical Association, vol. 90, no. 430, June 1995, pages 443-450.

– For more on this point, see Joshua D. Angrist and Alan B. Krueger, “The Effect of Age at School Entry on Educational Attainment: An Application of Instrumental Variables with Moments from Two Samples,” Journal of the American Statistical Association, vol. 87, no. 418, June 1992, pages 328-336.

– Damon Clark and Paco Martorell, “The Signaling Value of a High School Diploma,” Journal of Political Economy, vol. 122, no. 2, April 2014, pages 282-318.