Category Mostly Harmless Econometrics: An Empiricist’s Companion

Generalizing LATE

The LATE theorem applies to a stripped-down causal model where a single dummy instrument is used to estimate the impact of a dummy treatment with no covariates. We can generalize this in three important ways: multiple instruments (e.g., a set of quarter-of-birth dummies), models with covariates (e.g., controls for year of birth), and models with variable and continuous treatment intensity (e.g., years of schooling). In all three cases, the IV estimand is a weighted average of causal effects for instrument-specific compliers. The econometric tool remains 2SLS and the interpretation remains fundamentally similar to the basic LATE result, with a few bells and whistles...

Heterogeneity and Nonlinearity

As we saw in the previous section, a linear causal model in combination with the CIA leads to a linear CEF with a causal interpretation. Assuming the CEF is linear, the population regression is it. In practice, however, the assumption of a linear CEF is not really necessary for a causal interpretation of regression. For one thing, as discussed in Section 3.1.2, we can think of the regression of Yi on Xi and Si as providing the best linear approximation to the underlying CEF, regardless of its shape. Therefore, if the CEF is causal, the fact that regression approximates it gives regression coefficients a causal flavor. This claim is a little vague, however, and the nature of the link between regression and the CEF is worth exploring further...
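The approximation property can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes a single regressor (schooling, with no covariates), a hypothetical nonlinear CEF E[Yi|Si] = log(Si), and invented sample sizes and noise levels. It shows that the slope from regressing Yi on Si in the micro data equals the slope from a count-weighted regression of the cell means of Yi on Si, so regression fits a line to the CEF whatever the CEF's shape.

```python
import math
import random
from collections import defaultdict

def ols_slope(x, y, w=None):
    """Weighted bivariate OLS slope of y on x (weights default to 1)."""
    w = w or [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    return cov / var

rng = random.Random(0)

# Hypothetical data: schooling takes values 8..16 and the CEF is log(S)
s = [rng.randint(8, 16) for _ in range(5000)]
y = [math.log(si) + rng.gauss(0, 0.5) for si in s]

# Empirical CEF: the mean of Y within each schooling cell
cells = defaultdict(list)
for si, yi in zip(s, y):
    cells[si].append(yi)
levels = sorted(cells)
cef = [sum(cells[v]) / len(cells[v]) for v in levels]
counts = [len(cells[v]) for v in levels]

slope_micro = ols_slope(s, y)              # regression on individual data
slope_cef = ols_slope(levels, cef, counts) # regression on cell means

print(slope_micro, slope_cef)
```

The two slopes agree up to floating-point error even though the simulated CEF is not linear.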

Differences-in-differences: Pre and Post, Treatment and Control

The fixed effects strategy requires panel data, that is, repeated observations on the same individuals (or firms or whatever the unit of observation might be). Often, however, the regressor of interest varies only at a more aggregate level such as state or cohort. For example, state policies regarding health care benefits for pregnant workers or minimum wages change across states but not within states. The source of omitted variables bias when evaluating these policies must therefore be unobserved variables at the state and year level.

To make this concrete, suppose we are interested in the effect of the minimum wage on employment, a classic question in labor economics. In a competitive labor market, increases in the minimum wage move us up a downward-sloping demand curve...

The Wald Estimator

The simplest IV estimator uses a single binary (0-1) instrument to estimate a model with one endogenous regressor and no covariates. Without covariates, the causal regression model is

$$Y_i = \alpha + \rho S_i + \eta_i, \qquad (4.1.11)$$

where $\eta_i$ and $S_i$ may be correlated. Given the further simplification that $Z_i$ is a dummy variable that equals 1 with probability $p$, we can easily show that

$$\mathrm{Cov}(Y_i, Z_i) = \{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]\}\, p(1 - p),$$

with an analogous formula for $\mathrm{Cov}(S_i, Z_i)$. It therefore follows that

$$\rho = \frac{\mathrm{Cov}(Y_i, Z_i)}{\mathrm{Cov}(S_i, Z_i)} = \frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[S_i \mid Z_i = 1] - E[S_i \mid Z_i = 0]}. \qquad (4.1.12)$$

A direct route to this result uses (4.1.11) and the fact that $E[\eta_i \mid Z_i] = 0$, so we have

$$E[Y_i \mid Z_i] = \alpha + \rho E[S_i \mid Z_i]. \qquad (4.1.13)$$

Solving this equation for $\rho$ produces (4.1.12).
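The sample version of this ratio is easy to compute. Below is a minimal simulation sketch under invented assumptions: the true effect `rho = 0.1`, the unobserved `ability` term that makes schooling endogenous, the instrument probability of 0.5, and all other parameter values are hypothetical and chosen only for illustration. It compares the ratio of differences in means with the ratio of covariances, and with the OLS slope for contrast.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 20000
rho = 0.1  # hypothetical true causal effect of schooling

z, s, y = [], [], []
for _ in range(n):
    zi = 1 if rng.random() < 0.5 else 0      # binary instrument, p = 0.5
    ability = rng.gauss(0, 1)                # unobserved; affects both S and Y
    si = 12 + zi + ability + rng.gauss(0, 1)
    yi = 1.0 + rho * si + 0.5 * ability + rng.gauss(0, 0.3)
    z.append(zi)
    s.append(si)
    y.append(yi)

# Differences in means by instrument status
y1 = [yi for yi, zi in zip(y, z) if zi == 1]
y0 = [yi for yi, zi in zip(y, z) if zi == 0]
s1 = [si for si, zi in zip(s, z) if zi == 1]
s0 = [si for si, zi in zip(s, z) if zi == 0]

wald = (mean(y1) - mean(y0)) / (mean(s1) - mean(s0))
iv_ratio = cov(y, z) / cov(s, z)   # covariance form of the same estimator
ols = cov(y, s) / cov(s, s)        # biased here because ability is omitted

print(wald, iv_ratio, ols)
```

In this setup the two IV expressions coincide exactly in the sample, since the factor p(1 - p) cancels from numerator and denominator, while the OLS slope is pulled away from `rho` by the omitted ability term.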

Equation (4.1.12) is the population analog of the landmark Wald (1940) estimator for a bivar...

Tricky Points

The language of conditional quantiles is tricky. Sometimes we talk about "quantile regression coefficients at the median," or "effects on those at the lower decile." But it’s important to remember that quantile coefficients tell us about effects on distributions and not on individuals. If we discover, for example, that a training program raises the lower decile of the wage distribution, this does not necessarily mean that someone who would have been poor (i.e., at the lower decile without training) is now less poor. It only means that those who are poor in the regime with training are less poor than the poor would be in a regime without training.
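A small simulation makes the point concrete. This is an illustrative sketch only: the lognormal wage distribution, the assumption that training raises wages by 40 percent for a random half of people and by nothing for the rest, and the names `y0`, `y1`, and `gain` are all hypothetical. The lower decile of the wage distribution rises with training, yet about half of the people who would have been poor without training gain nothing.

```python
import random

def quantile(values, q):
    """Empirical q-quantile: the value at position int(q * n) in sorted order."""
    ordered = sorted(values)
    return ordered[int(q * len(ordered))]

rng = random.Random(2)
n = 10000

# Wages without training (hypothetical lognormal distribution)
y0 = [rng.lognormvariate(2.5, 0.5) for _ in range(n)]

# Hypothetical heterogeneous effect: half gain 40 percent, half gain nothing
gain = [0.4 if rng.random() < 0.5 else 0.0 for _ in range(n)]
y1 = [w * (1 + g) for w, g in zip(y0, gain)]

q0 = quantile(y0, 0.10)  # lower decile without training
q1 = quantile(y1, 0.10)  # lower decile with training

# People at or below the lower decile in the no-training regime
poor_without = [i for i in range(n) if y0[i] <= q0]
share_no_gain = sum(1 for i in poor_without if y1[i] <= y0[i]) / len(poor_without)

print(q0, q1, share_no_gain)
```

The comparison of `q0` and `q1` describes the two distributions; `share_no_gain` describes individuals, and only the simulation can compute it because both potential wages are visible there.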

The distinction between making a given set of poor people richer and changing what it means to be poor is subtle...

LATE with Multiple Instruments

The multiple-instruments extension is easy to see. This is essentially the same as a result we discussed in the grouped-data context. Consider a pair of dummy instruments, Z1i and Z2i. Without loss of generality, assume these dummies are mutually exclusive (if not, then we can work with a mutually exclusive set of three dummies, Z1i(1 - Z2i), Z2i(1 - Z1i), and Z1iZ2i). The two dummies can be used to construct Wald estimators. Again, without loss of generality, assume monotonicity is satisfied for each with a positive first stage (if not, we can recode the dummies so this is true). Both therefore estimate a version of E[Y1i - Y0i | D1i > D0i], though the population with D1i > D0i differs for Z1i and Z2i.

Instead of Wald estimators, we can use Z1i and Z2i together in a 2SLS procedure...
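As a rough numerical illustration of how the two approaches relate, the sketch below simulates two mutually exclusive dummy instruments and checks that 2SLS using both equals a weighted average of the two single-instrument Wald estimators. Everything in it is an assumption made for illustration: the take-up rule, the effect sizes, the 20 percent instrument shares, and the names `rho1`, `rho2`, and `psi`. The weight `psi` follows from the linearity of covariance applied to the first-stage fitted values; it is worked out here rather than quoted from the excerpt.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance (divides by n)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(3)
n = 30000
z1, z2, d, y = [], [], [], []
for _ in range(n):
    r = rng.random()
    a = 1 if r < 0.2 else 0            # Z1 = 1 for 20 percent of the sample
    b = 1 if 0.2 <= r < 0.4 else 0     # Z2 = 1 for another 20 percent, never both
    u = rng.random()                   # unobserved resistance to treatment
    # Hypothetical take-up rule: each instrument raises the take-up threshold
    treated = 1 if u < 0.2 + 0.4 * a + 0.6 * b else 0
    effect = 2.0 - 2.0 * u             # causal effect varies across individuals
    outcome = u + treated * effect + rng.gauss(0, 1)
    z1.append(a)
    z2.append(b)
    d.append(treated)
    y.append(outcome)

# Wald (IV) estimators using one dummy at a time
rho1 = cov(y, z1) / cov(d, z1)
rho2 = cov(y, z2) / cov(d, z2)

# First stage of D on a constant, Z1 and Z2. With mutually exclusive dummies
# the coefficients are differences in mean take-up relative to the omitted group.
base = mean([di for di, a, b in zip(d, z1, z2) if a == 0 and b == 0])
pi1 = mean([di for di, a in zip(d, z1) if a == 1]) - base
pi2 = mean([di for di, b in zip(d, z2) if b == 1]) - base
d_hat = [base + pi1 * a + pi2 * b for a, b in zip(z1, z2)]

# 2SLS: regress Y on the first-stage fitted values
tsls = cov(y, d_hat) / cov(d_hat, d_hat)

# Weight on the Z1 Wald estimator implied by the first stage
psi = pi1 * cov(d, z1) / (pi1 * cov(d, z1) + pi2 * cov(d, z2))
combo = psi * rho1 + (1 - psi) * rho2

print(rho1, rho2, tsls, combo, psi)
```

In this simulation `tsls` and `combo` agree up to floating-point error, and `psi` lies between 0 and 1, so the instrument with the stronger first stage gets more weight.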
