Category Mostly Harmless Econometrics: An Empiricist’s Companion

Instrumental Variables in Action: Sometimes You Get What You Need

Anything that happens, happens.

Anything that, in happening, causes something else to happen, causes something else to happen.

Anything that, in happening, causes itself to happen again, happens again.

It doesn’t necessarily do it in chronological order, though.

Douglas Adams, Mostly Harmless (1995)

Two things distinguish the discipline of Econometrics from our older sister field of Statistics. One is a lack of shyness about causality. Causal inference has always been the name of the game in applied econometrics. Statistician Paul Holland (1986) cautions that there can be “no causation without manipulation,” a maxim that would seem to rule out causal inference from non-experimental data. Less thoughtful observers fall back on the truism that “correlation is not causality...


The Quantile Regression Model

The starting point for quantile regression is the conditional quantile function (CQF). Suppose we are interested in the distribution of a continuously-distributed random variable, $Y_i$, with a well-behaved density (no gaps or spikes). Then the CQF at quantile $\tau$ given a vector of regressors, $X_i$, can be defined as:

$$Q_\tau(Y_i \mid X_i) = F_y^{-1}(\tau \mid X_i)$$

where $F_y(y \mid X_i)$ is the distribution function for $Y_i$ conditional on $X_i$. When $\tau = .10$, for example, $Q_\tau(Y_i \mid X_i)$ describes the lower decile of $Y_i$ given $X_i$, while $\tau = .5$ gives us the conditional median. By looking at changes in the CQF of earnings as a function of education, we can tell whether the dispersion in earnings goes up or down with schooling...
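The definition of the CQF as an inverse conditional distribution function can be illustrated with a minimal sketch. Everything in it is assumed for illustration only: the simulated data, the two schooling levels, and the helper name `conditional_quantile` do not come from the text. With a discrete regressor, the empirical CQF is just the inverted empirical CDF within each cell.

```python
import numpy as np

def conditional_quantile(y, x, x_value, tau):
    """Empirical CQF: smallest y whose conditional CDF given x == x_value is >= tau."""
    y_cell = np.sort(y[x == x_value])
    # Empirical conditional CDF evaluated at each sorted observation.
    cdf = np.arange(1, y_cell.size + 1) / y_cell.size
    return y_cell[np.searchsorted(cdf, tau, side="left")]

# Hypothetical data: log earnings whose dispersion rises with schooling.
rng = np.random.default_rng(0)
n = 10_000
schooling = rng.choice([12, 16], size=n)
scale = 0.1 + 0.05 * (schooling - 12)
log_earnings = 1.0 + 0.1 * schooling + scale * rng.normal(size=n)

spread_90_10 = {}
for s in (12, 16):
    q10 = conditional_quantile(log_earnings, schooling, s, 0.10)
    q50 = conditional_quantile(log_earnings, schooling, s, 0.50)
    q90 = conditional_quantile(log_earnings, schooling, s, 0.90)
    spread_90_10[s] = q90 - q10
    print(f"schooling={s}: Q.10={q10:.3f}, median={q50:.3f}, Q.90={q90:.3f}")
```

In this simulated example the gap between the upper and lower conditional deciles widens with schooling, which is the kind of change in dispersion the CQF is meant to reveal.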


IV in Randomized Trials

The language of the LATE framework is based on an analogy between IV and randomized trials. But some instruments really come from randomized trials. If the instrument is a randomly assigned offer of treatment, then LATE is the effect of treatment on those who comply with the offer but are not treated otherwise. An especially important case is when the instrument is generated by a randomized trial with one-sided non-compliance. In many randomized trials, participation is voluntary among those randomly assigned to receive treatment. On the other hand, no one in the control group has access to the experimental intervention. Since the group that receives (i.e...
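A small simulation may help fix ideas about a randomized offer with one-sided non-compliance. It is a sketch under assumed values, not an example from the text: the sample size, the 60% compliance rate, the treatment effect of 2.0, and the selection pattern are all hypothetical. The IV (Wald) estimate is the intention-to-treat effect divided by the difference in treatment rates between offer and control groups.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = 2.0                       # hypothetical effect of treatment on compliers

z = rng.integers(0, 2, size=n)          # randomly assigned offer of treatment
complier = rng.random(n) < 0.6          # would take treatment if offered
d = z * complier                        # one-sided: nobody in the control group is treated

# Compliers differ from non-compliers even without treatment (selection).
y0 = rng.normal(size=n) + 1.0 * complier
y = y0 + true_effect * d

itt = y[z == 1].mean() - y[z == 0].mean()          # intention-to-treat effect
compliance = d[z == 1].mean() - d[z == 0].mean()   # first stage
wald = itt / compliance                            # IV estimate using the offer as instrument
naive = y[d == 1].mean() - y[d == 0].mean()        # treated vs. untreated comparison

print(f"ITT={itt:.3f}, compliance={compliance:.3f}, Wald={wald:.3f}, naive={naive:.3f}")
```

Under these assumptions the Wald ratio recovers the effect on compliers, while the naive treated-versus-untreated contrast is contaminated by selection into participation.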


Appendix: Derivation of the simple Moulton factor




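Stack the observations by group, so that the regressor matrix and the residual vector are

$$X = \begin{bmatrix} \iota_1 x_1' \\ \iota_2 x_2' \\ \vdots \\ \iota_G x_G' \end{bmatrix}, \qquad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_G \end{bmatrix},$$

where $\iota_g$ is a column vector of $n_g$ ones and $G$ is the number of groups. Note that

$$E(ee') = \Phi = \begin{bmatrix} \Phi_1 & & \\ & \ddots & \\ & & \Phi_G \end{bmatrix}, \qquad \Phi_g = \sigma_e^2 \left[ (1 - \rho) I + \rho\, \iota_g \iota_g' \right],$$

where $\rho$ is the intraclass correlation of the residuals. Now

$$X'X = \sum_g n_g x_g x_g', \qquad X'\Phi X = \sum_g x_g \iota_g' \Phi_g \iota_g x_g'.$$

But

$$x_g \iota_g' \Phi_g \iota_g x_g' = \sigma_e^2 n_g \left[ 1 + (n_g - 1)\rho \right] x_g x_g'.$$

Let $\tau_g = 1 + (n_g - 1)\rho$, so we get

$$X'\Phi X = \sigma_e^2 \sum_g n_g \tau_g x_g x_g'.$$

With this in hand, we can write

$$V(\hat{\beta}) = (X'X)^{-1} X'\Phi X (X'X)^{-1} = \sigma_e^2 \left( \sum_g n_g x_g x_g' \right)^{-1} \left( \sum_g n_g \tau_g x_g x_g' \right) \left( \sum_g n_g x_g x_g' \right)^{-1}.$$

We want to compare this with the standard OLS covariance estimator

$$V_c(\hat{\beta}) = \sigma_e^2 \left( \sum_g n_g x_g x_g' \right)^{-1}.$$

If the group sizes are equal, $n_g = n$ and $\tau_g = \tau = 1 + (n - 1)\rho$, so that

$$V(\hat{\beta}) = \sigma_e^2 \tau \left( \sum_g n x_g x_g' \right)^{-1} \left( \sum_g n x_g x_g' \right) \left( \sum_g n x_g x_g' \right)^{-1} = \sigma_e^2 \tau \left( \sum_g n x_g x_g' \right)^{-1} = \tau V_c(\hat{\beta}),$$

which implies (8.2.4).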
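The equal-group-size result can be checked numerically. The sketch below assumes a block-diagonal, equicorrelated residual covariance matrix (consistent with the factor 1 + (n - 1)ρ used in the derivation) and regressors that are fixed within groups. The specific values of `G`, `n`, `rho`, and `sigma2` are arbitrary illustrative choices.

```python
import numpy as np

G, n = 5, 4                  # number of groups and common group size (illustrative)
rho, sigma2 = 0.3, 2.0       # intraclass correlation and residual variance (illustrative)

rng = np.random.default_rng(0)
x_g = np.column_stack([np.ones(G), rng.normal(size=G)])  # one regressor row per group
X = np.repeat(x_g, n, axis=0)                            # each group's row repeated n times

# Equicorrelated covariance block for one group, stacked into a block-diagonal matrix.
block = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
Phi = np.kron(np.eye(G), block)

XtX_inv = np.linalg.inv(X.T @ X)
V = XtX_inv @ X.T @ Phi @ X @ XtX_inv   # sandwich covariance allowing for clustering
V_c = sigma2 * XtX_inv                  # conventional OLS covariance
tau = 1 + (n - 1) * rho                 # Moulton factor with equal group sizes

print("variance ratios:", np.diag(V) / np.diag(V_c))
print("tau:", tau)
```

With these inputs every element of the sandwich covariance is τ times the corresponding conventional element, so conventional standard errors are too small by a factor of the square root of τ.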

Table 8.1.1: Monte Carlo results for robust standard errors (empirical 5% rejection rates by type of standard error; Panel A: Lots of Heteroskedasticity)


The Omitted Variables Bias Formula

The omitted variables bias (OVB) formula describes the relationship between regression estimates in models with different sets of control variables. This important formula is often motivated by the notion that a longer regression, i.e., one with more controls, such as equation (3.2.9), has a causal interpretation, while a shorter regression does not. The coefficients on the variables included in the shorter regression are therefore said to be "biased." In fact, the OVB formula is a mechanical link between coefficient vectors that applies to short and long regressions whether or not the longer regression is causal...
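The mechanical nature of the link can be seen in a short numerical sketch. The variable names (`wage`, `schooling`, `ability`) and the data-generating process are hypothetical, chosen only for illustration. The check is that the short-regression coefficient equals the long-regression coefficient plus the coefficient on the omitted variable times the coefficient from regressing the omitted variable on the included one.

```python
import numpy as np

def ols(y, *regressors):
    """OLS coefficients with an intercept; the intercept comes first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data: schooling is correlated with an "ability" control.
rng = np.random.default_rng(0)
n = 1_000
ability = rng.normal(size=n)
schooling = 12 + 2 * ability + rng.normal(size=n)
wage = 1.0 + 0.10 * schooling + 0.30 * ability + rng.normal(size=n)

long_coef = ols(wage, schooling, ability)   # long regression: includes the control
short_coef = ols(wage, schooling)           # short regression: omits the control
aux_coef = ols(ability, schooling)          # omitted variable regressed on included one

# OVB formula: short = long + (coefficient on omitted) * (auxiliary slope).
implied_short = long_coef[1] + long_coef[2] * aux_coef[1]
print(f"short={short_coef[1]:.6f}, implied by OVB formula={implied_short:.6f}")
```

The two numbers agree to floating-point precision in any sample, which is the sense in which the formula is an algebraic identity and not a statement about causality.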


Parallel Worlds: Fixed Effects, Differences-in-differences, and Panel Data

The first thing to realize about parallel universes… is that they are not parallel.

Douglas Adams, Mostly Harmless (1995)

The key to causal inference in Chapter 3 is control for observed confounding factors. If important confounders are unobserved, we might try to get at causal effects using IV, as discussed in Chapter 4. Good instruments are hard to find, however, so we’d like to have other tools to deal with unobserved confounders. This chapter considers a variation on the control theme: strategies that use data with a time or cohort dimension to control for unobserved-but-fixed omitted variables. These strategies punt on comparisons in levels, while requiring the counterfactual trend behavior of treatment and control groups to be the same...
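The idea of giving up on comparisons in levels while relying on common trends can be sketched with a two-group, two-period simulation. All numbers here are assumptions made for illustration: a fixed gap of 2.0 between groups, a common time trend of 0.5, and a treatment effect of 1.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000            # observations per group-period cell (illustrative)
true_effect = 1.5     # hypothetical treatment effect

def cell_mean(group_gap, trend, effect):
    """Mean outcome in one group-period cell with unit-variance noise."""
    return np.mean(group_gap + trend + effect + rng.normal(size=n))

# The treated group differs by a fixed amount; both groups share the same trend.
control_pre = cell_mean(0.0, 0.0, 0.0)
control_post = cell_mean(0.0, 0.5, 0.0)
treated_pre = cell_mean(2.0, 0.0, 0.0)
treated_post = cell_mean(2.0, 0.5, true_effect)

levels_gap = treated_post - control_post   # confounded by the fixed group difference
did = (treated_post - treated_pre) - (control_post - control_pre)

print(f"post-period gap in levels={levels_gap:.3f}, differences-in-differences={did:.3f}")
```

The post-period comparison in levels mixes the fixed group difference with the treatment effect, while the double difference removes both the fixed difference and the common trend, provided the trends really are the same.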
