Category Mostly Harmless Econometrics: An Empiricist’s Companion

Appendix: Derivation of the average derivative formula

Begin with the regression of $Y_i$ on $S_i$:

$$\frac{Cov(Y_i, S_i)}{V(S_i)} = \frac{E[h(S_i)(S_i - E[S_i])]}{E[S_i(S_i - E[S_i])]}.$$

Let $\kappa_{-\infty} = \lim_{t \to -\infty} h(t)$. By the fundamental theorem of calculus, we have:

$$h(S_i) = \kappa_{-\infty} + \int_{-\infty}^{S_i} h'(t)\,dt.$$

Substituting for $h(S_i)$, the numerator becomes

$$E[h(S_i)(S_i - E[S_i])] = \int_{-\infty}^{+\infty} \int_{-\infty}^{s} h'(t)(s - E[S_i])g(s)\,dt\,ds,$$

where $g(s)$ is the density of $S_i$ at $s$. Reversing the order of integration, we have

$$\int_{-\infty}^{+\infty} h'(t) \int_{t}^{+\infty} (s - E[S_i])g(s)\,ds\,dt.$$

The inner integral is easily seen to be equal to

$$\mu_t \equiv \{E[S_i \mid S_i > t] - E[S_i \mid S_i < t]\}\{P(S_i > t)[1 - P(S_i > t)]\},$$

which is clearly non-negative. Setting $S_i = Y_i$, the denominator can similarly be shown to be the integral of these weights...
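The weighting result can be checked numerically. The following is a minimal sketch rather than part of the original derivation: it assumes a Gamma(3, 1) density for the regressor and a softplus function for h (both chosen purely for illustration) and uses trapezoid quadrature on a grid. It compares the regression slope with the weighted average of h'(t), where the weight at t is the inner integral of (s - E[S]) g(s) from t upward, and it confirms at one point that this weight matches the conditional-means expression.

```python
import numpy as np

# Grid for the regressor; the Gamma(3, 1) density is an illustrative assumption.
s = np.linspace(0.0, 40.0, 40001)
ds = s[1] - s[0]

def integrate(f):
    """Trapezoid rule over the whole grid."""
    return np.sum((f[1:] + f[:-1]) / 2.0) * ds

def tail_integral(f):
    """Trapezoid integral of f from each grid point t to the upper end of the grid."""
    seg = (f[1:] + f[:-1]) / 2.0 * ds
    return np.concatenate([np.cumsum(seg[::-1])[::-1], [0.0]])

g = 0.5 * s**2 * np.exp(-s)
g = g / integrate(g)          # renormalize so the density integrates to 1 on the grid
mean_s = integrate(s * g)

# Hypothetical CEF h (softplus) and its derivative (logistic function).
h = np.log1p(np.exp(s))
h_prime = 1.0 / (1.0 + np.exp(-s))

# Regression slope: Cov(h(S), S) / V(S).
slope = integrate(h * (s - mean_s) * g) / integrate(s * (s - mean_s) * g)

# Weights: the inner integral of (s - E[S]) g(s) from t upward.
mu = tail_integral((s - mean_s) * g)
weighted_avg_derivative = integrate(h_prime * mu) / integrate(mu)

# The same weight written with conditional means, evaluated at t = s[k] = 2.
k = 2000
p = tail_integral(g)[k]                                   # P(S > t)
upper = tail_integral(s * g)[k] / p                       # E[S | S > t]
lower = (mean_s - tail_integral(s * g)[k]) / (1.0 - p)    # E[S | S < t]
mu_alt = (upper - lower) * p * (1.0 - p)

print("regression slope:           ", slope)
print("weighted average derivative:", weighted_avg_derivative)
print("weight at t = 2 (two forms):", mu[k], mu_alt)
```

The two printed slopes should agree up to quadrature error, and the weights should be non-negative across the grid. Swapping in a different density or a different h only requires changing the lines that define `g`, `h`, and `h_prime`.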


Quantile Regression

Here’s a prayer for you. Got a pencil? . . . ‘Protect me from knowing what I don’t need to know. Protect me from even knowing that there are things to know that I don’t know. Protect me from knowing that I decided not to know about the things I decided not to know about. Amen.’ There’s another prayer that goes with it. ‘Lord, lord, lord. Protect me from the consequences of the above prayer.’

Douglas Adams, Mostly Harmless (1995)

Rightly or wrongly, 95 percent of applied econometrics is concerned with averages. If, for example, a training program raises average earnings enough to offset the costs, we are happy. The focus on averages is partly because obtaining a good estimate of the average causal effect is hard enough...


The Compliant Subpopulation

The LATE framework partitions any population with an instrument into a set of three instrument-dependent subgroups, defined by the manner in which members of the population react to the instrument:

Definition 4.4.1 Compliers. The subpopulation with $D_{1i} = 1$ and $D_{0i} = 0$.

Always-takers. The subpopulation with $D_{1i} = D_{0i} = 1$.

Never-takers. The subpopulation with $D_{1i} = D_{0i} = 0$.

LATE is the effect of treatment on the population of compliers. The term "compliers" comes from an analogy with randomized trials where some experimental subjects comply with the randomly assigned treatment protocol (e.g., take their medicine) but some do not, while some control subjects obtain access to the experimental treatment even though they were not supposed to...
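To make the three subgroups concrete, here is a small simulation sketch. Everything in it is an illustrative assumption rather than something taken from the text: the group shares (50% compliers, 20% always-takers, 30% never-takers), the treatment effects by group, and the randomly assigned binary instrument `z`. It computes the Wald ratio (reduced form divided by first stage) and compares it with the treatment effect assigned to compliers.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical population shares of the three instrument-dependent subgroups.
types = rng.choice(["complier", "always", "never"], size=n, p=[0.5, 0.2, 0.3])

# Randomly assigned binary instrument.
z = rng.integers(0, 2, size=n)

# Potential treatment status: d1 when z = 1, d0 when z = 0.
d0 = (types == "always").astype(int)   # only always-takers are treated without the instrument
d1 = (types != "never").astype(int)    # compliers and always-takers are treated with it
d = np.where(z == 1, d1, d0)

# Assumed treatment effects that differ by subgroup.
effect = np.select([types == "complier", types == "always"], [2.0, 5.0], default=-1.0)

# Observed outcome: untreated outcome plus the effect for those actually treated.
y0 = rng.normal(size=n)
y = y0 + d * effect

reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = d[z == 1].mean() - d[z == 0].mean()
wald = reduced_form / first_stage

print("first stage (approximately the complier share):", first_stage)
print("Wald estimate (approximately the complier effect):", wald)
```

In this setup the Wald ratio should land near the complier effect of 2.0, not near the always-taker effect or the population average, because only compliers change treatment status when the instrument switches on.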


Fewer than 42 clusters

Bias from few clusters is a risk in both the Moulton and the serial correlation contexts because in both cases inference is cluster-based. With few clusters, we tend to underestimate either the serial correlation in a random shock like $v_{st}$ or the intra-class correlation, $\rho$, in the Moulton problem. The relevant dimension for counting clusters in the Moulton problem is the number of groups, $G$. In a differences-in-differences scenario where you’d like to cluster on state (or some other cross-sectional dimension), the relevant dimension for counting clusters is the number of states or cross-sectional groups...
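One way to see the few-clusters problem is a Monte Carlo sketch like the one below. It is an illustration under stated assumptions, not a procedure from the text: six equally sized groups, a binary regressor that varies only at the group level, a true slope of zero, and a plain cluster-robust sandwich estimator with no small-sample correction (the helper `ols_cluster_se` is hypothetical and written out here in NumPy). It compares the average reported clustered standard error with the actual sampling standard deviation of the slope.

```python
import numpy as np

def ols_cluster_se(X, y, groups):
    """OLS coefficients with uncorrected cluster-robust (sandwich) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(groups):
        idx = groups == g
        score = X[idx].T @ resid[idx]      # sum of x_i * e_i within cluster g
        meat += np.outer(score, score)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
G, n, reps = 6, 20, 2000                   # assumed: 6 clusters of 20 observations each
groups = np.repeat(np.arange(G), n)
x = (groups < G // 2).astype(float)        # regressor fixed within cluster
X = np.column_stack([np.ones(G * n), x])

slopes = np.empty(reps)
ses = np.empty(reps)
for r in range(reps):
    v = rng.normal(size=G)[groups]         # common group shock (source of intra-class correlation)
    e = rng.normal(size=G * n)             # individual noise
    y = v + e                              # true slope on x is zero
    beta, se = ols_cluster_se(X, y, groups)
    slopes[r], ses[r] = beta[1], se[1]

true_sd = slopes.std(ddof=1)
mean_se = ses.mean()
print("sampling SD of the slope:   ", true_sd)
print("average clustered std. err.:", mean_se)
```

Under these assumptions the average clustered standard error should come out noticeably below the sampling standard deviation. Raising `G` while holding the rest fixed should narrow the gap.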


Regression and Causality

Section 3.1.2 shows how regression gives the best (MMSE) linear approximation to the CEF. This understanding, however, does not help us with the deeper question of when regression has a causal interpretation. When can we think of a regression coefficient as approximating the causal effect that might be revealed in an experiment?

3.2.1 The Conditional Independence Assumption

A regression is causal when the CEF it approximates is causal. This doesn’t answer the question, of course. It just passes the buck up one level, since, as we’ve seen, a regression inherits its legitimacy from a CEF...



Derivation of Equation (4.6.8)

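Rewrite equation (4.6.7) as follows:

$$Y_{ij} = \mu^* + \pi_0 \tau_i + (\pi_0 + \pi_1)\bar{S}_j + \nu_i,$$

where $\tau_i \equiv S_i - \bar{S}_j$. Since $\tau_i$ and $\bar{S}_j$ are uncorrelated by construction, we have:

$$\rho_1 = \pi_0 + \pi_1,$$

$$\pi_0 = \frac{C(\tau_i, Y_{ij})}{V(\tau_i)}.$$

Simplifying the second line,

$$\begin{aligned}
\pi_0 &= \frac{C[(S_i - \bar{S}_j), Y_{ij}]}{V(S_i) - V(\bar{S}_j)} \\
&= \left[\frac{C(S_i, Y_{ij})}{V(S_i)}\right]\left[\frac{V(S_i)}{V(S_i) - V(\bar{S}_j)}\right] - \left[\frac{C(\bar{S}_j, Y_{ij})}{V(\bar{S}_j)}\right]\left[\frac{V(\bar{S}_j)}{V(S_i) - V(\bar{S}_j)}\right] \\
&= \rho_0\phi + \rho_1(1 - \phi) = \rho_1 + \phi(\rho_0 - \rho_1),
\end{aligned}$$

where $\phi \equiv \frac{V(S_i)}{V(S_i) - V(\bar{S}_j)}$. Solving for $\pi_1$, we have

$$\pi_1 = \rho_1 - \pi_0 = \phi(\rho_1 - \rho_0).$$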
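The algebra above holds exactly in any sample where the group average is the sample mean of the individual regressor within each group, so it can be verified numerically. The sketch below uses simulated data with made-up coefficients. The names `s`, `s_bar`, `rho0`, `rho1`, `pi`, and `phi` are illustrative labels for the individual regressor, its group mean, the two bivariate slopes, the coefficients from the regression on both variables, and the variance ratio.

```python
import numpy as np

rng = np.random.default_rng(0)
J, n = 50, 40                                  # assumed: 50 groups of 40 individuals
group = np.repeat(np.arange(J), n)

# Individual regressor with a group component, and its within-sample group mean.
s = 12.0 + rng.normal(size=J)[group] + rng.normal(size=J * n)
s_bar = (np.bincount(group, weights=s) / np.bincount(group))[group]

# Outcome generated with arbitrary illustrative coefficients.
y = 1.0 + 0.10 * s + 0.05 * s_bar + rng.normal(size=J * n)

def slope(x, y):
    """Bivariate OLS slope: sample covariance over sample variance."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rho0 = slope(s, y)          # bivariate regression of y on the individual regressor
rho1 = slope(s_bar, y)      # bivariate regression of y on the group average

# Regression of y on both the individual regressor and the group average.
X = np.column_stack([np.ones(J * n), s, s_bar])
pi = np.linalg.lstsq(X, y, rcond=None)[0]      # pi[1] is pi_0, pi[2] is pi_1

phi = np.var(s) / (np.var(s) - np.var(s_bar))

print("pi_0:", pi[1], "formula:", rho1 + phi * (rho0 - rho1))
print("pi_1:", pi[2], "formula:", phi * (rho1 - rho0))
```

Both printed pairs should agree up to floating-point error, whatever coefficients are used to generate `y`, because the identity is a property of least squares and not of the data-generating process.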

Figure 4.6.2: Distribution of the OLS, 2SLS, and LIML estimators with 20 instruments


Figure 4.6.3: Distribution of the OLS, 2SLS, and LIML estimators with 20 worthless instruments


Derivation of the approximat...
