Category Mostly Harmless Econometrics: An Empiricist’s Companion

Getting a Little Jumpy: Regression Discontinuity Designs

But when you start exercising those rules, all sorts of processes start to happen and you start to find out all sorts of stuff about people… It's just a way of thinking about a problem, which lets the shape of the problem begin to emerge. The more rules, the tinier the rules, the more arbitrary they are, the better.

Douglas Adams, Mostly Harmless (1995)

Regression discontinuity (RD) research designs exploit precise knowledge of the rules determining treatment. RD identification is based on the idea that in a highly rule-based world, some rules are arbitrary and therefore provide good experiments. RD comes in two styles, fuzzy and sharp. The sharp design can be seen as a selection-on-observables story. The fuzzy design leads to an instrumental-variables-type setup.


Two-Sample IV and Split-Sample IV*

GLS estimates of $\Gamma$ in (4.3.1) are consistent because E

The 2SLS minimand can be thought of as GLS applied to equation (4.3.1), after multiplying by $\sqrt{N}$ to keep the residual from disappearing as the sample size gets large. In other words, 2SLS minimizes a quadratic form in the residuals from (4.3.1) with a (possibly non-diagonal) weighting matrix.[53] An important insight that comes from writing the 2SLS problem in this way is that we do not need the individual observations in our sample to estimate (4.3.1). Just as with the OLS coefficient vector, which can be constructed from the sample conditional mean function, IV estimators can also be constructed from sample moments. The moments needed for IV are $Z'y/N$ and $Z'W/N$. The dependent variable, $Z'y/N$, is a vector of dimension $[k+q] \times 1$...
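The claim that IV estimators can be built from sample moments alone can be checked numerically. The sketch below is illustrative only: the simulated data, the variable names, and the choice of $(Z'Z/N)^{-1}$ as the weighting matrix (the GLS weight that would apply under homoskedastic residuals) are assumptions, not taken from the text. It computes the estimate once from the moments $Z'y/N$, $Z'W/N$, and $Z'Z/N$, and once from the individual observations by the usual two-stage procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data: one exogenous covariate, two excluded instruments,
# and one endogenous regressor s (correlated with the confounder u).
x = rng.normal(size=n)
z_excl = rng.normal(size=(n, 2))
u = rng.normal(size=n)
s = 0.8 * z_excl[:, 0] + 0.5 * z_excl[:, 1] + 0.3 * x + u + rng.normal(size=n)
y = 1.0 + 2.0 * s + 0.5 * x - 1.5 * u + rng.normal(size=n)

const = np.ones(n)
Z = np.column_stack([const, x, z_excl])  # instruments, incl. exogenous covariates
W = np.column_stack([const, x, s])       # regressors: covariates and endogenous s

# Sample moments; the first estimator below uses nothing else.
Zy = Z.T @ y / n
ZW = Z.T @ W / n
ZZ = Z.T @ Z / n


def gls_from_moments(Zy, ZW, ZZ):
    """GLS on the moment equation, weighting by inv(Z'Z/N) (an assumption)."""
    A = np.linalg.inv(ZZ)
    return np.linalg.solve(ZW.T @ A @ ZW, ZW.T @ A @ Zy)


gamma_moments = gls_from_moments(Zy, ZW, ZZ)

# Conventional 2SLS from the micro data: project W on Z, then regress y
# on the fitted values.
W_hat = Z @ np.linalg.lstsq(Z, W, rcond=None)[0]
gamma_2sls = np.linalg.lstsq(W_hat, y, rcond=None)[0]

print(gamma_moments)
print(gamma_2sls)
```

The two coefficient vectors agree up to floating-point error, which is the sense in which the moments are sufficient. In a two-sample setting the moments $Z'y/N$ and $Z'W/N$ could in principle come from different data sets.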


The Bias of Robust Standard Errors*

In matrix notation,

$$\hat{\beta} = \left[\sum_i X_i X_i'\right]^{-1} \sum_i X_i Y_i = (X'X)^{-1}X'y,$$

where $X$ is the $N \times K$ matrix with rows $X_i'$ and $y$ is the $N \times 1$ vector of $Y_i$'s. We saw in Section 3.1.3 that $\hat{\beta}$ has an asymptotically Normal distribution. We can write:

$$\sqrt{N}(\hat{\beta} - \beta) \sim N(0, \Omega),$$

where $\Omega$ is the asymptotic covariance matrix. Repeating (3.1.7), the formula for $\Omega$ in this case is

$$\Omega_r = E[X_i X_i']^{-1} E[X_i X_i' e_i^2] E[X_i X_i']^{-1}, \qquad (8.1.1)$$

where $e_i = Y_i - X_i'\beta$. When residuals are homoskedastic, $\Omega$ simplifies to $\Omega_c = \sigma^2 E[X_i X_i']^{-1}$, where $\sigma^2 = E[e_i^2]$.
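The robust and conventional covariance formulas can be compared by replacing each expectation with a sample average and each residual with its OLS estimate. The sketch below does this on simulated heteroskedastic data. The data-generating process and variable names are assumptions made for illustration, and no degrees-of-freedom or other small-sample correction is applied.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical design: a constant and one regressor, with residual
# variance that grows with |x| (heteroskedasticity by construction).
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
e = rng.normal(size=n) * (0.5 + np.abs(x))
y = X @ np.array([1.0, 2.0]) + e

# OLS in matrix form: (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat

# Sample analogs of the population moments in the covariance formulas.
Exx_inv = np.linalg.inv(X.T @ X / n)             # E[X_i X_i']^{-1}
meat = (X * resid[:, None] ** 2).T @ X / n       # E[X_i X_i' e_i^2]
omega_r = Exx_inv @ meat @ Exx_inv               # robust (sandwich) form
omega_c = (resid @ resid / n) * Exx_inv          # homoskedastic form

# Standard errors: asymptotic variance divided by n, then square root.
se_robust = np.sqrt(np.diag(omega_r) / n)
se_conventional = np.sqrt(np.diag(omega_c) / n)

print(se_robust)
print(se_conventional)
```

With this particular design the robust standard error for the slope exceeds the conventional one, because the residual variance is largest where the regressor is far from its mean. Under homoskedasticity the two would be close in large samples.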

We are concerned here with the bias of robust standard errors in independent samples (i.e., no clustering or serial correlation). To simplify the derivation of bias, we assume that the regressor vector can be treated as fixed in repeated samples, as it would be if we sampled stratifying on $X_i$...


Peer Effects

A vast literature in social science is concerned with peer effects. Loosely speaking, this means the causal effect of group characteristics on individual outcomes. Sometimes regression is used in an attempt to uncover these effects. In practice, the use of regression models to estimate peer effects is fraught with peril. Although this is not really an IV issue per se, the language and algebra of 2SLS help us understand why peer effects are hard to identify.

Broadly speaking, there are two types of peer effects. The first concerns the effect of group characteristics, such as the average schooling in a state or city, on an individually measured outcome variable. This peer effect links the average of one variable to individual outcomes as described by another variable...


Limited Dependent Variables and Marginal Effects

Many empirical studies involve variables that take on only a limited number of values. An example is the Angrist and Evans (1998) investigation of the effect of childbearing on female labor supply, discussed in Section 3.4.2 in this chapter and in the chapter on instrumental variables, below. This study is concerned with the causal effects of childbearing on parents' work and earnings. Because childbearing is likely to be correlated with potential earnings, the study reports instrumental variables estimates based on sibling-sex composition and multiple births, as well as OLS estimates. Almost every outcome in this study is either binary (like employment status) or non-negative (like hours worked, weeks worked, and earnings)...


Sharp RD

Sharp RD is used when treatment status is a deterministic and discontinuous function of a covariate, $x_i$. Suppose, for example, that

$$D_i = \begin{cases} 1 & \text{if } x_i \geq x_0 \\ 0 & \text{if } x_i < x_0 \end{cases} \qquad (6.1.\ldots)$$

where $x_0$ is a known threshold or cutoff. This assignment mechanism is a deterministic function of $x_i$ because once we know $x_i$ we know $D_i$. It's a discontinuous function because no matter how close $x_i$ gets to $x_0$, treatment is unchanged until $x_i = x_0$.
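The assignment rule is simple enough to write down directly. The sketch below first applies the rule to a handful of values on either side of a cutoff, then simulates an outcome with a jump at the cutoff and recovers the jump by regressing the outcome on the running variable and the treatment dummy. The function name, the simulated data, the linear functional form, and the jump size of 0.4 are all illustrative assumptions rather than anything specified in the text, and the rule treats units exactly at the cutoff as treated.

```python
import numpy as np


def sharp_rd_treatment(x, cutoff):
    """Deterministic assignment: 1 at or above the cutoff, 0 below it."""
    return (np.asarray(x) >= cutoff).astype(int)


# Treatment does not change however close x gets to the cutoff from below.
x_demo = np.array([-0.2, -1e-9, 0.0, 1e-9, 0.3])
d_demo = sharp_rd_treatment(x_demo, cutoff=0.0)
print(d_demo)

# Hypothetical simulation: outcome is linear in the running variable,
# with a jump of 0.4 (an assumed value) at the cutoff.
rng = np.random.default_rng(2)
n = 4_000
cutoff = 0.0
running = rng.uniform(-1.0, 1.0, size=n)
treated = sharp_rd_treatment(running, cutoff)
outcome = 0.5 + 1.0 * running + 0.4 * treated + rng.normal(scale=0.2, size=n)

# Regress the outcome on a constant, the centered running variable, and D.
design = np.column_stack([np.ones(n), running - cutoff, treated])
coef = np.linalg.lstsq(design, outcome, rcond=None)[0]
jump_hat = coef[2]
print(jump_hat)
```

Because treatment is a function of the running variable alone, the regression controls for the only possible confounder. The estimate is only as good as the assumed functional form on either side of the cutoff, which here is linear by construction.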

This may seem a little abstract, so here is an example. American high school students are awarded National Merit Scholarship Awards on the basis of their scores on the PSAT, a test taken by most college-bound high school juniors, especially those who will later take the SAT...
