Category Mostly Harmless Econometrics: An Empiricist’s Companion

Appendix: Derivation of the simple Moulton factor

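Write

$$y_g = [Y_{1g}, Y_{2g}, \ldots, Y_{n_g g}]', \qquad e_g = [e_{1g}, e_{2g}, \ldots, e_{n_g g}]'$$

and

$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_G \end{bmatrix}, \qquad X = \begin{bmatrix} \iota_1 x_1' \\ \iota_2 x_2' \\ \vdots \\ \iota_G x_G' \end{bmatrix}, \qquad e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_G \end{bmatrix},$$

where $\iota_g$ is a column vector of $n_g$ ones and $G$ is the number of groups. Note that

$$E(ee') = \Psi = \begin{bmatrix} \Psi_1 & & 0 \\ & \ddots & \\ 0 & & \Psi_G \end{bmatrix}, \qquad \Psi_g = \sigma_e^2 \left[ (1 - \rho) I + \rho\, \iota_g \iota_g' \right],$$

where $\rho$ is the intraclass correlation coefficient. Now

$$X'X = \sum_g n_g x_g x_g', \qquad X'\Psi X = \sum_g x_g \iota_g' \Psi_g \iota_g x_g'.$$

But

$$x_g \iota_g' \Psi_g \iota_g x_g' = \sigma_e^2 n_g \left[ 1 + (n_g - 1)\rho \right] x_g x_g'.$$

Let $\tau_g = 1 + (n_g - 1)\rho$, so we get

$$X'\Psi X = \sigma_e^2 \sum_g n_g \tau_g x_g x_g'.$$

With this in hand, we can write

$$V(\hat{\beta}) = (X'X)^{-1} X'\Psi X (X'X)^{-1} = \sigma_e^2 \left( \sum_g n_g x_g x_g' \right)^{-1} \sum_g n_g \tau_g x_g x_g' \left( \sum_g n_g x_g x_g' \right)^{-1}.$$

We want to compare this with the standard OLS covariance estimator

$$V_c(\hat{\beta}) = \sigma_e^2 \left( \sum_g n_g x_g x_g' \right)^{-1}.$$

If the group sizes are equal, $n_g = n$ and $\tau_g = \tau = 1 + (n - 1)\rho$, so that

$$V(\hat{\beta}) = \sigma_e^2 \tau \left( \sum_g n x_g x_g' \right)^{-1} \sum_g n x_g x_g' \left( \sum_g n x_g x_g' \right)^{-1} = \sigma_e^2 \tau \left( \sum_g n x_g x_g' \right)^{-1} = \tau V_c(\hat{\beta}),$$

which implies (8.2.4).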
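The equal-group-size result can be checked numerically. The sketch below is only an illustration under assumed values: the number of groups, group size, intraclass correlation, residual variance, and the group-level regressors are all hypothetical, and the variable names are not from the text. It builds the block-diagonal error covariance matrix, computes the sandwich formula directly, and compares it with the conventional OLS formula scaled by 1 + (n - 1) times the intraclass correlation.

```python
import numpy as np

# Hypothetical design: G groups of equal size n, regressors fixed within group.
G, n, rho, sigma2 = 5, 4, 0.3, 2.0
rng = np.random.default_rng(0)

# One row per group: a constant plus one group-level regressor (illustrative values).
x_groups = np.column_stack([np.ones(G), rng.normal(size=G)])

# Stack X so that each group contributes n identical rows.
X = np.repeat(x_groups, n, axis=0)

# Within-group covariance: sigma2 * [(1 - rho) I + rho * ones], block diagonal across groups.
psi_g = sigma2 * ((1 - rho) * np.eye(n) + rho * np.ones((n, n)))
Psi = np.kron(np.eye(G), psi_g)

XtX_inv = np.linalg.inv(X.T @ X)
V_true = XtX_inv @ X.T @ Psi @ X @ XtX_inv   # sandwich formula
V_conv = sigma2 * XtX_inv                    # conventional OLS formula

tau = 1 + (n - 1) * rho                      # Moulton scaling for equal group sizes
moulton_holds = np.allclose(V_true, tau * V_conv)
print(moulton_holds)
```

With unequal group sizes the scaling differs across groups, so the two matrices are no longer proportional in general; the check above relies on every group having the same size.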

Table 8.1.1: Monte Carlo results for robust standard errors

                                                      Empirical 5% Rejection Rates
                                 Mean     Standard    Normal      t
                                          Deviation
                                 (1)      (2)         (3)         (4)

A. Lots of Heteroskedasticity
  $\hat{\beta}_1$                -0.001   0.586
  Standard Errors:


The Omitted Variables Bias Formula

The omitted variables bias (OVB) formula describes the relationship between regression estimates in models with different sets of control variables. This important formula is often motivated by the notion that a longer regression, i.e., one with more controls such as equation (3.2.9), has a causal interpretation, while a shorter regression does not. The coefficients on the variables included in the shorter regression are therefore said to be "biased". In fact, the OVB formula is a mechanical link between coefficient vectors that applies to short and long regressions whether or not the longer regression is causal...
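The mechanical nature of the link can be seen in a small simulation. The sketch below assumes the standard statement of the formula for one included and one omitted regressor: the short-regression coefficient equals the long-regression coefficient plus the long-regression coefficient on the omitted variable times the slope from regressing the omitted variable on the included one. The data-generating values and the helper `ols` are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500

# Hypothetical data: `a` plays the omitted variable, `s` the included regressor.
a = rng.normal(size=N)
s = 0.5 * a + rng.normal(size=N)
y = 1.0 + 0.1 * s + 0.3 * a + rng.normal(size=N)

def ols(dep, *cols):
    """Least-squares coefficients with an intercept; returns [const, slopes...]."""
    X = np.column_stack([np.ones(len(dep)), *cols])
    return np.linalg.lstsq(X, dep, rcond=None)[0]

long_coefs = ols(y, s, a)    # long regression: y on s and a
short_coefs = ols(y, s)      # short regression: y on s only
aux_coefs = ols(a, s)        # auxiliary regression: a on s

# OVB formula: short slope = long slope + (coefficient on a) * (slope of a on s).
implied_short = long_coefs[1] + long_coefs[2] * aux_coefs[1]
print(short_coefs[1], implied_short)
```

The two printed numbers agree up to floating-point error in any sample, which is the sense in which the formula is an algebraic identity and not a statement about causality.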


Parallel Worlds: Fixed Effects, Differences-in-differences, and Panel Data

The first thing to realize about parallel universes… is that they are not parallel.

Douglas Adams, Mostly Harmless (1995)

The key to causal inference in Chapter 3 is control for observed confounding factors. If important confounders are unobserved, we might try to get at causal effects using IV as discussed in Chapter 4. Good instruments are hard to find, however, so we’d like to have other tools to deal with unobserved confounders. This chapter considers a variation on the control theme: strategies that use data with a time or cohort dimension to control for unobserved-but-fixed omitted variables. These strategies punt on comparisons in levels, while requiring the counterfactual trend behavior of treatment and control groups to be the same...
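A stylized two-group, two-period example shows why comparing changes removes a fixed unobserved difference. Everything in this sketch is hypothetical: the group levels, the common trend, the treatment effect, and the function name are invented for illustration, and the outcomes are noise-free group means.

```python
# Hypothetical group means: each group has its own fixed level, both share one trend.
group_level = {"treated": 5.0, "control": 2.0}   # unobserved but fixed over time
common_trend = 1.5                               # same counterfactual change in both groups
effect = 0.7                                     # treatment effect, post-period only

def mean_outcome(group, post):
    """Mean outcome for a group in the pre (post=0) or post (post=1) period."""
    treated_post = 1 if (group == "treated" and post == 1) else 0
    return group_level[group] + common_trend * post + effect * treated_post

# Difference-in-differences: change for the treated minus change for the controls.
dd = ((mean_outcome("treated", 1) - mean_outcome("treated", 0))
      - (mean_outcome("control", 1) - mean_outcome("control", 0)))

# A post-period comparison in levels mixes the effect with the fixed group gap.
levels_gap = mean_outcome("treated", 1) - mean_outcome("control", 1)

print(dd, levels_gap)
```

The comparison of changes recovers the assumed effect because the fixed levels cancel and the trend is common; the comparison in levels does not, since it also picks up the gap between the two group levels.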


IV and causality

We like to tell the IV story in two iterations, first in a restricted model with constant effects, then in a framework with unrestricted heterogeneous potential outcomes, in which case causal effects must also be heterogeneous. The introduction of heterogeneous effects enriches the interpretation of IV estimands, without changing the mechanics of the core statistical methods we are most likely to use in practice (typically, two-stage least squares). An initial focus on constant effects allows us to explain the mechanics of IV with a minimum of fuss.

To motivate the constant-effects setup as a framework for the causal link between schooling and wages, suppose, as before, that potential outcomes can be written

$$Y_{si} = f_i(s),$$

and that

$$f_i(s) = \pi_0 + \pi_1 s + \eta_i, \qquad (4.1.1)$$

as in the introduction ...


Censored Quantile Regression

Quantile regression allows us to look at features of the conditional distribution of $Y_i$ when part of the distribution is hidden. Suppose you have data of the form

$$Y_{i,obs} = Y_i \cdot 1[Y_i < c], \qquad (7.1.5)$$

where $Y_{i,obs}$ is what you get to see and $Y_i$ is the variable you would like to see. The variable $Y_{i,obs}$ is censored: information about $Y_i$ in $Y_{i,obs}$ is limited for confidentiality reasons or because it was too difficult or time-consuming to collect more information. In the CPS, for example, high earnings are topcoded to protect respondent confidentiality. This means data above the topcode are recoded to have the topcode value...
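A short simulation illustrates why quantiles remain informative under topcoding. The sketch follows the topcoding description (values above the topcode are recoded to the topcode) and assumes a hypothetical lognormal earnings distribution with the topcode placed at the 90th percentile; none of these numbers come from the CPS. Quantiles below the topcode are unchanged by the recoding, while the mean is pulled down.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical latent earnings and an assumed topcode at the 90th percentile.
y_latent = rng.lognormal(mean=10.0, sigma=0.8, size=10_000)
topcode = np.quantile(y_latent, 0.9)

# Topcoding: values above the topcode are recoded to the topcode value.
y_obs = np.minimum(y_latent, topcode)

# Quantiles below the topcode are the same in the latent and observed data.
quantile_levels = (0.25, 0.5, 0.75)
latent_q = [np.quantile(y_latent, q) for q in quantile_levels]
obs_q = [np.quantile(y_obs, q) for q in quantile_levels]
for q, lq, oq in zip(quantile_levels, latent_q, obs_q):
    print(q, lq, oq)

# The mean, by contrast, is affected by the recoding.
print(y_latent.mean(), y_obs.mean())
```

This only covers unconditional quantiles; the conditional case depends on where the conditional quantile lies relative to the topcode for each value of the covariates.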


Counting and Characterizing Compliers

We’ve seen that, except in special cases, each instrumental variable identifies a unique causal parameter, one specific to the subpopulation of compliers for that instrument. Different valid instruments for the same causal relation therefore estimate different things, at least in principle (an important exception being instruments that allow for perfect compliance on one side or the other). Although different IV estimates are "weighted-up" by 2SLS to produce a single average causal effect, over-identification testing of the sort discussed in Section 4.2.2, where multiple instruments are validated according to whether or not they estimate the same thing, is out the window in a fully heterogeneous world.

Differences in compliant sub-populations might explain variability in treatment effects...
