A vast literature in social science is concerned with peer effects. Loosely speaking, this means the causal effect of group characteristics on individual outcomes. Sometimes regression is used in an attempt to uncover these effects. In practice, the use of regression models to estimate peer effects is fraught with peril. Although this is not really an IV issue per se, the language and algebra of 2SLS help us understand why peer effects are hard to identify.
Broadly speaking, there are two types of peer effects. The first concerns the effect of group characteristics, such as the average schooling in a state or city, on an individually measured outcome variable. This peer effect links the average of one variable to individual outcomes as described by another variable...
Many empirical studies involve variables that take on only a limited number of values. An example is the Angrist and Evans (1998) investigation of the effect of childbearing on female labor supply, discussed in Section 3.4.2 in this chapter and in the chapter on instrumental variables, below. This study is concerned with the causal effects of childbearing on parents’ work and earnings. Because childbearing is likely to be correlated with potential earnings, the study reports instrumental variables estimates based on sibling-sex composition and multiple births, as well as OLS estimates. Almost every outcome in this study is either binary (like employment status) or non-negative (like hours worked, weeks worked, and earnings)...
Sharp RD is used when treatment status is a deterministic and discontinuous function of a covariate, $x_j$. Suppose, for example, that

$$D_j = \begin{cases} 1 & \text{if } x_j \geq x_0 \\ 0 & \text{if } x_j < x_0 \end{cases}$$

where $x_0$ is a known threshold or cutoff. This assignment mechanism is a deterministic function of $x_j$ because once we know $x_j$ we know $D_j$. It’s a discontinuous function because no matter how close $x_j$ gets to $x_0$, treatment is unchanged until $x_j = x_0$.
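The assignment rule can be written out as a minimal sketch. The function name `sharp_rd_treatment`, the scores, and the cutoff of 200 are hypothetical values chosen for illustration, and the sketch assumes that units exactly at the cutoff are treated.

```python
def sharp_rd_treatment(x, cutoff):
    """Sharp RD assignment: treatment is fully determined by the covariate.

    Assumption: a unit exactly at the cutoff is treated (x >= cutoff).
    """
    return 1 if x >= cutoff else 0


# Hypothetical running-variable values on either side of a hypothetical cutoff.
cutoff = 200.0
scores = [198.0, 199.9, 200.0, 200.1]
treatment = [sharp_rd_treatment(s, cutoff) for s in scores]
print(treatment)  # [0, 0, 1, 1]
```

Treatment stays at zero however close a score gets to the cutoff from below, then jumps to one at the cutoff itself, which is the discontinuity the design exploits.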
This may seem a little abstract, so here is an example. American high school students are awarded National Merit Scholarship Awards on the basis of scores on the PSAT, a test taken by most college-bound high school juniors, especially those who will later take the SAT...
The discussion of IV up to this point postulates a constant causal effect. In the case of a dummy variable like veteran status, this means $Y_{1i} - Y_{0i} = \rho$ for all $i$, while with a multi-valued treatment like schooling, this means $Y_{si} - Y_{s-1,i} = \rho$ for all $s$ and all $i$. Both are highly stylized views of the world, especially the multi-valued case, which imposes linearity as well as homogeneity. To focus on one thing at a time in a heterogeneous-effects model, we start with a zero-one causal variable. In this context, we’d like to allow for treatment-effect heterogeneity, in other words, a distribution of causal effects across individuals.
Why is treatment-effect heterogeneity important? The answer lies in the distinction between the two types of validity that characterize a research design...
8.2.1 Clustering and the Moulton Factor
Bias problems aside, heteroskedasticity rarely leads to dramatic changes in inference. In large samples where bias is not likely to be a problem, we might see standard errors increase by about 25 percent when moving from the conventional to the HC1 estimator. In contrast, clustering can make all the difference.
The clustering problem can be illustrated using a simple bivariate regression estimated in data with a group structure. Suppose we’re interested in the bivariate regression,
$$Y_{ig} = \beta_0 + \beta_1 x_g + e_{ig}, \qquad (8.2.1)$$

where $Y_{ig}$ is the dependent variable for individual $i$ in cluster or group $g$, with $G$ groups. Importantly, the regressor of interest, $x_g$, varies only at the group level...
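To see why a group-level regressor matters for inference, here is a small simulation sketch of a regression like (8.2.1). Everything specific in it is an assumption made for illustration: the function names, the 40 groups of 50 observations, the coefficient values, and the error structure (a group-level component plus an individual component, each with unit variance). The clustered standard error uses the basic sandwich formula with no finite-sample correction.

```python
import math
import random


def simulate_grouped_data(num_groups=40, group_size=50, seed=0):
    """Simulate y = 1 + 0.5 * x_g + v_g + eta_ig with x_g fixed within groups."""
    rng = random.Random(seed)
    x, y, group_ids = [], [], []
    for g in range(num_groups):
        x_g = rng.gauss(0.0, 1.0)  # regressor varies only at the group level
        v_g = rng.gauss(0.0, 1.0)  # error component shared within the group
        for _ in range(group_size):
            eta = rng.gauss(0.0, 1.0)  # individual-level error component
            x.append(x_g)
            y.append(1.0 + 0.5 * x_g + v_g + eta)
            group_ids.append(g)
    return x, y, group_ids


def slope_standard_errors(x, y, group_ids):
    """Return the OLS slope with conventional and cluster-robust standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional formula: treats all residuals as independent and homoskedastic.
    se_conventional = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust formula: sum the scores within each group before squaring,
    # which allows arbitrary correlation of errors inside a group.
    group_scores = {}
    for xi, ei, g in zip(x, resid, group_ids):
        group_scores[g] = group_scores.get(g, 0.0) + (xi - x_bar) * ei
    se_clustered = math.sqrt(sum(s * s for s in group_scores.values())) / sxx
    return slope, se_conventional, se_clustered


x, y, group_ids = simulate_grouped_data()
slope, se_conv, se_clu = slope_standard_errors(x, y, group_ids)
print(f"slope={slope:.3f}  conventional SE={se_conv:.4f}  clustered SE={se_clu:.4f}")
```

In this setup the conventional standard error counts every observation as independent information about the slope, even though the regressor takes only as many distinct values as there are groups, so it should come out well below the clustered one.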
In Section 3.4.2, we discussed the consequences of limited dependent variables for regression models. When the dependent variable is binary or non-negative, say, employment status or hours worked, the CEF is typically nonlinear. Most nonlinear LDV models are built around a nonlinear transformation of a linear latent index. Examples include Probit, Logit, and Tobit. These models capture features of the associated CEFs (e.g., Probit fitted values are guaranteed to be between zero and one, while Tobit fitted values are non-negative). Yet we saw that the added complexity and extra work required to interpret the results from latent-index models may not be worth the trouble.
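The bounds mentioned above follow directly from the transformations applied to the latent index. The sketch below is illustrative only: the function names are hypothetical, the index values are arbitrary, and the Tobit expression assumes normal errors with censoring at zero, giving the conditional mean $\Phi(z)\,\text{index} + \sigma\,\phi(z)$ with $z = \text{index}/\sigma$.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_fitted(index):
    """Probit fitted value: normal CDF of the linear index."""
    return normal_cdf(index)


def logit_fitted(index):
    """Logit fitted value: logistic CDF of the linear index."""
    return 1.0 / (1.0 + math.exp(-index))


def tobit_fitted(index, sigma=1.0):
    """Tobit conditional mean for an outcome censored at zero (normal errors)."""
    z = index / sigma
    return normal_cdf(z) * index + sigma * normal_pdf(z)


# Arbitrary values of the linear index, including some outside [0, 1].
for index in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(index, probit_fitted(index), logit_fitted(index), tobit_fitted(index))
```

A linear fitted value equal to the index itself would be negative or above one for several of these inputs, while the transformed values respect the bounds of the outcome.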
An important consideration in favor of OLS is a conceptual robustness that structural models often lack...