Regression Analysis of Experiments
Regression is a useful tool for the study of causal questions, including the analysis of data from experiments. Suppose (for now) that the treatment effect is the same for everyone, say $Y_{1i} - Y_{0i} = \rho$, a constant. With constant treatment effects, we can rewrite equation (2.1.1) in the form
$$Y_i = \underbrace{\alpha}_{E(Y_{0i})} + \underbrace{\rho}_{(Y_{1i} - Y_{0i})} D_i + \underbrace{\eta_i}_{Y_{0i} - E(Y_{0i})},$$
where $\eta_i$ is the random part of $Y_{0i}$. Evaluating the conditional expectation of this equation with treatment status switched off and on gives
$$E[Y_i \mid D_i = 1] = \alpha + \rho + E[\eta_i \mid D_i = 1]$$
$$E[Y_i \mid D_i = 0] = \alpha + E[\eta_i \mid D_i = 0],$$
so that
$$E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = \underbrace{\rho}_{\text{treatment effect}} + \underbrace{E[\eta_i \mid D_i = 1] - E[\eta_i \mid D_i = 0]}_{\text{selection bias}}.$$
Thus, selection bias amounts to correlation between the regression error term, $\eta_i$, and the regressor, $D_i$. Since
$$E[\eta_i \mid D_i = 1] - E[\eta_i \mid D_i = 0] = E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0],$$
this correlation reflects the difference in (no-treatment) potential outcomes between those who get treated and those who don’t. In the hospital allegory, those who were treated had poorer health outcomes in the no-treatment state, while in the Angrist and Lavy (1999) study, students in smaller classes tend to have intrinsically lower test scores.
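The decomposition above can be checked numerically. The following is a minimal simulation sketch, not drawn from any of the studies mentioned: the parameter values, the logistic selection rule, and all variable names are assumptions chosen for illustration. It builds data with a constant treatment effect in which units with low no-treatment outcomes are more likely to be treated, then confirms that the raw difference in means equals the treatment effect plus the selection-bias term, and that the slope from regressing the outcome on the treatment dummy reproduces that same difference.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho = 2.0     # constant treatment effect (assumed value)
alpha = 10.0  # E(Y0i), the mean no-treatment outcome (assumed value)

# eta: mean-zero random part of the no-treatment outcome Y0i.
eta = rng.normal(0.0, 1.0, size=n)
y0 = alpha + eta

# Hypothetical selection rule: the probability of treatment falls as eta rises,
# so the treated have lower no-treatment outcomes on average.
p_treat = 1.0 / (1.0 + np.exp(eta))
d = (rng.uniform(size=n) < p_treat).astype(float)

# Observed outcome under a constant treatment effect.
y = y0 + rho * d

# Raw comparison of treated and untreated means.
diff_means = y[d == 1].mean() - y[d == 0].mean()

# Selection bias: difference in no-treatment outcomes between the two groups.
selection_bias = y0[d == 1].mean() - y0[d == 0].mean()

# Bivariate OLS slope of Y on D, computed as cov(Y, D) / var(D).
ols_slope = np.cov(y, d)[0, 1] / np.var(d, ddof=1)

print("difference in means:   ", diff_means)
print("rho + selection bias:  ", rho + selection_bias)
print("OLS slope of Y on D:   ", ols_slope)
```

Because the selection rule favors units with low no-treatment outcomes, the selection-bias term is negative in this setup and the naive comparison understates the true effect.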
In the STAR experiment, where $D_i$ is randomly assigned, the selection term disappears, and a regression of $Y_i$ on $D_i$ estimates the causal effect of interest, $\rho$. The remainder of Table 2.2.2 shows different regression specifications, some of which include covariates other than the random assignment indicator, $D_i$. Covariates play two roles in regression analyses of experimental data. First, the STAR experimental design used conditional random assignment. In particular, assignment to classes of different sizes was random within schools, but not across schools. Students attending schools of different types (say, urban versus rural) were a bit more or less likely to be assigned to a small class. The comparison in column 1 of Table 2.2.2, which makes no adjustment for this, might therefore be contaminated by differences in achievement in schools of different types. To adjust for this, some of Krueger’s regression models include school fixed effects, i.e., a separate intercept for each school in the STAR data. In practice, the consequences of adjusting for school fixed effects are rather minor, but we wouldn’t know this without taking a look. We will have more to say about regression models with fixed effects in Chapter 5.
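To see why school fixed effects matter under conditional random assignment, consider the following simulation sketch. It does not use the STAR data; the number of schools, the effect size, and the assignment probabilities are all hypothetical, and the link between school quality and the chance of a small class is deliberately exaggerated so the contrast is visible. Fixed effects are implemented here by demeaning the outcome and the treatment within each school, which gives the same treatment coefficient as including a separate intercept for every school.

```python
import numpy as np

rng = np.random.default_rng(1)
n_schools, per_school = 80, 500
rho = 5.0  # hypothetical small-class effect on test scores

# School identifier for each student and a school-level achievement shifter.
school = np.repeat(np.arange(n_schools), per_school)
school_effect = rng.normal(0.0, 10.0, size=n_schools)

# Assumption for illustration: assignment is random within a school, but
# higher-achieving schools assign a larger share of students to small classes.
p_small = 0.2 + 0.2 * (school_effect > 0)
d = (rng.uniform(size=school.size) < p_small[school]).astype(float)

noise = rng.normal(0.0, 10.0, size=school.size)
y = 50.0 + school_effect[school] + rho * d + noise


def slope(y, x):
    """Bivariate OLS slope of y on x (with an intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


def demean_by_group(v, g, n_groups):
    """Subtract each group's mean from v; g holds integer group labels."""
    sums = np.bincount(g, weights=v, minlength=n_groups)
    counts = np.bincount(g, minlength=n_groups)
    return v - (sums / counts)[g]


# Pooled regression: ignores schools, so it mixes in school differences.
pooled = slope(y, d)

# Within-school regression: equivalent to adding school fixed effects.
within = slope(
    demean_by_group(y, school, n_schools),
    demean_by_group(d, school, n_schools),
)

print("pooled estimate:        ", pooled)
print("fixed-effects estimate: ", within)
```

In this exaggerated setup the pooled estimate is contaminated by cross-school differences, while the within-school estimate stays close to the assumed effect. When assignment probabilities barely differ across schools, as the text reports for STAR, the two estimates would be expected to be much closer.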
The other controls in Krueger’s table describe student characteristics such as race, age, and free lunch status. We saw before that these individual characteristics are balanced across class types, i.e., they are not systematically related to the class-size assignment of the student. If these controls, call them $X_i$, are uncorrelated with the treatment $D_i$, then they will not affect the estimate of $\rho$. In other words, estimates of $\rho$ in the long regression,
$$Y_i = \alpha + \rho D_i + X_i'\gamma + \eta_i, \qquad (2.3.2)$$
will be close to estimates of $\rho$ in the short regression of $Y_i$ on $D_i$ alone.
Nevertheless, inclusion of the variables $X_i$ may generate more precise estimates of the causal effect of interest. Notice that the standard error of the estimated treatment effects in column 3 is smaller than the corresponding standard error in column 2. Although the control variables, $X_i$, are uncorrelated with $D_i$, they have substantial explanatory power for $Y_i$. Including these control variables therefore reduces the residual variance, which in turn lowers the standard error of the regression estimates. Similarly, the standard errors of the estimates of $\rho$ are reduced by the inclusion of school fixed effects because these too explain an important part of the variance in student performance. The last column adds teacher characteristics. Because teachers were randomly assigned to classes, and teacher characteristics appear to have little to do with student achievement in these data, both the estimated effect of small classes and its standard error are unchanged by the addition of teacher variables.
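The precision argument can be illustrated with a short simulation. This is a sketch under stated assumptions rather than a reproduction of Krueger's table: the data are synthetic, the coefficient values are hypothetical, a single covariate stands in for the controls, and the helper `ols` computes conventional homoskedastic standard errors as a simplification. The treatment is assigned independently of the covariate, so adding the covariate should leave the treatment coefficient roughly unchanged while shrinking its standard error.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients and conventional (homoskedastic) standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)  # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return beta, se


rng = np.random.default_rng(2)
n = 5_000
rho = 5.0  # hypothetical treatment effect

# Randomly assigned treatment and a covariate drawn independently of it.
d = rng.integers(0, 2, size=n).astype(float)
x = rng.normal(size=n)

# The covariate explains a large share of the outcome variance (assumed).
y = 50.0 + rho * d + 8.0 * x + rng.normal(0.0, 6.0, size=n)

const = np.ones(n)
beta_short, se_short = ols(y, np.column_stack([const, d]))    # Y on D only
beta_long, se_long = ols(y, np.column_stack([const, d, x]))   # Y on D and X

print("short regression: estimate", beta_short[1], "s.e.", se_short[1])
print("long regression:  estimate", beta_long[1], "s.e.", se_long[1])
```

The long regression has a smaller residual variance because the covariate absorbs part of the variation in the outcome, and that is what lowers the standard error on the treatment coefficient.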
Regression plays an exceptionally important role in empirical economic research. Some regressions are simply descriptive tools, as in much of the research on earnings inequality. As we’ve seen in this chapter, regression is well-suited to the analysis of experimental data. In some cases, regression can also be used to approximate experiments in the absence of random assignment. But before we can get into the important question of when a regression is likely to have a causal interpretation, it is useful to review a number of fundamental regression facts and properties. These facts and properties are reliably true for any regression, regardless of your purpose in running it.