# Regressions Run

We start with regression estimates of the private school earnings advantage from models with no controls. The coefficient from a regression of log earnings (in 1995) on a dummy for private school attendance, with no other regressors (right-hand side variables) in the model, gives the raw difference in log earnings between those who attended a private school and everyone else (the chapter appendix explains why regression on a single dummy variable produces a difference in means across groups defined by the dummy). Not surprisingly, this raw gap, reported in the first column of Table 2.2, shows a substantial private school premium. Specifically, private school students are estimated to have earnings about 14% higher than the earnings of other students.
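The claim that regression on a single dummy reproduces a difference in means can be checked numerically. The sketch below uses simulated data, not the C&B sample; the variable names, the sample size, and the 0.14 gap built into the simulation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical; not the C&B sample).
n = 1_000
private = rng.integers(0, 2, size=n)  # dummy: 1 = attended a private school
log_earn = 10.0 + 0.14 * private + rng.normal(0.0, 0.5, size=n)

# OLS of log earnings on a constant and the private school dummy.
X = np.column_stack([np.ones(n), private])
intercept, slope = np.linalg.lstsq(X, log_earn, rcond=None)[0]

# Raw difference in mean log earnings between the two groups.
mean_private = log_earn[private == 1].mean()
mean_other = log_earn[private == 0].mean()
diff_in_means = mean_private - mean_other

print(f"dummy coefficient:   {slope:.6f}")
print(f"difference in means: {diff_in_means:.6f}")
```

The two printed numbers agree up to floating-point error, and the intercept equals the mean of log earnings in the group with the dummy switched off.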

The numbers that appear in parentheses below the regression estimates in Table 2.2 are the estimated standard errors that go with these estimates. Like the standard errors for a difference in means discussed in the appendix to Chapter 1, these standard errors quantify the statistical precision of the regression estimates reported here. The standard error associated with the estimate in column (1) is .055. The fact that .135 is more than twice the size of the associated standard error of .055 makes it very unlikely the positive estimated private-school gap is merely a chance finding. The private school coefficient is statistically significant.
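As a worked check of the comparison just described, the ratio of the column (1) estimate to its standard error (the t-statistic) is

$$
t = \frac{.135}{.055} \approx 2.45 > 2 .
$$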

TABLE 2.2

Private school effects: Barron’s matches

| | No selection controls | | | Selection controls | | |
|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) |
| Private school | .135 (.055) | .095 (.052) | .086 (.034) | .007 (.038) | .003 (.039) | .013 (.025) |
| Own SAT score ÷ 100 | | .048 (.009) | .016 (.007) | | .033 (.007) | .001 (.007) |
| Log parental income | | | .219 (.022) | | | .190 (.023) |
| Female | | | -.403 (.018) | | | -.395 (.021) |
| Black | | | .005 (.041) | | | -.040 (.042) |
| Hispanic | | | .062 (.072) | | | .032 (.070) |
| Asian | | | .170 (.074) | | | .145 (.068) |
| Other/missing race | | | -.074 (.157) | | | -.079 (.156) |
| High school top 10% | | | .095 (.027) | | | .082 (.028) |
| High school rank missing | | | .019 (.033) | | | .015 (.037) |
| Athlete | | | .123 (.025) | | | .115 (.027) |
| Selectivity-group dummies | No | No | No | Yes | Yes | Yes |

Notes: This table reports estimates of the effect of attending a private college or university on earnings. Each column reports coefficients from a regression of log earnings on a dummy for attending a private institution and controls. The results in columns (4)-(6) are from models that include applicant selectivity-group dummies. The sample size is 5,583. Standard errors are reported in parentheses.

The large private school premium reported in column (1) of Table 2.2 is an interesting descriptive fact, but, as in our example calculation, some of this gap is almost certainly due to selection bias. As we show below, private school students have higher SAT scores and come from wealthier families than do public school students, and so might be expected to earn more regardless of where they went to college. We therefore control for measures of ability and family background when estimating the private school premium. An estimate of the private school premium from a regression model that includes an individual SAT control is reported in column (2) of Table 2.2. Every 100 points of SAT achievement are associated with about a 5 percentage point earnings gain. Controlling for students’ SAT scores reduces the measured private school premium to about .1. Adding controls for parental income, as well as for demographic characteristics related to race and sex, high school rank, and whether the graduate was a college athlete, brings the private school premium down a little further, to a still substantial and statistically significant .086, reported in column (3) of the table.

A substantial effect indeed, but probably still too big, that is, contaminated by positive selection bias. Column (4) reports estimates from a model with no controls for ability, family background, or demographic characteristics. Importantly, however, the regression model used to construct the estimate reported in this column includes a dummy for each matched college selectivity group in the sample. That is, the model used to construct this estimate includes the dummy variables $\text{GROUP}_j$ for $j = 1, \ldots, 150$ (the table omits the many estimated $\gamma_j$ this model produces, but indicates their inclusion in the row labeled “Selectivity-group dummies”). The estimated private school premium with selectivity-group controls included is almost bang on 0, with a standard error of about .04. And that’s not all: having killed the private school premium with selectivity-group dummies, columns (5) and (6) show that the premium moves little when controls for ability and family background are added to the model. This suggests that control for college application and admissions selectivity groups takes us a long way toward the apples-to-apples and oranges-to-oranges comparisons at the heart of any credible regression strategy for causal inference.
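The way group dummies can erase a spurious premium is easy to illustrate with simulated data. The sketch below is not the C&B analysis: the 20 groups (rather than 150), the sample size, and every parameter are assumptions chosen for illustration, and the true private school effect is set to zero by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: higher-numbered groups are more likely to attend a
# private school AND earn more, but private attendance itself has no effect.
n, n_groups = 50_000, 20
group = rng.integers(0, n_groups, size=n)
p_private = 0.2 + 0.6 * group / (n_groups - 1)
private = (rng.random(n) < p_private).astype(float)
log_earn = 10.0 + 0.02 * group + rng.normal(0.0, 0.5, size=n)


def ols(X, y):
    """Return least-squares coefficients for y regressed on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


# No controls: constant plus private dummy.
raw = ols(np.column_stack([np.ones(n), private]), log_earn)[1]

# Group controls: private dummy plus one dummy per group (no separate constant).
group_dummies = np.eye(n_groups)[group]
controlled = ols(np.column_stack([private, group_dummies]), log_earn)[0]

print(f"private coefficient, no controls:   {raw:.3f}")
print(f"private coefficient, group dummies: {controlled:.3f}")
```

In this simulation the uncontrolled coefficient is clearly positive even though the true effect is zero, while the coefficient from the model with group dummies is close to zero, because the comparison is now made only among students in the same group.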

The results in columns (4)-(6) of Table 2.2 are generated by the subsample of 5,583 students for whom we can construct Barron’s matches and generate within-group comparisons of public and private school students. Perhaps there’s something special about this limited sample, which contains less than half of the full complement of C&B respondents. This concern motivates a less demanding control scheme that includes only the average SAT score in the set of schools students applied to plus dummies for the number of schools applied to (that is, a dummy for students who applied to two schools, a dummy for students who applied to three schools, and so on), instead of a full set of 150 selectivity-group dummies. This regression, which can be estimated in the full C&B sample, is christened the “self-revelation model” because it’s motivated by the notion that applicants have a pretty good idea of their ability and where they’re likely to be admitted. This self-assessment is reflected in the number and average selectivity of the schools to which they apply. As a rule, weaker applicants apply to fewer and to less-selective schools than do stronger applicants.
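The controls in the self-revelation model can be built from each student's application list. The sketch below is a minimal illustration: the function name, the dictionary keys, and the input format are hypothetical, and grouping four or more applications into one dummy follows the rows reported in Table 2.3 rather than any code from the original study.

```python
def self_revelation_controls(sat_of_schools_applied_to):
    """Build self-revelation controls for one student.

    sat_of_schools_applied_to: list of average SAT scores, one per school
    the student applied to (hypothetical input format).
    """
    n_apps = len(sat_of_schools_applied_to)
    if n_apps == 0:
        raise ValueError("student must have applied to at least one school")

    avg_sat = sum(sat_of_schools_applied_to) / n_apps
    return {
        # Average SAT score of schools applied to, in units of 100 points.
        "avg_sat_applied_div_100": avg_sat / 100,
        # Dummies for the number of applications; one application is the
        # omitted category.
        "sent_two": int(n_apps == 2),
        "sent_three": int(n_apps == 3),
        "sent_four_or_more": int(n_apps >= 4),
    }


example = self_revelation_controls([1200, 1300, 1400])
print(example)
```

A student who applied to a single school gets zeros on all three application dummies, so the coefficients on the dummies are read relative to single-application students.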

The self-revelation model generates results remarkably similar to those generated by Barron’s matches. The self-revelation estimates, computed in a sample of 14,238 students, can be seen in Table 2.3. As before, the first three columns of the table show that the raw private school premium falls markedly, but remains substantial, when controls for ability and family background are added to the model (falling, in this case, from .21 to .14). At the same time, columns (4)-(6) show that models controlling for the number and average selectivity of the schools students apply to generate small and statistically insignificant effects on the order of .03. Moreover, as with the models that control for Barron’s matches, models with average selectivity controls generate estimates that are largely insensitive to the inclusion of controls for ability and family background.

Private university attendance seems unrelated to future earnings once we control for selection bias. But perhaps our focus on public-private comparisons misses the point. Students may benefit from attending schools like Ivy, Leafy, or Smart simply because their classmates at such schools are so much better. The synergy generated by a strong peer group may be the feature that justifies the private school price tag.

We can explore this hypothesis by replacing the private school dummy in the self-revelation model with a measure of peer quality. Specifically, as in the original Dale and Krueger study that inspires our analysis, we replace $P_i$ in equation (2.2) with the average SAT score of classmates at the school attended. Columns (1)-(3) of Table 2.4 show that students who attended more selective schools do markedly better in the labor market, with an estimated college selectivity effect on the order of 8% higher earnings for every 100 points of average selectivity increase. Yet this effect too appears to be an artifact of selection bias due to the greater ambition and ability of those who attend selective schools. Estimates from models with self-revelation controls, reported in columns (4)-(6) of the table, show average college selectivity to be essentially unrelated to earnings.

TABLE 2.3

Private school effects: Average SAT score controls

| | No selection controls | | | Selection controls | | |
|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) |
| Private school | .212 (.060) | .152 (.057) | .139 (.043) | .034 (.062) | .031 (.062) | .037 (.039) |
| Own SAT score ÷ 100 | | .051 (.008) | .024 (.006) | | .036 (.006) | .009 (.006) |
| Log parental income | | | .181 (.026) | | | .159 (.025) |
| Female | | | -.398 (.012) | | | -.396 (.014) |
| Black | | | -.003 (.031) | | | -.037 (.035) |
| Hispanic | | | .027 (.052) | | | .001 (.054) |
| Asian | | | .189 (.035) | | | .155 (.037) |
| Other/missing race | | | -.166 (.118) | | | -.189 (.117) |
| High school top 10% | | | .067 (.020) | | | .064 (.020) |
| High school rank missing | | | .003 (.025) | | | -.008 (.023) |
| Athlete | | | .107 (.027) | | | .092 (.024) |
| Average SAT score of schools applied to ÷ 100 | | | | .110 (.024) | .082 (.022) | .077 (.012) |
| Sent two applications | | | | .071 (.013) | .062 (.011) | .058 (.010) |
| Sent three applications | | | | .093 (.021) | .079 (.019) | .066 (.017) |
| Sent four or more applications | | | | .139 (.024) | .127 (.023) | .098 (.020) |

Notes: This table reports estimates of the effect of attending a private college or university on earnings. Each column shows coefficients from a regression of log earnings on a dummy for attending a private institution and controls. The sample size is 14,238. Standard errors are reported in parentheses.

TABLE 2.4

School selectivity effects: Average SAT score controls

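| | No selection controls | | | Selection controls | | |
|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) |

[Table body missing in the source; the only surviving entry is -0.021.]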

Notes: This table reports estimates of the effect of alma mater selectivity on earnings. Each column shows coefficients from a regression of log earnings on the average SAT score at the institution attended and controls. The sample size is 14,238. Standard errors are reported in parentheses.