Regression Standard Errors and Confidence Intervals

Our regression discussion has largely ignored the fact that our data come from samples. As we noted in the appendix to the first chapter, sample regression estimates, like sample means, are subject to sampling variance. Although we imagine the underlying relationship quantified by a regression to be fixed and nonrandom, we expect estimates of this relationship to change when computed in a new sample drawn from the same population. Suppose we’re after the relationship between the earnings of college graduates and the types of colleges they’ve attended. We’re unlikely to have data on the entire population of graduates. In practice, therefore, we work with samples drawn from the population of interest...
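The idea that a regression estimate changes from sample to sample can be illustrated with a small simulation. The sketch below is not from the text: the population (a private-college dummy that raises log earnings by 0.10, with normally distributed noise), the sample size, and the helper names `ols_slope` and `draw_sample` are all assumptions chosen for illustration.

```python
import random
import statistics

def ols_slope(x, y):
    """Bivariate OLS slope: sum of cross-deviations over sum of squared x-deviations."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den

def draw_sample(rng, n, true_slope=0.10):
    """Draw one sample from a hypothetical population (assumed, not from the text)."""
    # x is a private-college dummy; y is log earnings with noise of s.d. 0.5.
    x = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(n)]
    y = [10.0 + true_slope * xi + rng.gauss(0.0, 0.5) for xi in x]
    return x, y

rng = random.Random(0)  # fixed seed so the run is reproducible
slopes = []
for _ in range(500):  # 500 independent samples from the same population
    x, y = draw_sample(rng, n=400)
    slopes.append(ols_slope(x, y))

mean_slope = statistics.fmean(slopes)
sd_slope = statistics.stdev(slopes)
print(f"mean of estimates:      {mean_slope:.3f}")
print(f"std. dev. of estimates: {sd_slope:.3f}")
```

The population slope is fixed at 0.10 throughout, yet each sample returns a somewhat different estimate. The standard deviation of those estimates is the quantity a regression standard error is meant to approximate from a single sample. Under these assumed settings, theory puts it near 0.5 / sqrt(400 × 0.25) = 0.05.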



Abbreviations and Acronyms

Abbreviations and acronyms are introduced on the page indicated in parentheses.

2SLS two-stage least squares, an instrumental variables estimator that replaces the regressor being instrumented with fitted values from the first stage (p. 132)

ALS a study by Joshua D. Angrist, Victor Lavy, and Analia Schlosser on the causal link between quantity and quality of children in Israeli families (p. 127)

BLS Boston Latin School, the top school in the Boston exam school hierarchy (p. 164)

C&B College and Beyond, a data set (p. 52)

CEF conditional expectation function, the population average of Y_i with X_i held fixed (p. 82)

CLT Central Limit Theorem, a theorem which says that almost any sample average is approximately normally distributed, with the accuracy of the approximation increasing...


The Path from Cause to Effect

blind master PO: Close your eyes. What do you hear?

young kwai chang caine: I hear the water, I hear the birds.

master PO: Do you hear your own heartbeat?

kwai chang caine: No.

master PO: Do you hear the grasshopper that is at your feet?

kwai chang caine: Old man, how is it that you hear these things?

master PO: Young man, how is it that you do not?

Kung Fu, Pilot

Economists’ reputation for dismality is a bad rap. Economics is as exciting as any science can be: the world is our lab, and the many diverse people in it are our subjects.

The excitement in our work comes from the opportunity to learn about cause and effect in human affairs...


Birthdays and Funerals

katy: Is this really what you’re gonna do for the rest of your life?

boon: What do you mean?

katy: I mean hanging around with a bunch of animals getting drunk every weekend.

boon: No! After I graduate, I’m gonna get drunk every night.

Animal House, 1978 … of course

Your twenty-first birthday is an important milestone. American over-21s can drink legally, “at last,” some would say. Of course, those under age drink as well. As we learn from the exploits of Boon and his fraternity brothers, not all underage drinking is in moderation. In an effort to address the social and public health problems associated with underage drinking, a group of American college presidents have lobbied states to return the minimum legal drinking age (MLDA) to the Vietnam-era threshold of 18...


Make Me a Match, Run Me a Regression

Regression is the tool that masters pick up first, if only to provide a benchmark for more elaborate empirical strategies. Although regression is a many-splendored thing, we think of it as an automated matchmaker. Specifically, regression estimates are weighted averages of multiple matched comparisons of the sort constructed for the groups in our stylized matching matrix (the appendix to this chapter discusses a closely related connection between regression and mathematical expectation).
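The claim that regression estimates are weighted averages of matched comparisons can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions, not the chapter's own calculation: the seven observations, the group labels, and the variable names are hypothetical. It also assumes a regression of earnings on a treatment dummy plus a full set of group dummies. In that case the coefficient on the dummy equals the average of within-group treated-minus-control gaps, weighted by n_g × p_g × (1 − p_g), where n_g is the group size and p_g is the share treated in the group.

```python
from collections import defaultdict

# Hypothetical data (assumed for illustration):
# (group, private-college dummy, earnings in $1000s)
rows = [
    ("A", 1, 110), ("A", 1, 100), ("A", 0, 110),
    ("B", 1, 60), ("B", 0, 30), ("B", 0, 40), ("B", 0, 35),
]

groups = defaultdict(list)
for g, d, y in rows:
    groups[g].append((d, y))

def mean(values):
    return sum(values) / len(values)

# Route 1: regression of Y on D with a dummy for every group.
# By the Frisch-Waugh result this equals the bivariate slope
# after demeaning D and Y within each group.
num = den = 0.0
for members in groups.values():
    d_bar = mean([d for d, _ in members])
    y_bar = mean([y for _, y in members])
    for d, y in members:
        num += (d - d_bar) * (y - y_bar)
        den += (d - d_bar) ** 2
regression_estimate = num / den

# Route 2: weighted average of within-group matched comparisons,
# with weights n_g * p_g * (1 - p_g).
weighted_sum = weight_total = 0.0
for members in groups.values():
    treated = [y for d, y in members if d == 1]
    control = [y for d, y in members if d == 0]
    p = len(treated) / len(members)
    w = len(members) * p * (1 - p)
    weighted_sum += w * (mean(treated) - mean(control))
    weight_total += w
matching_estimate = weighted_sum / weight_total

print(regression_estimate, matching_estimate)
```

The two routes agree up to floating-point error. The weights favor groups where treated and control observations are more evenly balanced, which is one way to read the "automated matchmaker" description.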


The key ingredients in the regression recipe are

■ the dependent variable, in this case, student i’s earnings later in life, also called the outcome variable (denoted by Y_i);

■ the treatment variable, in this case, a dummy variable that indicates students who attended a private college or univers...


Appendix: Standard Errors for Regression DD

Regression DD is a special case of estimation with panel data. A state-year panel consists of repeated observations on states over time. The repetitive structure of such data sets raises special statistical problems. Economic data of this sort typically exhibit a property called serial correlation (that’s serial as in “murder,” not “breakfast”). Serially correlated data are persistent, meaning the values of variables for nearby periods are likely to be similar.
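Persistence of this kind can be seen in a simulated series. The sketch below is an illustration only: it assumes a first-order autoregressive process (each value equals `rho` times the previous value plus fresh noise), which is one common model of serial correlation and not necessarily the one the appendix has in mind. The function names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_ar1(rng, periods, rho, sigma=1.0):
    """Simulate y_t = rho * y_{t-1} + e_t with normal noise, starting from 0."""
    series = []
    prev = 0.0
    for _ in range(periods):
        prev = rho * prev + rng.gauss(0.0, sigma)
        series.append(prev)
    return series

def lag1_autocorrelation(series):
    """Sample correlation between each value and the value one period earlier."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series[:-1], series[1:]))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

rng = random.Random(1)  # fixed seed so the run is reproducible
persistent = simulate_ar1(rng, periods=2000, rho=0.8)   # serially correlated
independent = simulate_ar1(rng, periods=2000, rho=0.0)  # no serial correlation

r_persistent = lag1_autocorrelation(persistent)
r_independent = lag1_autocorrelation(independent)
print(f"lag-1 autocorrelation, rho=0.8: {r_persistent:.2f}")
print(f"lag-1 autocorrelation, rho=0.0: {r_independent:.2f}")
```

With `rho=0.8` the lag-1 autocorrelation comes out well above zero, so neighboring periods carry overlapping information. With `rho=0.0` it sits near zero, which is the independent-observations case that conventional standard error formulas assume.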

We expect serial correlation in time series data like annual unemployment rates. When a state’s unemployment rate is higher than average in one year, it’s likely to be higher than average in the next...
