# The Quantile Regression Model

The starting point for quantile regression is the conditional quantile function (CQF). Suppose we are interested in the distribution of a continuously distributed random variable, $Y_i$, with a well-behaved density (no gaps or spikes). Then the CQF at quantile $\tau$ given a vector of regressors, $X_i$, can be defined as:

$$Q_\tau(Y_i \mid X_i) = F_y^{-1}(\tau \mid X_i)$$

where $F_y(y \mid X_i)$ is the distribution function for $Y_i$ conditional on $X_i$. When $\tau = .10$, for example, $Q_\tau(Y_i \mid X_i)$ describes the lower decile of $Y_i$ given $X_i$, while $\tau = .5$ gives us the conditional median. By looking at changes in the CQF of earnings as a function of education, we can tell whether the dispersion in earnings goes up or down with schooling. By looking at changes in the CQF of earnings as a function of education and time, we can tell whether the relationship between schooling and inequality is changing over time.

The CQF is the conditional-quantile version of the CEF. Recall that the CEF can be derived as the solution to a mean-squared error prediction problem,

$$E[Y_i \mid X_i] = \arg\min_{m(X_i)} E\left[(Y_i - m(X_i))^2\right].$$

In the same spirit, the CQF solves the following minimization problem,

$$Q_\tau(Y_i \mid X_i) = \arg\min_{q(X)} E\left[\rho_\tau(Y_i - q(X_i))\right], \tag{7.1.1}$$

where $\rho_\tau(u) = (\tau - 1(u < 0))u$ is called the "check function" because it looks like a check-mark when you plot it. If $\tau = .5$, this becomes least absolute deviations because $\rho_{.5}(u) = \frac{1}{2}(\operatorname{sign} u)u = \frac{1}{2}|u|$. In this case, $Q_\tau(Y_i \mid X_i)$ is the conditional median since the conditional median minimizes absolute deviations. Otherwise, the check function weights positive and negative terms asymmetrically:

$$\rho_\tau(u) = 1(u > 0) \cdot \tau |u| + 1(u < 0) \cdot (1 - \tau)|u|.$$

This asymmetric weighting generates a minimand that picks out conditional quantiles (a fact that’s not immediately obvious but can be proved with a little work; see Koenker, 2005).
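The unconditional version of this fact is easy to check numerically: the constant $q$ that minimizes the average check loss over a sample is a sample quantile (the first-order condition is $F(q) - \tau = 0$). The sketch below is an illustration only. The simulated data, the seed, and the function names `check_loss` and `argmin_check` are assumptions made for this example, and the search is restricted to the sample points because the average check loss is piecewise linear with kinks at the data.

```python
import random


def check_loss(u, tau):
    """Check function rho_tau(u) = (tau - 1(u < 0)) * u."""
    return (tau - (1.0 if u < 0 else 0.0)) * u


def argmin_check(ys, tau):
    """Return the sample point q that minimizes the average check loss."""
    def avg_loss(q):
        return sum(check_loss(y - q, tau) for y in ys) / len(ys)
    return min(ys, key=avg_loss)


random.seed(0)
ys = [random.gauss(0.0, 1.0) for _ in range(499)]  # hypothetical sample
ranked = sorted(ys)

q10 = argmin_check(ys, 0.10)
q50 = argmin_check(ys, 0.50)

# With 499 draws, the minimizers should be the 50th and 250th order statistics.
print(q10, ranked[49])
print(q50, ranked[249])
```

With an odd sample size such as 499, $n\tau$ is not an integer for either value of $\tau$, so the minimizer is unique and coincides with a single order statistic.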

As a practical tool, the CQF shares the disadvantages of the CEF with continuous or high-dimensional $X_i$: it may be hard to estimate and summarize. We'd therefore like to boil this function down to a small set of numbers, one for each element of $X_i$. Quantile regression accomplishes this by substituting a linear model for $q(X_i)$ in (7.1.1), producing

$$\beta_\tau \equiv \arg\min_{b \in \mathbb{R}^d} E\left[\rho_\tau(Y_i - X_i'b)\right]. \tag{7.1.2}$$

The quantile regression estimator, $\hat{\beta}_\tau$, is the sample analog of (7.1.2). It turns out this is a linear programming problem that is fairly easy (for computers) to solve.
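To make the linear programming connection concrete, here is a minimal sketch for the special case of an intercept and a single regressor. It is not how production software solves the problem. It relies on a standard property of linear programs: an optimum is attained at a vertex, which here means a line passing through two data points. The code simply tries every such line and keeps the one with the smallest check loss. The function name `quantreg_line`, the simulated schooling and log-wage data, and the parameter values are all hypothetical.

```python
import random
from itertools import combinations


def check_loss(u, tau):
    """Check function rho_tau(u) = (tau - 1(u < 0)) * u."""
    return (tau - (1.0 if u < 0 else 0.0)) * u


def quantreg_line(xs, ys, tau):
    """Brute-force quantile regression of y on (1, x).

    Enumerates the lines through each pair of points with distinct x and
    returns the (intercept, slope) with the smallest total check loss.
    Cost grows with n**3, so this is only sensible for small samples.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a candidate fit
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(check_loss(y - intercept - slope * x, tau)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


random.seed(1)
xs = [random.uniform(8, 18) for _ in range(80)]          # hypothetical schooling
ys = [5.0 + 0.1 * x + random.gauss(0, 0.5) for x in xs]  # hypothetical log wages

for tau in (0.1, 0.5, 0.9):
    intercept, slope = quantreg_line(xs, ys, tau)
    print(tau, round(intercept, 3), round(slope, 3))
```

A simplex-type solver reaches the same kind of vertex without visiting every pair, which is why the problem stays tractable with many observations and regressors.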

Just as OLS fits a linear model to $Y_i$ by minimizing expected squared error, quantile regression fits a linear model to $Y_i$ using the asymmetric loss function, $\rho_\tau(\cdot)$. If $Q_\tau(Y_i \mid X_i)$ is in fact linear, the quantile regression minimand will find it (just as if the CEF is linear, OLS will find it). The original quantile regression model, introduced by Koenker and Bassett (1978), was motivated by the assumption that the CQF is linear. As it turns out, however, the assumption of a linear CQF is unnecessary: quantile regression is useful whether or not we believe this.

Before turning to a more general theoretical discussion of quantile regression, we illustrate the use of this tool to study the wage distribution. The motivation for the use of quantile regression to look at the wage distribution comes from labor economists' interest in the question of how inequality varies conditional on covariates like education and experience (see, e.g., Buchinsky, 1994). The overall gap in earnings by schooling group (e.g., the college/high-school differential) grew considerably in the 1980s and 1990s. Less clear, however, is how the wage distribution has been changing within education and experience groups. Many labor economists believe that increases in so-called "within-group inequality" provide especially strong evidence of fundamental changes in the labor market, not easily accounted for by changes in institutional features like the percent of workers who belong to labor unions.

Table 7.1.1: Quantile regression coefficients for schooling in the 1980, 1990, and 2000 Censuses

| Census | Obs. | Desc. stats: Mean | Desc. stats: SD | QR 0.1 | QR 0.25 | QR 0.5 | QR 0.75 | QR 0.9 | OLS coeff. | OLS root MSE |
|---|---|---|---|---|---|---|---|---|---|---|
| 1980 | 65023 | 6.4 | 0.67 | .074 (.002) | .074 (.001) | .068 (.001) | .070 (.001) | .079 (.001) | .072 (.001) | 0.63 |
| 1990 | 86785 | 6.46 | 0.06 | .112 (.003) | .110 (.001) | .106 (.001) | .111 (.001) | .137 (.003) | .114 (.001) | 0.64 |
| 2000 | 97397 | 6.5 | 0.75 | .092 (.002) | .105 (.001) | .111 (.001) | .120 (.001) | .157 (.004) | .114 (.001) | 0.69 |

Notes: Adapted from Angrist, Chernozhukov, and Fernandez-Val (2006). The table reports quantile regression estimates of the returns to schooling, with OLS estimates shown at the right for comparison. The sample includes US-born white and black men aged 40-49. Standard errors are reported in parentheses. All models control for race and potential experience. Sampling weights were used for the 2000 Census estimates.

Table 7.1.1 reports schooling coefficients from quantile regressions estimated using the 1980, 1990, and 2000 Censuses. The models used to construct these estimates control for race and a quadratic function of potential labor market experience (defined to be age − education − 6). The .5 quantile coefficients (for the conditional median) are very much like the OLS coefficients at the far right-hand side of the table. For example, the OLS estimate of .072 in the 1980 Census is not very different from the .5 quantile coefficient of about .068 in the same data. If the conditional-on-covariates distribution of log wages is symmetric, so that the conditional median equals the conditional mean, we should expect these two coefficients to be the same. Also noteworthy is the fact that the quantile coefficients are similar across quantiles in 1980. An additional year of schooling raises median wages by 6.8 percent, with slightly higher effects on the lower and upper quartiles of .074 and .070. Although the estimated returns to schooling increased sharply between 1980 and 1990 (up to .106 at the median, with an OLS return of .114), there is a reasonably stable pattern of returns across quantiles in the 1990 Census. The largest effect is on the upper decile, a coefficient of .137, while the other quantile coefficients are around .11.

We should expect to see constant coefficients across quantiles if the effect of schooling on wages amounts to what is sometimes called a "location shift." Here, that means that as higher schooling levels raise average earnings, other parts of the wage distribution move in tandem (i.e., within-group inequality does not change). Suppose, for example, that log wages can be described by a classical linear regression model:

$$Y_i \sim N(X_i'\beta, \sigma_\varepsilon^2), \tag{7.1.3}$$

where $E[Y_i \mid X_i] = X_i'\beta$ and $Y_i - X_i'\beta \equiv \varepsilon_i$ is a Normally distributed error with constant variance $\sigma_\varepsilon^2$. Homoskedasticity means the conditional distribution of log wages is no more spread out for college graduates than for high school graduates. The implications of the linear homoskedastic model for quantiles are apparent from the fact that

$$P\left[Y_i - X_i'\beta < \sigma_\varepsilon \Phi^{-1}(\tau) \mid X_i\right] = \tau,$$

where $\Phi^{-1}(\tau)$ is the inverse of the standard Normal CDF. From this we conclude that $Q_\tau(Y_i \mid X_i) = X_i'\beta + \sigma_\varepsilon \Phi^{-1}(\tau)$. In other words, apart from the changing intercept, all quantile regression coefficients are the same. The results in Table 7.1.1 for 1980 and 1990 are not too far from this stylized representation.
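The probability statement behind the location-shift result can be checked by simulation. The sketch below draws log wages from a homoskedastic Normal model at two schooling levels and reports the share of draws falling below the conditional mean plus the error standard deviation times the inverse standard Normal CDF at $\tau$. If the argument above is right, that share should be close to $\tau$ in both schooling groups. The parameter values (intercept 5.0, slope 0.10, error standard deviation 0.6), the schooling levels 12 and 16, and the function name `share_below_quantile` are assumptions chosen for illustration, not estimates from the Census data.

```python
import random
from statistics import NormalDist


def share_below_quantile(x, tau, beta0, beta1, sigma, n, rng):
    """Simulate Y | X = x ~ N(beta0 + beta1 * x, sigma**2) and return the
    share of draws below beta0 + beta1 * x + sigma * Phi^{-1}(tau)."""
    cutoff = beta0 + beta1 * x + sigma * NormalDist().inv_cdf(tau)
    draws = (beta0 + beta1 * x + rng.gauss(0.0, sigma) for _ in range(n))
    return sum(y < cutoff for y in draws) / n


rng = random.Random(2)  # fixed seed so the run is reproducible
shares = {
    (x, tau): share_below_quantile(x, tau, 5.0, 0.10, 0.6, 50_000, rng)
    for x in (12, 16)          # hypothetical years of schooling
    for tau in (0.1, 0.5, 0.9)
}

for (x, tau), share in shares.items():
    print(f"schooling={x}, tau={tau}: simulated share below = {share:.3f}")
```

Because the cutoff differs across schooling groups only through the conditional mean, the quantile lines in this model are parallel: the slope is the same at every $\tau$ and only the intercept moves.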

In contrast with the simple pattern in 1980 and 1990 Census data, quantile regression estimates from the 2000 Census differ markedly across quantiles, especially in the right tail. An additional year of schooling raises the lower decile of wages by 9.2 percent and the median by 11.1 percent. In contrast, an additional year of schooling raises the upper decile by 15.7 percent. Thus, in addition to increases in overall inequality in the 1980s and 1990s (a fact we know from simple descriptive statistics), by 2000, inequality began to increase with education as well. This development is the subject of considerable discussion among labor economists, who are particularly concerned with whether it points to fundamental or institutional changes in the labor market (see, e.g., Autor, Katz, and Kearney (2005) and Lemieux (2008)).

Again, a parametric example helps us see where an increasing pattern of quantile regression coefficients might come from. We can generate increasing quantile regression coefficients by adding heteroskedasticity to the classical Normal regression model, (7.1.3). Suppose that

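$$Y_i \sim N(X_i'\beta, \sigma^2(X_i)),$$

where $\sigma^2(X_i) = (\lambda'X_i)^2$ and $\lambda$ is a vector of positive coefficients such that $\lambda'X_i > 0$ (perhaps proportional to $\beta$, so that the conditional variance grows with the conditional mean). Then

$$P\left[Y_i - X_i'\beta < (\lambda'X_i)\Phi^{-1}(\tau) \mid X_i\right] = \tau,$$

with the implication that

$$Q_\tau(Y_i \mid X_i) = X_i'\beta + (\lambda'X_i)\Phi^{-1}(\tau) = X_i'\left[\beta + \lambda\Phi^{-1}(\tau)\right], \tag{7.1.4}$$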

so that quantile regression coefficients increase across quantiles.
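A small worked calculation shows how the coefficients in (7.1.4) move with the quantile. The sketch below computes the mean coefficients plus the scale coefficients times the inverse standard Normal CDF for several values of $\tau$. The coefficient values in `beta` and `lam` are hypothetical numbers picked for illustration, with the first element playing the role of the intercept and the second the schooling coefficient. They are not taken from Table 7.1.1.

```python
from statistics import NormalDist

beta = (5.0, 0.10)  # hypothetical mean coefficients: intercept, schooling
lam = (0.20, 0.03)  # hypothetical positive scale coefficients


def quantile_coefs(tau, beta, lam):
    """Quantile regression coefficients implied by the heteroskedastic
    Normal model: beta + lam * Phi^{-1}(tau), element by element."""
    z = NormalDist().inv_cdf(tau)
    return tuple(b + l * z for b, l in zip(beta, lam))


for tau in (0.1, 0.25, 0.5, 0.75, 0.9):
    intercept, slope = quantile_coefs(tau, beta, lam)
    print(f"tau={tau}: intercept={intercept:.3f}, schooling slope={slope:.3f}")

# Homoskedastic special case: a zero scale coefficient on schooling leaves
# the slope unchanged across quantiles, so only the intercept shifts.
for tau in (0.1, 0.9):
    print(tau, quantile_coefs(tau, beta, (0.6, 0.0)))
```

The slope rises with $\tau$ whenever the scale coefficient on schooling is positive, and setting that coefficient to zero recovers the location-shift case in which only the intercept changes.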

Putting the pieces together, Table 7.1.1 neatly summarizes two stories, both related to variation in within-group inequality. First, results from the 2000 Census show inequality increasing sharply with education. The increase is asymmetric, however, and appears much more clearly in the upper tail of the wage distribution. Second, this increase is a new development. In 1980 and 1990, in contrast, schooling affected the wage distribution in a manner roughly consistent with a simple location shift.