# The Censored Regression Model

Suppose one is interested in the amount one is willing to spend on the purchase of a durable good, for example, a car. In this case, one would observe the expenditures only if the car is bought, so

$$y_i^* = x_i'\beta + u_i \quad \text{if } y_i^* > 0 \tag{13.50}$$

where $x_i$ denotes a vector of household characteristics, such as income, number of children or education, and $y_i^*$ is a latent variable, in this case the amount one is willing to spend on a car. We observe $y_i = y_i^*$ only if $y_i^* > 0$, and we set $y_i = 0$ if $y_i^* \le 0$. The censoring at zero is of course arbitrary, and the $u_i$'s are assumed to be IIN$(0, \sigma^2)$. This is known as the Tobit model after Tobin (1958). In this case, we have censored observations since we do not observe any $y_i^*$ that is negative. All we observe is the fact that this household did not buy a car and a corresponding vector $x_i$ of this household's characteristics. Without loss of generality, we assume that the first $n_1$ observations have positive $y_i^*$'s and the remaining $n_0 = n - n_1$ observations have non-positive $y_i^*$'s. In this case, OLS on the first $n_1$ observations, i.e., using only the positive observed $y_i^*$'s, would be biased since $u_i$ does not have zero mean in this subsample. In fact, by omitting observations for which $y_i^* \le 0$ from the sample, one is only considering disturbances from (13.50) such that $u_i > -x_i'\beta$. The distribution of these $u_i$'s is a truncated normal density given in Figure 13.2. The mean of this density is not zero and depends on $\beta$, $\sigma^2$ and $x_i$. More formally, the regression function can be written as:

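$$
\begin{aligned}
E(y_i^* \mid x_i, y_i^* > 0) &= x_i'\beta + E[u_i \mid y_i^* > 0] = x_i'\beta + E[u_i \mid u_i > -x_i'\beta] \\
&= x_i'\beta + \sigma\gamma_i \quad \text{for } i = 1, 2, \ldots, n_1
\end{aligned}
\tag{13.51}
$$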

where $\gamma_i = \phi(-z_i)/[1 - \Phi(-z_i)]$ and $z_i = x_i'\beta/\sigma$, with $\phi$ and $\Phi$ denoting the standard normal density and distribution function. See Greene (1993, p. 685) for the moments of a truncated normal density or the Appendix to this chapter. OLS on the positive $y_i^*$'s omits the second term in (13.51), and is therefore biased and inconsistent.
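The truncated mean in (13.51) can be checked numerically. The following is a minimal sketch, not part of the original text, that compares $\sigma\gamma_i$ with the average of simulated disturbances satisfying $u_i > -x_i'\beta$. The values of `xb` and `sigma`, the number of draws and the seed are arbitrary assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

N01 = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def gamma(z):
    """gamma_i = phi(-z) / [1 - Phi(-z)], as defined below (13.51)."""
    return N01.pdf(-z) / (1.0 - N01.cdf(-z))


def truncated_mean_formula(xb, sigma):
    """E[u | u > -x'beta] = sigma * gamma(x'beta / sigma)."""
    return sigma * gamma(xb / sigma)


def truncated_mean_simulated(xb, sigma, draws=200_000, seed=0):
    """Average of simulated N(0, sigma^2) draws that satisfy u > -x'beta."""
    rng = random.Random(seed)
    kept = [u for u in (rng.gauss(0.0, sigma) for _ in range(draws)) if u > -xb]
    return fmean(kept)


# Hypothetical values of x'beta and sigma, chosen only for illustration.
xb, sigma = 0.5, 2.0
formula = truncated_mean_formula(xb, sigma)
simulated = truncated_mean_simulated(xb, sigma)
print(formula, simulated)
```

Both numbers are positive and close to each other, which is the sense in which the disturbances of the retained observations do not have zero mean.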

A simple two-step procedure can be used to estimate (13.51). First, we define a dummy variable $d_i$, which takes the value 1 if $y_i^*$ is observed and 0 otherwise. This allows us to perform probit estimation on the whole sample, and provides us with a consistent estimator of $\beta/\sigma$. Also, $P[d_i = 1] = P[y_i^* > 0] = P[u_i > -x_i'\beta]$ and $P[d_i = 0] = P[y_i^* \le 0] = P[u_i \le -x_i'\beta]$. Therefore, the likelihood function is given by

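$$
\begin{aligned}
\ell &= \prod_{i=1}^{n} [P(u_i \le -x_i'\beta)]^{1-d_i}\,[P(u_i > -x_i'\beta)]^{d_i} \\
&= \prod_{i=1}^{n} \Phi(z_i)^{d_i}\,[1 - \Phi(z_i)]^{1-d_i} \quad \text{where } z_i = x_i'\beta/\sigma
\end{aligned}
\tag{13.52}
$$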

Table 13.14 Multinomial Logit Results: Problem Drinking

`. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1)`

Log likelihood = -3217.481

| Outcome | Variable | Coef. | Std. Err. | z | P>\|z\| | 95% CI lower | 95% CI upper |
|---|---|---|---|---|---|---|---|
| y = 2 | alc90th | .1270931 | .21395 | 0.59 | 0.552 | -.2922412 | .5464274 |
| y = 2 | ue88 | .0458099 | .051355 | 0.89 | 0.372 | -.0548441 | .1464639 |
| y = 2 | age | .1617634 | .0663205 | 2.44 | 0.015 | .0317776 | .2917492 |
| y = 2 | agesq | -.0024377 | .0007991 | -3.05 | 0.002 | -.004004 | -.0008714 |
| y = 2 | schooling | -.0092135 | .0245172 | -0.38 | 0.707 | -.0572664 | .0388393 |
| y = 2 | married | .4004928 | .1927458 | 2.08 | 0.038 | .022718 | .7782677 |
| y = 2 | famsize | .0622453 | .0503686 | 1.24 | 0.217 | -.0364753 | .1609659 |
| y = 2 | white | .0391309 | .1705625 | 0.23 | 0.819 | -.2951653 | .3734272 |
| y = 2 | excellent | 2.91833 | .4486757 | 6.50 | 0.000 | 2.038942 | 3.797719 |
| y = 2 | verygood | 2.978336 | .4505932 | 6.61 | 0.000 | 2.09519 | 3.861483 |
| y = 2 | good | 2.493939 | .4446815 | 5.61 | 0.000 | 1.622379 | 3.365499 |
| y = 2 | fair | 1.460263 | .4817231 | 3.03 | 0.002 | .5161027 | 2.404422 |
| y = 2 | northeast | .0849125 | .2374365 | 0.36 | 0.721 | -.3804545 | .5502796 |
| y = 2 | midwest | .0158816 | .2037486 | 0.08 | 0.938 | -.3834583 | .4152215 |
| y = 2 | south | .1750244 | .2027444 | 0.86 | 0.388 | -.2223474 | .5723962 |
| y = 2 | centercity | -.2717445 | .1911074 | -1.42 | 0.155 | -.6463081 | .1028192 |
| y = 2 | othermsa | -.0921566 | .1929076 | -0.48 | 0.633 | -.4702486 | .2859354 |
| y = 2 | q1 | .422405 | .1978767 | 2.13 | 0.033 | .0345738 | .8102362 |
| y = 2 | q2 | -.0219499 | .2056751 | -0.11 | 0.915 | -.4250657 | .3811659 |
| y = 2 | q3 | -.0365295 | .2109049 | -0.17 | 0.862 | -.4498954 | .3768364 |
| y = 2 | _cons | -6.113244 | 1.427325 | -4.28 | 0.000 | -8.910749 | -3.315739 |
| y = 3 | alc90th | -.1534987 | .1395003 | -1.10 | 0.271 | -.4269144 | .1199169 |
| y = 3 | ue88 | -.0954848 | .033631 | -2.84 | 0.005 | -.1614004 | -.0295693 |
| y = 3 | age | .227164 | .0409884 | 5.54 | 0.000 | .1468282 | .3074999 |
| y = 3 | agesq | -.0030796 | .0004813 | -6.40 | 0.000 | -.0040228 | -.0021363 |
| y = 3 | schooling | .0890537 | .0152314 | 5.85 | 0.000 | .0592008 | .1189067 |
| y = 3 | married | .7085708 | .1219565 | 5.81 | 0.000 | .4695405 | .9476012 |
| y = 3 | famsize | .0622447 | .0332365 | 1.87 | 0.061 | -.0028975 | .127387 |
| y = 3 | white | .7380044 | .1083131 | 6.81 | 0.000 | .5257147 | .9502941 |
| y = 3 | excellent | 3.702792 | .1852415 | 19.99 | 0.000 | 3.339725 | 4.065858 |
| y = 3 | verygood | 3.653313 | .1894137 | 19.29 | 0.000 | 3.282069 | 4.024557 |
| y = 3 | good | 2.99946 | .1786747 | 16.79 | 0.000 | 2.649264 | 3.349656 |
| y = 3 | fair | 1.876172 | .1885159 | 9.95 | 0.000 | 1.506688 | 2.245657 |
| y = 3 | northeast | .088966 | .1491191 | 0.60 | 0.551 | -.203302 | .3812341 |
| y = 3 | midwest | .1230169 | .1294376 | 0.95 | 0.342 | -.130676 | .3767099 |
| y = 3 | south | .4393047 | .1298054 | 3.38 | 0.001 | .1848908 | .6937185 |
| y = 3 | centercity | -.2689532 | .1231083 | -2.18 | 0.029 | -.510241 | -.0276654 |
| y = 3 | othermsa | .0978701 | .1257623 | 0.78 | 0.436 | -.1486195 | .3443598 |
| y = 3 | q1 | -.0274086 | .1286695 | -0.21 | 0.831 | -.2795961 | .224779 |
| y = 3 | q2 | -.110751 | .126176 | -0.88 | 0.380 | -.3580514 | .1365494 |
| y = 3 | q3 | -.0530835 | .1296053 | -0.41 | 0.682 | -.3071052 | .2009382 |
| y = 3 | _cons | -6.237275 | .8886698 | -7.02 | 0.000 | -7.979036 | -4.495515 |

(y==1 is the base outcome)

Once $\beta/\sigma$ is estimated from the probit, we substitute these estimates in $z_i$ and $\gamma_i$ given below (13.51) to get $\hat{\gamma}_i$. The second step is to estimate (13.51) using only the positive $y_i^*$'s with $\hat{\gamma}_i$ substituted for $\gamma_i$. The resulting estimator of $\beta$ is consistent and asymptotically normal; see Heckman (1976, 1979).
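The two steps can be sketched on simulated data. This is an illustrative sketch under stated assumptions, not the procedure used in any particular package. The data generating values (`beta`, `sigma`, `n`), the seed, and the helper names `solve`, `ols` and `probit` are all hypothetical, and the probit is fitted with a plain Newton-Raphson loop written for this example.

```python
import random
from statistics import NormalDist

N01 = NormalDist()


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty)


def probit(X, d, iters=20):
    """Newton-Raphson on the probit likelihood (13.52); estimates beta/sigma."""
    k = len(X[0])
    a = [0.0] * k
    for _ in range(iters):
        g = [0.0] * k                       # score vector
        H = [[0.0] * k for _ in range(k)]   # minus the Hessian
        for xi, di in zip(X, d):
            q = 2 * di - 1
            z = sum(aj * xj for aj, xj in zip(a, xi))
            lam = q * N01.pdf(z) / N01.cdf(q * z)
            w = lam * (lam + z)
            for r in range(k):
                g[r] += lam * xi[r]
                for c in range(k):
                    H[r][c] += w * xi[r] * xi[c]
        a = [aj + sj for aj, sj in zip(a, solve(H, g))]
    return a


# Simulated data under assumed (hypothetical) parameter values.
rng = random.Random(1)
beta, sigma, n = (0.5, 1.0), 1.0, 5000
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(n)]
ystar = [beta[0] + beta[1] * x[1] + rng.gauss(0.0, sigma) for x in X]
d = [1 if v > 0 else 0 for v in ystar]

# Step 1: probit on the whole sample.
alpha = probit(X, d)

# Step 2: OLS on the positive observations, adding gamma_hat as a regressor.
pos = [i for i in range(n) if d[i] == 1]


def gamma_hat(i):
    z = alpha[0] * X[i][0] + alpha[1] * X[i][1]
    return N01.pdf(-z) / (1.0 - N01.cdf(-z))


y1 = [ystar[i] for i in pos]
two_step = ols([X[i] + [gamma_hat(i)] for i in pos], y1)
naive = ols([X[i] for i in pos], y1)   # omits the second term in (13.51)
print("two-step:", two_step)
print("naive OLS:", naive)
```

In this design the coefficient on $\hat{\gamma}_i$ estimates $\sigma$, and the naive slope should be attenuated relative to the two-step slope. The OLS standard errors from the second step are not computed here.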

Alternatively, one can use maximum likelihood procedures to estimate the Tobit model. Note that we have two sets of observations: (i) the positive $y_i^*$'s with $y_i = y_i^*$, for which we can write the density function $N(x_i'\beta, \sigma^2)$, and (ii) the non-positive $y_i^*$'s for which we assign $y_i = 0$ with probability

$$\Pr[y_i = 0] = \Pr[y_i^* \le 0] = \Pr[u_i \le -x_i'\beta] = \Phi(-x_i'\beta/\sigma) = 1 - \Phi(x_i'\beta/\sigma) \tag{13.53}$$
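As a quick numerical illustration of (13.53), the sketch below compares the formula with the share of simulated latent values that fall at or below zero. The values of `xb` and `sigma`, the number of draws and the seed are assumptions made for this example only.

```python
import random
from statistics import NormalDist

N01 = NormalDist()


def prob_censored(xb, sigma):
    """Pr[y_i = 0] = 1 - Phi(x'beta / sigma), as in (13.53)."""
    return 1.0 - N01.cdf(xb / sigma)


def share_censored(xb, sigma, draws=100_000, seed=0):
    """Fraction of simulated y* = x'beta + u that are non-positive."""
    rng = random.Random(seed)
    zeros = sum(1 for _ in range(draws) if xb + rng.gauss(0.0, sigma) <= 0)
    return zeros / draws


# Hypothetical values of x'beta and sigma.
xb, sigma = 1.0, 2.0
p_formula = prob_censored(xb, sigma)
p_simulated = share_censored(xb, sigma)
print(p_formula, p_simulated)
```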

The probability over the entire censored region gets assigned to the censoring point. This allows us to write the following log-likelihood:

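$$
\begin{aligned}
\log \ell = {} & -\frac{1}{2}\sum_{i=1}^{n_1} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n_1} (y_i - x_i'\beta)^2 \\
& + \sum_{i=n_1+1}^{n} \log[1 - \Phi(x_i'\beta/\sigma)]
\end{aligned}
\tag{13.54}
$$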
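The log-likelihood (13.54) translates directly into code. The following is a minimal sketch that evaluates it for given parameter values. The function name `tobit_loglik` and the toy data are hypothetical, and observations are treated as censored whenever $y_i = 0$.

```python
import math
from statistics import NormalDist

N01 = NormalDist()


def tobit_loglik(beta, sigma2, X, y):
    """Evaluate (13.54): normal density terms for y > 0, censoring terms for y = 0."""
    sigma = math.sqrt(sigma2)
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        if yi > 0:
            # contribution of an uncensored observation
            total += -0.5 * math.log(2 * math.pi * sigma2) - (yi - xb) ** 2 / (2 * sigma2)
        else:
            # contribution of a censored observation: log Pr[y_i = 0]
            total += math.log(1.0 - N01.cdf(xb / sigma))
    return total


# Hypothetical toy data: an intercept and one regressor, two censored rows.
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 0.9]]
y = [0.8, 2.1, 0.0, 0.0]
value = tobit_loglik([0.5, 1.0], 1.0, X, y)
print(value)
```

A numerical optimizer could be applied to a function of this form, although the sketch itself only evaluates the likelihood.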

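Differentiating with respect to $\beta$ and $\sigma^2$, see Maddala (1983, p. 153), one gets

$$\frac{\partial \log \ell}{\partial \beta} = \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,x_i}{\sigma^2} - \sum_{i=n_1+1}^{n} \frac{\phi_i\, x_i}{\sigma[1 - \Phi_i]} \tag{13.55}$$

$$\frac{\partial \log \ell}{\partial \sigma^2} = \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{n_1}{2\sigma^2} + \sum_{i=n_1+1}^{n} \frac{\phi_i\, x_i'\beta}{2\sigma^3(1 - \Phi_i)} \tag{13.56}$$

where $\Phi_i$ and $\phi_i$ are evaluated at $z_i = x_i'\beta/\sigma$.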
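One way to check the signs and scaling of these derivatives is to compare them with central finite differences of the log-likelihood. The sketch below does this on a small hypothetical data set. The function names, the toy data, the evaluation point and the step size `h` are all assumptions of this example.

```python
import math
from statistics import NormalDist

N01 = NormalDist()


def xb_of(beta, xi):
    return sum(b * v for b, v in zip(beta, xi))


def loglik(beta, sigma2, X, y):
    """Tobit log-likelihood (13.54)."""
    sigma = math.sqrt(sigma2)
    total = 0.0
    for xi, yi in zip(X, y):
        xb = xb_of(beta, xi)
        if yi > 0:
            total += -0.5 * math.log(2 * math.pi * sigma2) - (yi - xb) ** 2 / (2 * sigma2)
        else:
            total += math.log(1.0 - N01.cdf(xb / sigma))
    return total


def score(beta, sigma2, X, y):
    """Analytic derivatives (13.55) and (13.56)."""
    sigma = math.sqrt(sigma2)
    d_beta = [0.0] * len(beta)
    d_sigma2 = 0.0
    for xi, yi in zip(X, y):
        xb = xb_of(beta, xi)
        if yi > 0:
            e = yi - xb
            for j, v in enumerate(xi):
                d_beta[j] += e * v / sigma2
            d_sigma2 += e * e / (2 * sigma2 ** 2) - 1.0 / (2 * sigma2)
        else:
            z = xb / sigma
            hazard = N01.pdf(z) / (1.0 - N01.cdf(z))  # phi_i / (1 - Phi_i)
            for j, v in enumerate(xi):
                d_beta[j] -= hazard * v / sigma
            d_sigma2 += hazard * xb / (2 * sigma ** 3)
    return d_beta, d_sigma2


def numeric_score(beta, sigma2, X, y, h=1e-6):
    """Central finite differences of the log-likelihood."""
    d_beta = []
    for j in range(len(beta)):
        up, dn = list(beta), list(beta)
        up[j] += h
        dn[j] -= h
        d_beta.append((loglik(up, sigma2, X, y) - loglik(dn, sigma2, X, y)) / (2 * h))
    d_sigma2 = (loglik(beta, sigma2 + h, X, y) - loglik(beta, sigma2 - h, X, y)) / (2 * h)
    return d_beta, d_sigma2


# Hypothetical toy data and evaluation point.
X = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7], [1.0, 0.9]]
y = [0.8, 2.1, 0.0, 0.0]
beta, sigma2 = [0.3, 0.8], 1.5
analytic = score(beta, sigma2, X, y)
numeric = numeric_score(beta, sigma2, X, y)
print(analytic)
print(numeric)
```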

Premultiplying (13.55) by $\beta'/2\sigma^2$ and adding the result to (13.56), one gets

$$\hat{\sigma}^2_{MLE} = \sum_{i=1}^{n_1} (y_i - x_i'\beta)\,y_i / n_1 = Y_1'(Y_1 - X_1\beta)/n_1 \tag{13.57}$$

where $Y_1$ denotes the $n_1 \times 1$ vector of non-zero observations on $y_i$, and $X_1$ is the $n_1 \times k$ matrix of values of $x_i$ for the non-zero $y_i$'s. Also, after multiplying throughout by $\sigma$, (13.55) can be written as:

$$-X_0'\gamma_0 + X_1'(Y_1 - X_1\beta)/\sigma = 0 \tag{13.58}$$

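where $X_0$ denotes the $n_0 \times k$ matrix of $x_i$'s for which $y_i$ is zero, and $\gamma_0$ is an $n_0 \times 1$ vector of $\gamma_i$'s, here $\gamma_i = \phi_i/[1 - \Phi_i]$, evaluated at $z_i = x_i'\beta/\sigma$ for the observations for which $y_i = 0$. Solving (13.58) one gets

$$\hat{\beta}_{MLE} = (X_1'X_1)^{-1}X_1'Y_1 - \sigma (X_1'X_1)^{-1}X_0'\gamma_0 \tag{13.59}$$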

Note that the first term in (13.59) is the OLS estimator for the first $n_1$ observations, for which $y_i^*$ is positive.
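The step from (13.55) and (13.56) to (13.57) is compact, so it may help to write it out. The following worked derivation is a sketch that sets both first-order conditions to zero. The terms involving the censored observations cancel, which is why only the first $n_1$ observations appear in (13.57).

```latex
\begin{aligned}
0 &= \frac{\beta'}{2\sigma^2}\,\frac{\partial \log \ell}{\partial \beta}
     + \frac{\partial \log \ell}{\partial \sigma^2} \\
  &= \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,x_i'\beta}{2\sigma^4}
     - \sum_{i=n_1+1}^{n} \frac{\phi_i\, x_i'\beta}{2\sigma^3 (1 - \Phi_i)}
     + \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)^2}{2\sigma^4}
     - \frac{n_1}{2\sigma^2}
     + \sum_{i=n_1+1}^{n} \frac{\phi_i\, x_i'\beta}{2\sigma^3 (1 - \Phi_i)} \\
  &= \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,y_i}{2\sigma^4} - \frac{n_1}{2\sigma^2}
  \quad\Longrightarrow\quad
  \sigma^2 = \frac{1}{n_1} \sum_{i=1}^{n_1} (y_i - x_i'\beta)\,y_i .
\end{aligned}
```

The last line uses $(y_i - x_i'\beta)\,x_i'\beta + (y_i - x_i'\beta)^2 = (y_i - x_i'\beta)\,y_i$.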

One can use the Newton-Raphson procedure or the method of scoring to maximize the likelihood; for the second derivatives of the log-likelihood, see Maddala (1983, pp. 154-156). The maximum likelihood estimates can be computed with the `tobit` command in Stata. Note that for the Tobit specification, both $\beta$ and $\sigma^2$ are identified. This is in contrast to the logit and probit specifications, where only the ratio $\beta/\sigma$ is identified. Wooldridge (2009, Chapter 17) recommends obtaining the estimates of $\beta/\sigma$ from a probit and comparing them with the Tobit estimates generated by dividing $\hat{\beta}$ by $\hat{\sigma}$. If these estimates are different or have different signs, then the Tobit estimation may not be appropriate. Problem 13.17 illustrates the Tobit estimation for the married women labor supply example using the Mroz (1987) data.

Maddala warns that the Tobit specification is not necessarily the right specification every time we have zero observations. It is applicable only in those cases where the latent variable can, in principle, take negative values and the observed zero values are a consequence of censoring and non-observability. In fact, one cannot have negative expenditures on a car, negative hours of work or negative wages. However, one can enter employment and earn wages when one’s observed wage is larger than the reservation wage. Let y* be the difference between observed wage and reservation wage. Only if y* is positive will wages be observed. Final warning: The Tobit specification is heavily reliant on the normality and homoskedasticity assumptions. Failure of these assumptions leads to misleading inference.