The Censored Regression Model
Suppose one is interested in the amount one is willing to spend on the purchase of a durable good, for example a car. In this case, one would observe the expenditures only if the car is bought, so
$$y_i^* = x_i'\beta + u_i \quad \text{if } y_i^* > 0 \tag{13.50}$$
where $x_i$ denotes a vector of household characteristics, such as income, number of children or education, and $y_i^*$ is a latent variable, in this case the amount one is willing to spend on a car. We observe $y_i = y_i^*$ only if $y_i^* > 0$, and we set $y_i = 0$ if $y_i^* \le 0$. The censoring at zero is of course arbitrary, and the $u_i$'s are assumed to be IIN$(0, \sigma^2)$. This is known as the Tobit model after Tobin (1958). In this case, we have censored observations since we do not observe any $y_i^*$ that is negative. All we observe is the fact that this household did not buy a car and a corresponding vector $x_i$ of this household's characteristics. Without loss of generality, we assume that the first $n_1$ observations have positive $y_i^*$'s and the remaining $n_0 = n - n_1$ observations have non-positive $y_i^*$'s. In this case, OLS on the first $n_1$ observations, i.e., using only the positive observed $y_i^*$'s, would be biased since $u_i$ does not have zero mean in this subsample. In fact, by omitting observations for which $y_i^* \le 0$ from the sample, one is only considering disturbances from (13.50) such that $u_i > -x_i'\beta$. The distribution of these $u_i$'s is a truncated normal density given in Figure 13.2. The mean of this density is not zero and is dependent on $\beta$, $\sigma^2$ and $x_i$. More formally, the regression function can be written as:
$$E(y_i^* \mid x_i, y_i^* > 0) = x_i'\beta + E[u_i \mid y_i^* > 0] = x_i'\beta + E[u_i \mid u_i > -x_i'\beta] \tag{13.51}$$
$$= x_i'\beta + \sigma\gamma_i \quad \text{for } i = 1, 2, \ldots, n_1$$
where $\gamma_i = \phi(-z_i)/[1 - \Phi(-z_i)]$ and $z_i = x_i'\beta/\sigma$. See Greene (1993, p. 685) for the moments of a truncated normal density or the Appendix to this chapter. OLS on the positive $y_i^*$'s omits the second term in (13.51), and is therefore biased and inconsistent.
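As a quick numerical check of the truncated mean in (13.51), the following minimal sketch compares the formula $\sigma\gamma_i$ with the average of simulated disturbances that satisfy $u_i > -x_i'\beta$. The function names and the values chosen for $x_i'\beta$ and $\sigma$ are illustrative assumptions, not part of the text.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal: pdf is phi, cdf is Phi


def truncated_mean(xb, sigma):
    """E[u | u > -xb] for u ~ N(0, sigma^2), i.e. sigma * gamma in (13.51)."""
    z = xb / sigma
    # gamma = phi(-z) / [1 - Phi(-z)], as defined below (13.51)
    gamma = STD_NORMAL.pdf(-z) / (1.0 - STD_NORMAL.cdf(-z))
    return sigma * gamma


def simulated_truncated_mean(xb, sigma, draws=200_000, seed=0):
    """Average of simulated disturbances that survive the truncation u > -xb."""
    rng = random.Random(seed)
    kept = [u for u in (rng.gauss(0.0, sigma) for _ in range(draws)) if u > -xb]
    return fmean(kept)


xb, sigma = 0.5, 2.0  # hypothetical values for x_i'beta and sigma
print("formula   :", round(truncated_mean(xb, sigma), 3))
print("simulation:", round(simulated_truncated_mean(xb, sigma), 3))
```

The two numbers should agree up to simulation noise, and both are positive, which is the source of the bias in OLS on the positive observations.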
A simple two-step procedure can be used to estimate (13.51). First, we define a dummy variable $d_i$, which takes the value 1 if $y_i^*$ is observed and 0 otherwise. This allows us to perform probit estimation on the whole sample, and provides us with a consistent estimator of $(\beta/\sigma)$. Also, $P[d_i = 1] = P[y_i^* > 0] = P[u_i > -x_i'\beta]$ and $P[d_i = 0] = P[y_i^* < 0] = P[u_i < -x_i'\beta]$. Therefore, the likelihood function is given by
$$\ell = \prod_{i=1}^{n} [P(u_i < -x_i'\beta)]^{1-d_i}\,[P(u_i > -x_i'\beta)]^{d_i} \tag{13.52}$$
$$= \prod_{i=1}^{n} \Phi(z_i)^{d_i}\,[1 - \Phi(z_i)]^{1-d_i}$$
where $z_i = x_i'\beta/\sigma$.
Table 13.14 Multinomial Logit Results: Problem Drinking
. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1)
Header statistics: Number of obs, Wald chi2(20), Prob > chi2, Pseudo R2
Log likelihood = -3217.481 (y==1 is the base outcome)
Once $\beta/\sigma$ is estimated, we substitute these estimates in $z_i$ and $\gamma_i$ given below (13.51) to get $\hat\gamma_i$. The second step is to estimate (13.51) using only the positive $y_i^*$'s with $\hat\gamma_i$ substituted for $\gamma_i$. The resulting estimator of $\beta$ is consistent and asymptotically normal, see Heckman (1976, 1979).
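The two steps can be sketched on simulated data as follows. This is only an illustration under stated assumptions: the data-generating values, the function names `probit_newton` and `tobit_two_step`, and the use of NumPy with a hand-written Newton-Raphson probit are choices made here, not the text's. The coefficient on $\hat\gamma_i$ in the second-step regression serves as an estimate of $\sigma$.

```python
from statistics import NormalDist
import numpy as np

_std = NormalDist()
Phi = np.vectorize(_std.cdf)  # standard normal CDF, applied elementwise
phi = np.vectorize(_std.pdf)  # standard normal density, applied elementwise


def probit_newton(X, d, max_iter=50, tol=1e-10):
    """Probit MLE of beta/sigma by Newton-Raphson on the likelihood (13.52)."""
    a = np.zeros(X.shape[1])
    for _ in range(max_iter):
        z = X @ a
        P = np.clip(Phi(z), 1e-12, 1 - 1e-12)
        f = phi(z)
        lam = np.where(d == 1, f / P, -f / (1 - P))  # per-row score factor
        grad = X.T @ lam
        hess = -(X * (lam * (lam + z))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        a = a - step
        if np.max(np.abs(step)) < tol:
            break
    return a


def tobit_two_step(X, y):
    """Step 1: probit on d_i = 1[y_i > 0]. Step 2: OLS of positive y_i on x_i and gamma_hat_i."""
    d = (y > 0).astype(float)
    alpha_hat = probit_newton(X, d)  # estimates beta/sigma
    pos = y > 0
    z_hat = X[pos] @ alpha_hat
    gamma_hat = phi(-z_hat) / (1.0 - Phi(-z_hat))  # gamma_i below (13.51)
    W = np.column_stack([X[pos], gamma_hat])
    coef, *_ = np.linalg.lstsq(W, y[pos], rcond=None)
    return coef[:-1], coef[-1], alpha_hat  # beta_hat, sigma_hat, (beta/sigma)_hat


# Simulated data with hypothetical parameter values
rng = np.random.default_rng(0)
n = 20_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.5, 1.0]), 1.0
y = np.maximum(X @ beta_true + sigma_true * rng.normal(size=n), 0.0)

beta_2s, sigma_2s, alpha_hat = tobit_two_step(X, y)
beta_ols, *_ = np.linalg.lstsq(X[y > 0], y[y > 0], rcond=None)
print("two-step beta:", beta_2s, "sigma:", sigma_2s)
print("OLS on positive y only:", beta_ols)
```

In this design the OLS slope from the positive observations alone should come out well below the true slope, while the two-step estimates should be close to the values used in the simulation.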
Alternatively, one can use maximum likelihood procedures to estimate the Tobit model. Note that we have two sets of observations: (i) the positive $y_i^*$'s with $y_i = y_i^*$, for which we can write the density function $N(x_i'\beta, \sigma^2)$, and (ii) the non-positive $y_i^*$'s for which we assign $y_i = 0$ with probability
$$\Pr[y_i = 0] = \Pr[y_i^* < 0] = \Pr[u_i < -x_i'\beta] = \Phi(-x_i'\beta/\sigma) = 1 - \Phi(x_i'\beta/\sigma) \tag{13.53}$$
The log-likelihood function is therefore
$$\log\ell = -\frac{1}{2}\sum_{i=1}^{n_1}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n_1}(y_i - x_i'\beta)^2 + \sum_{i=n_1+1}^{n}\log[1 - \Phi(x_i'\beta/\sigma)] \tag{13.54}$$
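Differentiating with respect to $\beta$ and $\sigma^2$, see Maddala (1983, p. 153), one gets
$$\frac{\partial\log\ell}{\partial\beta} = \sum_{i=1}^{n_1}\frac{(y_i - x_i'\beta)x_i}{\sigma^2} - \sum_{i=n_1+1}^{n}\frac{\phi_i x_i}{\sigma[1 - \Phi_i]} \tag{13.55}$$
$$\frac{\partial\log\ell}{\partial\sigma^2} = \sum_{i=1}^{n_1}\frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{n_1}{2\sigma^2} + \sum_{i=n_1+1}^{n}\frac{\phi_i x_i'\beta}{2\sigma^3(1 - \Phi_i)} \tag{13.56}$$
where $\Phi_i$ and $\phi_i$ are evaluated at $z_i = x_i'\beta/\sigma$.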
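The log-likelihood (13.54) and its derivatives (13.55) and (13.56) can be checked against each other numerically. The sketch below is an illustration only: it uses simulated data with made-up parameter values, and the helper names `tobit_loglik`, `tobit_score` and `numerical_score` are hypothetical. It compares the analytic derivatives with central finite differences at an arbitrary point.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def tobit_loglik(beta, sigma2, X, y):
    """Log-likelihood (13.54): normal density for y_i > 0, Pr[y_i = 0] for censored rows."""
    sigma = math.sqrt(sigma2)
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        if y_i > 0:
            total += -0.5 * math.log(2 * math.pi * sigma2) - (y_i - xb) ** 2 / (2 * sigma2)
        else:
            total += math.log(1.0 - STD_NORMAL.cdf(xb / sigma))
    return total


def tobit_score(beta, sigma2, X, y):
    """Analytic derivatives (13.55) and (13.56) with respect to beta and sigma^2."""
    sigma = math.sqrt(sigma2)
    d_beta = [0.0] * len(beta)
    d_sigma2 = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        if y_i > 0:
            resid = y_i - xb
            for j, v in enumerate(x_i):
                d_beta[j] += resid * v / sigma2
            d_sigma2 += resid ** 2 / (2 * sigma2 ** 2) - 1.0 / (2 * sigma2)
        else:
            z = xb / sigma
            ratio = STD_NORMAL.pdf(z) / (1.0 - STD_NORMAL.cdf(z))  # phi_i / (1 - Phi_i)
            for j, v in enumerate(x_i):
                d_beta[j] -= ratio * v / sigma
            d_sigma2 += ratio * xb / (2 * sigma ** 3)
    return d_beta, d_sigma2


def numerical_score(beta, sigma2, X, y, h=1e-5):
    """Central finite differences of tobit_loglik, used only as a check."""
    d_beta = []
    for j in range(len(beta)):
        up = list(beta)
        dn = list(beta)
        up[j] += h
        dn[j] -= h
        d_beta.append((tobit_loglik(up, sigma2, X, y) - tobit_loglik(dn, sigma2, X, y)) / (2 * h))
    d_sigma2 = (tobit_loglik(beta, sigma2 + h, X, y) - tobit_loglik(beta, sigma2 - h, X, y)) / (2 * h)
    return d_beta, d_sigma2


# Simulated data: intercept plus one regressor, censored at zero (hypothetical values)
rng = random.Random(1)
X = [(1.0, rng.gauss(0.0, 1.0)) for _ in range(500)]
y = [max(0.0, 0.5 + 1.0 * x1 + rng.gauss(0.0, 1.5)) for _, x1 in X]

beta0, sigma2_0 = [0.3, 0.8], 2.0  # arbitrary evaluation point
print("analytic :", tobit_score(beta0, sigma2_0, X, y))
print("numerical:", numerical_score(beta0, sigma2_0, X, y))
```

Agreement between the two lines of output is a useful sanity check on the signs of the censored-observation terms, which are easy to get wrong.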
Premultiplying (13.55) by $\beta'/2\sigma^2$ and adding the result to (13.56), one gets
$$\hat\sigma^2_{MLE} = \sum_{i=1}^{n_1}(y_i - x_i'\beta)y_i/n_1 = Y_1'(Y_1 - X_1\beta)/n_1 \tag{13.57}$$
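The intermediate algebra behind this expression for $\hat\sigma^2_{MLE}$ can be spelled out as follows, with both derivatives set to zero. The shorthand $\sum_1$ (sum over the $n_1$ observations with $y_i > 0$) and $\sum_0$ (sum over the $n_0$ observations with $y_i = 0$) is introduced here only for compactness.

```latex
% \sum_1 runs over i = 1,...,n_1 (y_i > 0); \sum_0 runs over i = n_1+1,...,n (y_i = 0).
\begin{align*}
\frac{\beta'}{2\sigma^2}\,\frac{\partial\log\ell}{\partial\beta}
  &= \sum\nolimits_1 \frac{(y_i - x_i'\beta)\,x_i'\beta}{2\sigma^4}
   - \sum\nolimits_0 \frac{\phi_i\,x_i'\beta}{2\sigma^3(1-\Phi_i)} = 0 \\
\frac{\partial\log\ell}{\partial\sigma^2}
  &= \sum\nolimits_1 \frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{n_1}{2\sigma^2}
   + \sum\nolimits_0 \frac{\phi_i\,x_i'\beta}{2\sigma^3(1-\Phi_i)} = 0 \\
\text{adding:}\quad 0
  &= \sum\nolimits_1 \frac{(y_i - x_i'\beta)\,[\,x_i'\beta + (y_i - x_i'\beta)\,]}{2\sigma^4}
   - \frac{n_1}{2\sigma^2} \\
\Rightarrow\quad \sigma^2
  &= \frac{1}{n_1}\sum\nolimits_1 (y_i - x_i'\beta)\,y_i
\end{align*}
```

The terms for the censored observations cancel exactly, which is why only the $n_1$ positive observations appear in the result.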
where $Y_1$ denotes the $n_1 \times 1$ vector of non-zero observations on $y_i$, and $X_1$ is the $n_1 \times k$ matrix of values of $x_i$ for the non-zero $y_i$'s. Also, after multiplying throughout by $\sigma$, (13.55) can be written as:
$$-X_0'\gamma_0 + X_1'(Y_1 - X_1\beta)/\sigma = 0 \tag{13.58}$$
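where $X_0$ denotes the $n_0 \times k$ matrix of $x_i$'s for which $y_i$ is zero, and $\gamma_0$ is an $n_0 \times 1$ vector of $\gamma_i$'s $= \phi_i/[1 - \Phi_i]$ evaluated at $z_i = x_i'\beta/\sigma$ for the observations for which $y_i = 0$. Solving (13.58) one gets
$$\hat\beta_{MLE} = (X_1'X_1)^{-1}X_1'Y_1 - \sigma(X_1'X_1)^{-1}X_0'\gamma_0 \tag{13.59}$$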
Note that the first term in (13.59) is the OLS estimator for the first $n_1$ observations for which $y_i^*$ is positive.
One can use the Newton-Raphson procedure or the method of scoring to maximize the log-likelihood; for the second derivatives of the log-likelihood, see Maddala (1983, pp. 154-156). The estimates can be computed with the tobit command in Stata. Note that for the Tobit specification, both $\beta$ and $\sigma^2$ are identified. This is in contrast to the logit and probit specifications, where only the ratio $(\beta/\sigma)$ is identified. Wooldridge (2009, Chapter 17) recommends obtaining the estimates of $(\beta/\sigma)$ from a probit and comparing them with the Tobit estimates generated by dividing $\hat\beta$ by $\hat\sigma$. If these estimates differ markedly or have different signs, then the Tobit estimation may not be appropriate. Problem 13.17 illustrates the Tobit estimation for the married women's labor supply example using the Mroz (1987) data.
Maddala warns that the Tobit specification is not necessarily the right specification every time we have zero observations. It is applicable only in those cases where the latent variable can, in principle, take negative values and the observed zero values are a consequence of censoring and non-observability. In fact, one cannot have negative expenditures on a car, negative hours of work or negative wages. However, one can enter employment and earn wages when one’s observed wage is larger than the reservation wage. Let y* be the difference between observed wage and reservation wage. Only if y* is positive will wages be observed. Final warning: The Tobit specification is heavily reliant on the normality and homoskedasticity assumptions. Failure of these assumptions leads to misleading inference.