CENSORED OR TRUNCATED REGRESSION MODEL (TOBIT MODEL)

Tobin (1958) proposed the following important model:

(13.6.1) $y_i^* = x_i'\beta + u_i$

and

(13.6.2) $y_i = \begin{cases} x_i'\beta + u_i & \text{if } y_i^* > 0 \\ 0 & \text{if } y_i^* \le 0, \end{cases} \qquad i = 1, 2, \ldots, n,$

where $\{u_i\}$ are assumed to be i.i.d. $N(0, \sigma^2)$ and $x_i$ is a known nonstochastic vector. It is assumed that $\{y_i\}$ and $\{x_i\}$ are observed for all $i$, but $\{y_i^*\}$ are unobserved if $y_i^* \le 0$. This model is called the censored regression model or the Tobit model (after Tobin, in analogy to probit). If the observations corresponding to $y_i^* \le 0$ are totally lost, that is, if $\{x_i\}$ are not observed whenever $y_i^* \le 0$, and if the researcher does not know how many observations exist for which $y_i^* \le 0$, the model is called the truncated regression model.

Tobin used this model to explain a household’s expenditure (y) on a durable good in a given year as a function of independent variables (x), including the price of the durable good and the household’s income. The above model is necessitated by the fact that there are likely to be many households for which the expenditure is zero. The variable y* may be interpreted as the desired amount of expenditure, and it is hypothesized that a household does not buy the durable good if the desired expenditure is zero or negative (a negative expenditure is not possible). The Tobit model has been used in many areas of economics. Amemiya (1985), p. 365, lists several representative applications.

If there is a single independent variable $x$, the observed data on $y$ and $x$ in the Tobit model will normally look like Figure 13.1. It is apparent there that the LS estimator of the slope coefficient obtained by regressing all the $y$'s (including those that are zeroes) on $x$ will be biased and inconsistent. Although not apparent from the figure, it can further be shown that the LS estimator using only the positive $y$'s is also biased and inconsistent.

Figure 13.1 An example of censored data
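The two sources of LS bias can be checked by simulation. The following is a minimal sketch, not part of Tobin's analysis: it assumes a single regressor drawn uniformly on [-2, 2], a zero intercept, a true slope of 1, and unit error variance. The helper names `ols_slope` and `simulate_censored` are hypothetical.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs (intercept included)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def simulate_censored(n, alpha, beta, sigma, seed):
    """Draw (x, y) from y* = alpha + beta*x + u, with y = y* if y* > 0, else 0."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)  # assumed design for the regressor
        y_star = alpha + beta * x + rng.gauss(0.0, sigma)
        xs.append(x)
        ys.append(y_star if y_star > 0 else 0.0)
    return xs, ys

TRUE_BETA = 1.0
xs, ys = simulate_censored(n=20000, alpha=0.0, beta=TRUE_BETA, sigma=1.0, seed=0)

# LS on all observations, zeroes included
slope_all = ols_slope(xs, ys)

# LS on the positive observations only
pos = [(x, y) for x, y in zip(xs, ys) if y > 0]
slope_pos = ols_slope([p[0] for p in pos], [p[1] for p in pos])

print("true slope:", TRUE_BETA)
print("LS slope, all y:", slope_all)
print("LS slope, positive y only:", slope_pos)
```

Under these assumptions both LS slopes should fall noticeably below the true slope, and enlarging the sample does not remove the gap, which is what inconsistency means here.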

A consistent and asymptotically efficient estimator of $\beta$ and $\sigma^2$ in the Tobit model is the maximum likelihood estimator. The likelihood function of the model is given by

(13.6.3) $L = \prod_0 \left[1 - \Phi(x_i'\beta/\sigma)\right] \prod_1 \sigma^{-1} \phi\left[(y_i - x_i'\beta)/\sigma\right],$

where $\Phi$ and $\phi$ are the standard normal distribution and density functions and $\prod_0$ and $\prod_1$ stand for the product over those $i$ for which $y_i = 0$ and $y_i > 0$, respectively. It is a peculiar product of a probability and a density, yet the maximum likelihood estimator can be shown to be consistent and asymptotically normal. For the proof, see Amemiya (1973). Olsen (1978) proved the global concavity of (13.6.3).
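The logarithm of the Tobit likelihood is straightforward to code. The sketch below is illustrative only: it assumes a scalar regressor with an intercept, simulated data with true slope 1, and a crude grid search over the slope with the intercept and $\sigma$ held at their true values. A serious application would maximize over all parameters with a numerical optimizer. The function names are hypothetical.

```python
import math
import random

def log_norm_pdf(z):
    """Log of the standard normal density at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tobit_loglik(alpha, beta, sigma, xs, ys):
    """Log of the Tobit likelihood for the index alpha + beta*x."""
    ll = 0.0
    for x, y in zip(xs, ys):
        index = alpha + beta * x
        if y > 0:
            # density term: sigma^{-1} * phi((y - index) / sigma)
            ll += log_norm_pdf((y - index) / sigma) - math.log(sigma)
        else:
            # probability term: 1 - Phi(index / sigma) = Phi(-index / sigma)
            ll += math.log(norm_cdf(-index / sigma))
    return ll

# Simulated censored data (assumed design: x uniform on [-2, 2], slope 1, sigma 1)
rng = random.Random(1)
xs, ys = [], []
for _ in range(5000):
    x = rng.uniform(-2.0, 2.0)
    y_star = 1.0 * x + rng.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(max(y_star, 0.0))

# Crude grid search over the slope only, for illustration
grid = [0.05 * k for k in range(41)]  # 0.00, 0.05, ..., 2.00
beta_hat = max(grid, key=lambda b: tobit_loglik(0.0, b, 1.0, xs, ys))
print("grid maximizer of the Tobit log-likelihood:", beta_hat)
```

Writing the censored term as $\Phi(-x_i'\beta/\sigma)$ rather than $1 - \Phi(x_i'\beta/\sigma)$ is a numerical convenience; the two expressions are equal.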

The likelihood function of the truncated regression model can be written as

(13.6.4) $L = \prod_1 \Phi(x_i'\beta/\sigma)^{-1} \sigma^{-1} \phi\left[(y_i - x_i'\beta)/\sigma\right].$

Amemiya (1973) proved the consistency and the asymptotic normality of the maximum likelihood estimator of this model as well.
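The truncated likelihood differs from the censored one in that every retained observation is divided by the probability of being retained. The following sketch assumes a single regressor, a zero intercept, and unit variance, and again uses a grid search over the slope alone purely for illustration. The names are hypothetical.

```python
import math
import random

def log_norm_pdf(z):
    """Log of the standard normal density at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def truncated_loglik(alpha, beta, sigma, xs, ys):
    """Log of the truncated regression likelihood; all ys must be positive."""
    ll = 0.0
    for x, y in zip(xs, ys):
        index = alpha + beta * x
        ll += (log_norm_pdf((y - index) / sigma) - math.log(sigma)
               - math.log(norm_cdf(index / sigma)))  # divide by P(y* > 0)
    return ll

# Simulate y* = x + u and keep only the pairs with y* > 0 (the rest are lost)
rng = random.Random(4)
xs_pos, ys_pos = [], []
for _ in range(20000):
    x = rng.uniform(-2.0, 2.0)
    y_star = 1.0 * x + rng.gauss(0.0, 1.0)
    if y_star > 0:
        xs_pos.append(x)
        ys_pos.append(y_star)

grid = [0.05 * k for k in range(41)]  # 0.00, 0.05, ..., 2.00
beta_hat = max(grid, key=lambda b: truncated_loglik(0.0, b, 1.0, xs_pos, ys_pos))
print("retained observations:", len(ys_pos))
print("grid maximizer of the truncated log-likelihood:", beta_hat)
```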

The Tobit maximum likelihood estimator is consistent even when $\{u_i\}$ are serially correlated (see Robinson, 1982). It loses its consistency, however, when the true distribution of $\{u_i\}$ is either nonnormal or heteroscedastic. For discussion of these cases, see Amemiya (1985, section 10.5).

Many generalizations of the Tobit model have been used in empirical research. Amemiya (1985, section 10.6) classifies them into five broad types, of which we shall discuss only Types 2 and 5.

Type 2 Tobit is the simplest natural generalization of the Tobit model (Type 1) and is defined by

(13.6.5) $y_{1i}^* = x_{1i}'\beta_1 + u_{1i}, \qquad y_{2i}^* = x_{2i}'\beta_2 + u_{2i}$

and

(13.6.6) $y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0 \\ 0 & \text{if } y_{1i}^* \le 0, \end{cases} \qquad i = 1, 2, \ldots, n,$

where $(u_{1i}, u_{2i})$ are i.i.d. drawings from a bivariate normal distribution with zero means, variances $\sigma_1^2$ and $\sigma_2^2$, and covariance $\sigma_{12}$. It is assumed that only the sign of $y_{1i}^*$ is observed and that $y_{2i}^*$ is observed only when $y_{1i}^* > 0$. It is assumed that $x_{1i}$ are observed for all $i$ but that $x_{2i}$ need not be observed for those $i$ such that $y_{1i}^* \le 0$.

The likelihood function of this model is given by

(13.6.7) $L = \prod_0 P(y_{1i}^* \le 0) \prod_1 f(y_{2i} \mid y_{1i}^* > 0)\, P(y_{1i}^* > 0),$

where $\prod_0$ and $\prod_1$ stand for the product over those $i$ for which $y_{2i} = 0$ and $y_{2i} \ne 0$, respectively, and $f(\cdot \mid y_{1i}^* > 0)$ stands for the conditional density of $y_{2i}^*$ given $y_{1i}^* > 0$.

The Tobit model (Type 1) is a special case of the Type 2 Tobit, in which $y_{1i}^* = y_{2i}^*$. Since a test of Type 1 versus Type 2 cannot be translated as a test about the parameters of the Type 2 Tobit model, the choice between the two models must be made in a nonclassical way. Another special case of Type 2 is when $u_{1i}$ and $u_{2i}$ are independent. In this case the LS regression of the positive $y_{2i}$ on $x_{2i}$ yields the maximum likelihood estimators of $\beta_2$ and $\sigma_2^2$, while the probit maximum likelihood estimator applied to the first equation of (13.6.5) yields the maximum likelihood estimator of $\beta_1/\sigma_1$.
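The role of independence in this special case can be illustrated by simulation. The sketch below assumes, hypothetically, that the same scalar regressor appears in both equations with slopes equal to 1 and unit variances, and it compares the LS slope on the selected observations when the errors are independent with the slope when their correlation is 0.8. It illustrates only the LS part of the statement, not the probit step.

```python
import math
import random

def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs (intercept included)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def selected_sample(n, rho, seed):
    """Type 2 Tobit draws with y1* = x + u1 and y2* = x + u2, corr(u1, u2) = rho.

    Returns the (x, y2*) pairs for the observations with y1* > 0.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        u1 = e1
        u2 = rho * e1 + math.sqrt(1.0 - rho * rho) * e2  # unit variance, corr rho
        if x + u1 > 0:  # y2* is observed only when y1* > 0
            xs.append(x)
            ys.append(x + u2)
    return xs, ys

slope_indep = ols_slope(*selected_sample(20000, rho=0.0, seed=2))
slope_corr = ols_slope(*selected_sample(20000, rho=0.8, seed=2))
print("LS slope on selected sample, independent errors:", slope_indep)
print("LS slope on selected sample, correlated errors:", slope_corr)
```

With independent errors the selection rule carries no information about $u_{2i}$, so the LS slope should be close to the true value of 1. With correlated errors it should be visibly attenuated in this design.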

In the work of Gronau (1973), $y_{1i}^*$ represents the offered wage minus the reservation wage (the lowest wage the worker is willing to accept) and $y_{2i}^*$ represents the offered wage. Only when the offered wage exceeds the reservation wage do we observe the actual wage, which is equal to the offered wage. In the work of Dudley and Montmarquette (1976), $y_{1i}^*$ signifies the measure of the U.S. inclination to give aid to the $i$th country, so that aid is given if $y_{1i}^*$ is positive, and $y_{2i}^*$ determines the actual amount of aid.

Type 5 Tobit is defined by

(13.6.8) $y_{ji}^* = x_{ji}'\beta_j + u_{ji},$

$z_{ji}^* = s_{ji}'\gamma_j + v_{ji},$

$y_i = y_{ji}^* \ \text{ if } \ z_{ji}^* = \max_k z_{ki}^*, \qquad j = 1, 2, \ldots, J, \quad i = 1, 2, \ldots, n,$

where $y_i$, $x_{ji}$, $s_{ji}$ are observed. It is assumed that $(u_{ji}, v_{ji})$ are i.i.d. across $i$ but may be correlated across $j$, and for each $i$ and $j$ the two random variables may be correlated with each other. Their joint distribution is variously specified by researchers. In some applications the maximum in (13.6.8) can be replaced by the minimum without changing the essential features of the model.
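The observation rule of the Type 5 Tobit may be easier to see in a small data-generating sketch. The code below makes several simplifying assumptions that are not part of the model: the regressors are scalars shared across $j$, all errors are independent standard normal, and the coefficient values are arbitrary. The function `simulate_type5` and its record layout are hypothetical.

```python
import random

def simulate_type5(n, betas, gammas, seed):
    """Draw n observations from a simplified Type 5 Tobit with J = len(betas).

    For each i, the outcome y_i is the latent y*_{ji} of the alternative j
    whose latent z*_{ji} is the largest among the J alternatives.
    """
    rng = random.Random(seed)
    J = len(betas)
    records = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)  # assumed scalar regressor in the y equations
        s = rng.uniform(-1.0, 1.0)  # assumed scalar regressor in the z equations
        y_star = [betas[j] * x + rng.gauss(0.0, 1.0) for j in range(J)]
        z_star = [gammas[j] * s + rng.gauss(0.0, 1.0) for j in range(J)]
        j_max = max(range(J), key=lambda j: z_star[j])
        records.append({
            "y": y_star[j_max],    # observed outcome
            "regime": j_max,       # which alternative attained the maximum
            "x": x,
            "s": s,
            "latent_y": y_star,    # kept only so the rule can be verified
            "latent_z": z_star,
        })
    return records

data = simulate_type5(1000, betas=[1.0, -1.0, 0.5], gammas=[0.0, 1.0, -1.0], seed=3)
counts = [sum(1 for r in data if r["regime"] == j) for j in range(3)]
print("observations per regime:", counts)
```

In real data only `y`, `regime`, and the regressors would be available; the latent values are returned here so that the selection rule can be checked directly.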

The likelihood function of the model is given by

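(13.6.9) $L = \prod_1 f(y_{1i}^* \mid z_{1i}^* \text{ is the maximum})\, P_{1i} \times \prod_2 f(y_{2i}^* \mid z_{2i}^* \text{ is the maximum})\, P_{2i} \times \cdots \times \prod_J f(y_{Ji}^* \mid z_{Ji}^* \text{ is the maximum})\, P_{Ji},$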

where $\prod_j$ is the product over those $i$ for which $z_{ji}^*$ is the maximum and $P_{ji} = P(z_{ji}^* \text{ is the maximum})$.

In the model of Lee (1978), $J = 2$, $y = z$, and $z_{1i}^*$ represents the wage rate of the $i$th worker in case he joins the union and $z_{2i}^*$ in case he does not. The researcher observes the actual wage rate $y_i$, which is the greater of the two. (We have slightly simplified Lee's model.) The disequilibrium model defined by (13.3.14), (13.3.15), and (13.3.16) becomes this type if we assume sample separation. In the model of Duncan (1980), $z_{ji}^*$ is the net profit accruing to the $i$th firm from the plant to be built in the $j$th location, and $y_{ji}^*$ is the input-output vector at the $j$th location. In the model of Dubin and McFadden (1984), $z_{ji}^*$ is the utility of the $j$th portfolio of the electric and gas appliances of the $i$th household, and $y_{ji}^*$ (vector) consists of the gas and electricity consumption associated with the $j$th portfolio in the $i$th household.