The Censored Regression Model

Suppose one is interested in the amount one is willing to spend on the purchase of a durable good, for example, a car. In this case, one would observe the expenditures only if the car is bought, so

$$
y_i^* = x_i'\beta + u_i \quad \text{if } y_i^* > 0 \tag{13.50}
$$

where $x_i$ denotes a vector of household characteristics, such as income, number of children or education, and $y_i^*$ is a latent variable, in this case the amount one is willing to spend on a car. We observe $y_i = y_i^*$ only if $y_i^* > 0$, and we set $y_i = 0$ if $y_i^* \le 0$. The censoring at zero is of course arbitrary, and the $u_i$'s are assumed to be IIN$(0, \sigma^2)$. This is known as the Tobit model after Tobin (1958). In this case, we have censored observations since we do not observe any $y_i^*$ that is negative. All we observe is the fact that this household did not buy a car and a corresponding vector $x_i$ of this household's characteristics. Without loss of generality, we assume that the first $n_1$ observations have positive $y_i^*$'s and the remaining $n_0 = n - n_1$ observations have non-positive $y_i^*$'s. In this case, OLS on the first $n_1$ observations, i.e., using only the positive observed $y_i^*$'s, would be biased since $u_i$ does not have zero mean in this subsample. In fact, by omitting observations for which $y_i^* \le 0$ from the sample, one is only considering disturbances from (13.50) such that $u_i > -x_i'\beta$. The distribution of these $u_i$'s is a truncated normal density given in Figure 13.2. The mean of this density is not zero and is dependent on $\beta$, $\sigma^2$ and $x_i$. More formally, the regression function can be written as:

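$$
\begin{aligned}
E(y_i^* \mid x_i, y_i^* > 0) &= x_i'\beta + E[u_i \mid y_i^* > 0] = x_i'\beta + E[u_i \mid u_i > -x_i'\beta] \\
&= x_i'\beta + \sigma\gamma_i \quad \text{for } i = 1, 2, \ldots, n_1
\end{aligned}
\tag{13.51}
$$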

where $\gamma_i = \phi(-z_i)/[1 - \Phi(-z_i)]$ and $z_i = x_i'\beta/\sigma$, with $\phi$ and $\Phi$ denoting the standard normal density and distribution function. See Greene (1993, p. 685) for the moments of a truncated normal density or the Appendix to this chapter. OLS on the positive $y_i^*$'s omits the second term in (13.51), and is therefore biased and inconsistent.
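The correction term in (13.51) can be checked numerically. The following Python sketch (standard library only) computes $\gamma_i$ and compares $\sigma\gamma_i$ with a simulated mean of disturbances truncated at $-x_i'\beta$. The values chosen for $x_i'\beta$ and $\sigma$, and all function names, are hypothetical and serve only as an illustration.

```python
import math
import random

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gamma_i(z):
    """phi(-z) / [1 - Phi(-z)], the term defined below (13.51)."""
    return norm_pdf(-z) / (1.0 - norm_cdf(-z))

def truncated_mean(xb, sigma):
    """E[u | u > -xb] for u ~ N(0, sigma^2), i.e. sigma * gamma_i."""
    return sigma * gamma_i(xb / sigma)

def simulated_truncated_mean(xb, sigma, draws=200_000, seed=0):
    """Average of simulated normal draws that survive the truncation u > -xb."""
    rng = random.Random(seed)
    kept = [u for u in (rng.gauss(0.0, sigma) for _ in range(draws)) if u > -xb]
    return sum(kept) / len(kept)

xb, sigma = 0.5, 2.0  # hypothetical values of x_i'beta and sigma
analytic = truncated_mean(xb, sigma)
simulated = simulated_truncated_mean(xb, sigma)
print("sigma * gamma_i :", analytic)
print("simulated mean  :", simulated)
```

Both numbers are positive, which is the non-zero mean that OLS on the positive observations ignores.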

A simple two-step procedure can be used to estimate (13.51). First, we define a dummy variable $d_i$, which takes the value 1 if $y_i^*$ is observed and 0 otherwise. This allows us to perform probit estimation on the whole sample, and provides us with a consistent estimator of $\beta/\sigma$. Also, $P[d_i = 1] = P[y_i^* > 0] = P[u_i > -x_i'\beta]$ and $P[d_i = 0] = P[y_i^* \le 0] = P[u_i \le -x_i'\beta]$. Therefore, the likelihood function is given by

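$$
\begin{aligned}
\ell &= \prod_{i=1}^{n} [P(u_i \le -x_i'\beta)]^{1-d_i} [P(u_i > -x_i'\beta)]^{d_i} \\
&= \prod_{i=1}^{n} \Phi(z_i)^{d_i} [1 - \Phi(z_i)]^{1-d_i} \quad \text{where } z_i = x_i'\beta/\sigma
\end{aligned}
\tag{13.52}
$$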

Table 13.14 Multinomial Logit Results: Problem Drinking

. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1)

Multinomial logistic regression                   Number of obs   =       9822
                                                  Wald chi2(20)   =    1276.47
                                                  Prob > chi2     =     0.0000
Log likelihood = -3217.481                        Pseudo R2       =     0.1655

------------------------------------------------------------------------------
           y |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2            |
     alc90th |   .1270931     .21395     0.59   0.552    -.2922412    .5464274
        ue88 |   .0458099    .051355     0.89   0.372    -.0548441    .1464639
         age |   .1617634   .0663205     2.44   0.015     .0317776    .2917492
       agesq |  -.0024377   .0007991    -3.05   0.002     -.004004   -.0008714
   schooling |  -.0092135   .0245172    -0.38   0.707    -.0572664    .0388393
     married |   .4004928   .1927458     2.08   0.038      .022718    .7782677
     famsize |   .0622453   .0503686     1.24   0.217    -.0364753    .1609659
       white |   .0391309   .1705625     0.23   0.819    -.2951653    .3734272
   excellent |    2.91833   .4486757     6.50   0.000     2.038942    3.797719
    verygood |   2.978336   .4505932     6.61   0.000      2.09519    3.861483
        good |   2.493939   .4446815     5.61   0.000     1.622379    3.365499
        fair |   1.460263   .4817231     3.03   0.002     .5161027    2.404422
   northeast |   .0849125   .2374365     0.36   0.721    -.3804545    .5502796
     midwest |   .0158816   .2037486     0.08   0.938    -.3834583    .4152215
       south |   .1750244   .2027444     0.86   0.388    -.2223474    .5723962
  centercity |  -.2717445   .1911074    -1.42   0.155    -.6463081    .1028192
    othermsa |  -.0921566   .1929076    -0.48   0.633    -.4702486    .2859354
          q1 |    .422405   .1978767     2.13   0.033     .0345738    .8102362
          q2 |  -.0219499   .2056751    -0.11   0.915    -.4250657    .3811659
          q3 |  -.0365295   .2109049    -0.17   0.862    -.4498954    .3768364
       _cons |  -6.113244   1.427325    -4.28   0.000    -8.910749   -3.315739
-------------+----------------------------------------------------------------
3            |
     alc90th |  -.1534987   .1395003    -1.10   0.271    -.4269144    .1199169
        ue88 |  -.0954848    .033631    -2.84   0.005    -.1614004   -.0295693
         age |    .227164   .0409884     5.54   0.000     .1468282    .3074999
       agesq |  -.0030796   .0004813    -6.40   0.000    -.0040228   -.0021363
   schooling |   .0890537   .0152314     5.85   0.000     .0592008    .1189067
     married |   .7085708   .1219565     5.81   0.000     .4695405    .9476012
     famsize |   .0622447   .0332365     1.87   0.061    -.0028975     .127387
       white |   .7380044   .1083131     6.81   0.000     .5257147    .9502941
   excellent |   3.702792   .1852415    19.99   0.000     3.339725    4.065858
    verygood |   3.653313   .1894137    19.29   0.000     3.282069    4.024557
        good |    2.99946   .1786747    16.79   0.000     2.649264    3.349656
        fair |   1.876172   .1885159     9.95   0.000     1.506688    2.245657
   northeast |    .088966   .1491191     0.60   0.551     -.203302    .3812341
     midwest |   .1230169   .1294376     0.95   0.342     -.130676    .3767099
       south |   .4393047   .1298054     3.38   0.001     .1848908    .6937185
  centercity |  -.2689532   .1231083    -2.18   0.029     -.510241   -.0276654
    othermsa |   .0978701   .1257623     0.78   0.436    -.1486195    .3443598
          q1 |  -.0274086   .1286695    -0.21   0.831    -.2795961     .224779
          q2 |   -.110751    .126176    -0.88   0.380    -.3580514    .1365494
          q3 |  -.0530835   .1296053    -0.41   0.682    -.3071052    .2009382
       _cons |  -6.237275   .8886698    -7.02   0.000    -7.979036   -4.495515
------------------------------------------------------------------------------
(y==1 is the base outcome)

Once $\beta/\sigma$ is estimated from this probit likelihood, we substitute the estimates into $z_i$ and $\gamma_i$, defined below (13.51), to obtain $\hat\gamma_i$. The second step is to estimate (13.51) by OLS using only the positive $y_i^*$'s, with $\hat\gamma_i$ substituted for $\gamma_i$. The resulting estimator of $\beta$ is consistent and asymptotically normal; see Heckman (1976, 1979).
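The two steps can be sketched in Python with the standard library only. This is a minimal illustration on simulated data, not the implementation used by any package. The sample size, the true values of $\beta$ and $\sigma$, the single regressor, and all function names are assumptions made for this example. Step 1 fits the probit by Newton-Raphson; step 2 runs OLS on the positive observations with $\hat\gamma_i$ added as a regressor, so its coefficient estimates $\sigma$.

```python
import math
import random

def pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))  # stable in both tails

def solve(a, b):
    """Solve the linear system a v = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    v = [0.0] * n
    for r in reversed(range(n)):
        v[r] = (m[r][n] - sum(m[r][j] * v[j] for j in range(r + 1, n))) / m[r][r]
    return v

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def probit(X, d, max_iter=50, tol=1e-10):
    """Newton-Raphson for alpha = beta/sigma in P[d_i = 1] = Phi(x_i'alpha)."""
    k = len(X[0])
    alpha = [0.0] * k
    for _ in range(max_iter):
        g = [0.0] * k
        h = [[0.0] * k for _ in range(k)]  # minus the Hessian
        for x, di in zip(X, d):
            z = sum(xj * aj for xj, aj in zip(x, alpha))
            lam = pdf(z) / cdf(z) if di else -pdf(z) / cdf(-z)
            w = lam * (lam + z)
            for i in range(k):
                g[i] += lam * x[i]
                for j in range(k):
                    h[i][j] += w * x[i] * x[j]
        step = solve(h, g)
        alpha = [aj + sj for aj, sj in zip(alpha, step)]
        if max(abs(sj) for sj in step) < tol:
            break
    return alpha

# Simulated data; the true values below are hypothetical.
rng = random.Random(42)
beta_true, sigma_true, n = [1.0, 2.0], 1.5, 10000
X, y, d = [], [], []
for _ in range(n):
    x = [1.0, rng.gauss(0.0, 1.0)]
    y_star = beta_true[0] + beta_true[1] * x[1] + rng.gauss(0.0, sigma_true)
    X.append(x)
    y.append(max(y_star, 0.0))
    d.append(1 if y_star > 0 else 0)

alpha_hat = probit(X, d)  # step 1: consistent for beta/sigma

X1, y1 = [], []
for x, yi, di in zip(X, y, d):
    if di:
        z_hat = sum(xj * aj for xj, aj in zip(x, alpha_hat))
        X1.append(x + [pdf(z_hat) / cdf(z_hat)])  # append gamma_i-hat
        y1.append(yi)

two_step = ols(X1, y1)                 # step 2: [beta_0, beta_1, sigma]
naive = ols([r[:2] for r in X1], y1)   # OLS on positive y's without the correction
print("probit beta/sigma:", alpha_hat)
print("two-step         :", two_step)
print("naive OLS        :", naive)
```

In this design the naive OLS slope is attenuated toward zero, while the two-step slope lies close to the assumed true value.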

Alternatively, one can use maximum likelihood procedures to estimate the Tobit model. Note that we have two sets of observations: (i) the positive $y_i^*$'s with $y_i = y_i^*$, for which we can write the density function $N(x_i'\beta, \sigma^2)$, and (ii) the non-positive $y_i^*$'s, for which we assign $y_i = 0$ with probability

$$
\Pr[y_i = 0] = \Pr[y_i^* \le 0] = \Pr[u_i \le -x_i'\beta] = \Phi(-x_i'\beta/\sigma) = 1 - \Phi(x_i'\beta/\sigma) \tag{13.53}
$$

The probability over the entire censored region gets assigned to the censoring point. This allows us to write the following log-likelihood:

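$$
\log \ell = -\frac{1}{2} \sum_{i=1}^{n_1} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n_1} (y_i - x_i'\beta)^2 + \sum_{i=n_1+1}^{n} \log[1 - \Phi(x_i'\beta/\sigma)] \tag{13.54}
$$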
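As a small illustration, the log-likelihood (13.54) can be coded directly. The sketch below uses only the Python standard library; the function name `tobit_loglik` and the three-observation data set are hypothetical, chosen so that the value can be verified by hand.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tobit_loglik(beta, sigma2, X, y):
    """Log-likelihood (13.54); observations with y_i = 0 are treated as censored."""
    sigma = math.sqrt(sigma2)
    ll = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        if y_i > 0:
            # normal density term for an uncensored observation
            ll += -0.5 * math.log(2.0 * math.pi * sigma2) - (y_i - xb) ** 2 / (2.0 * sigma2)
        else:
            # log[1 - Phi(x_i'beta / sigma)] for a censored observation
            ll += math.log(norm_cdf(-xb / sigma))
    return ll

# Hypothetical data: first column is the constant, second a single regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [0.0, 1.0, 2.5]
value = tobit_loglik([0.0, 1.0], 1.0, X, y)

# Hand calculation: log(0.5) for the censored point, plus two normal density
# terms with residuals 0 and 0.5 and sigma^2 = 1.
by_hand = math.log(0.5) - math.log(2.0 * math.pi) - 0.125
print(value, by_hand)
```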

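Differentiating with respect to $\beta$ and $\sigma^2$, see Maddala (1983, p. 153), one gets

$$
\frac{\partial \log \ell}{\partial \beta} = \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta) x_i}{\sigma^2} - \sum_{i=n_1+1}^{n} \frac{\phi_i x_i}{\sigma [1 - \Phi_i]} \tag{13.55}
$$

$$
\frac{\partial \log \ell}{\partial \sigma^2} = \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)^2}{2\sigma^4} - \frac{n_1}{2\sigma^2} + \sum_{i=n_1+1}^{n} \frac{\phi_i x_i'\beta}{2\sigma^3 (1 - \Phi_i)} \tag{13.56}
$$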

where $\Phi_i$ and $\phi_i$ are evaluated at $z_i = x_i'\beta/\sigma$.
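One way to gain confidence in the derivatives (13.55) and (13.56) is to compare them with central finite differences of the log-likelihood (13.54). The Python sketch below does this on a small, hypothetical data set and parameter point; the function names and the step size `h` are assumptions of this example.

```python
import math

def pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def loglik(beta, s2, X, y):
    """Tobit log-likelihood (13.54); y_i = 0 marks a censored observation."""
    total = 0.0
    for x, yi in zip(X, y):
        xb = sum(a * b for a, b in zip(x, beta))
        if yi > 0:
            total += -0.5 * math.log(2.0 * math.pi * s2) - (yi - xb) ** 2 / (2.0 * s2)
        else:
            total += math.log(cdf(-xb / math.sqrt(s2)))  # log[1 - Phi(x'beta/sigma)]
    return total

def score(beta, s2, X, y):
    """Analytical derivatives (13.55) and (13.56)."""
    s = math.sqrt(s2)
    g_beta, g_s2 = [0.0] * len(beta), 0.0
    for x, yi in zip(X, y):
        xb = sum(a * b for a, b in zip(x, beta))
        if yi > 0:
            e = yi - xb
            for j, xj in enumerate(x):
                g_beta[j] += e * xj / s2
            g_s2 += e * e / (2.0 * s2 * s2) - 1.0 / (2.0 * s2)
        else:
            ratio = pdf(xb / s) / cdf(-xb / s)  # phi_i / (1 - Phi_i)
            for j, xj in enumerate(x):
                g_beta[j] -= ratio * xj / s
            g_s2 += ratio * xb / (2.0 * s ** 3)
    return g_beta, g_s2

def numerical_score(beta, s2, X, y, h=1e-6):
    """Central finite differences of loglik in each element of beta and in sigma^2."""
    g_beta = []
    for j in range(len(beta)):
        up = beta[:j] + [beta[j] + h] + beta[j + 1:]
        dn = beta[:j] + [beta[j] - h] + beta[j + 1:]
        g_beta.append((loglik(up, s2, X, y) - loglik(dn, s2, X, y)) / (2.0 * h))
    g_s2 = (loglik(beta, s2 + h, X, y) - loglik(beta, s2 - h, X, y)) / (2.0 * h)
    return g_beta, g_s2

# Hypothetical data (constant plus one regressor) and evaluation point.
X = [[1.0, -1.0], [1.0, 0.0], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y = [0.0, 0.0, 0.8, 1.7, 3.1]
beta, s2 = [0.3, 1.2], 1.44

analytic = score(beta, s2, X, y)
numeric = numerical_score(beta, s2, X, y)
print("analytic:", analytic)
print("numeric :", numeric)
```

Agreement between the two, up to finite-difference error, also confirms the positive sign on the last sum in (13.56).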

Premultiplying (13.55) by $\beta'/2\sigma^2$ and adding the result to (13.56), one gets

$$
\hat\sigma^2_{MLE} = \sum_{i=1}^{n_1} (y_i - x_i'\beta) y_i / n_1 = Y_1'(Y_1 - X_1\beta)/n_1 \tag{13.57}
$$

where $Y_1$ denotes the $n_1 \times 1$ vector of non-zero observations on $y_i$, and $X_1$ is the $n_1 \times k$ matrix of values of $x_i$ for the non-zero $y_i$'s. Also, after multiplying throughout by $\sigma$, (13.55) can be written as:

$$
-X_0'\gamma_0 + X_1'(Y_1 - X_1\beta)/\sigma = 0 \tag{13.58}
$$

where $X_0$ denotes the $n_0 \times k$ matrix of $x_i$'s for which $y_i$ is zero, and $\gamma_0$ is an $n_0 \times 1$ vector of $\gamma_i$'s $= \phi_i/[1 - \Phi_i]$ evaluated at $z_i = x_i'\beta/\sigma$ for the observations for which $y_i = 0$. Solving (13.58) one gets

$$
\hat\beta_{MLE} = (X_1'X_1)^{-1} X_1'Y_1 - \sigma (X_1'X_1)^{-1} X_0'\gamma_0 \tag{13.59}
$$

Note that the first term in (13.59) is the OLS estimator for the first $n_1$ observations, for which $y_i^*$ is positive.
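The step from (13.55) and (13.56) to (13.57) can be spelled out. The following is a sketch of the algebra with both derivatives set to zero and with $\phi_i$ and $\Phi_i$ evaluated at $z_i = x_i'\beta/\sigma$:

```latex
\begin{aligned}
\frac{\beta'}{2\sigma^2}\,\frac{\partial \log\ell}{\partial\beta}
 &= \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,x_i'\beta}{2\sigma^4}
  - \sum_{i=n_1+1}^{n} \frac{\phi_i\, x_i'\beta}{2\sigma^3(1-\Phi_i)} \\
\frac{\beta'}{2\sigma^2}\,\frac{\partial \log\ell}{\partial\beta}
 + \frac{\partial \log\ell}{\partial\sigma^2}
 &= \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,[x_i'\beta + (y_i - x_i'\beta)]}{2\sigma^4}
  - \frac{n_1}{2\sigma^2} \\
 &= \sum_{i=1}^{n_1} \frac{(y_i - x_i'\beta)\,y_i}{2\sigma^4} - \frac{n_1}{2\sigma^2} = 0
 \;\Longrightarrow\;
 \hat\sigma^2 = \frac{1}{n_1}\sum_{i=1}^{n_1} (y_i - x_i'\beta)\,y_i
\end{aligned}
```

The sums over the censored observations cancel because, after the premultiplication, they enter the two derivatives with opposite signs.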

One can use the Newton-Raphson procedure or the method of scoring to maximize the log-likelihood; for the second derivatives of the log-likelihood, see Maddala (1983, pp. 154-156). The estimates can be computed with the tobit command in Stata. Note that for the Tobit specification, both $\beta$ and $\sigma^2$ are identified. This is in contrast to the logit and probit specifications, where only the ratio $\beta/\sigma$ is identified. Wooldridge (2009, Chapter 17) recommends obtaining the estimates of $\beta/\sigma$ from a probit and comparing them with the Tobit estimates generated by dividing $\hat\beta$ by $\hat\sigma$. If these estimates are different or have different signs, then the Tobit estimation may not be appropriate. Problem 13.17 illustrates Tobit estimation for the married women's labor supply example using the Mroz (1987) data.
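The first-order conditions (13.57) and (13.59) can be verified numerically at a maximum of the likelihood. The Python sketch below is an illustration only. To keep it short it does not use Newton-Raphson or scoring; it substitutes a simple EM iteration, which fills in the censored $y_i^*$'s by their conditional expectations and re-runs least squares. The simulated data, the true parameter values, the restriction to a constant and one regressor, and all function names are assumptions of this example.

```python
import math
import random

def pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def solve2(a, b):
    """Solve a 2x2 linear system a v = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def xtx(X):
    return [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]

def xty(X, v):
    return [sum(r[i] * vi for r, vi in zip(X, v)) for i in range(2)]

def tobit_em(X, y, max_iter=5000, tol=1e-12):
    """EM iteration for the Tobit MLE (a substitute for Newton-Raphson)."""
    A = xtx(X)
    beta, s2 = solve2(A, xty(X, y)), 1.0
    for _ in range(max_iter):
        s = math.sqrt(s2)
        y_fill, extra = [], 0.0
        for x, yi in zip(X, y):
            if yi > 0:
                y_fill.append(yi)
            else:
                xb = x[0] * beta[0] + x[1] * beta[1]
                z = xb / s
                lam = pdf(z) / cdf(-z)                 # phi_i / (1 - Phi_i)
                y_fill.append(xb - s * lam)            # E[y* | y* <= 0]
                extra += s2 * (1.0 - lam * (lam - z))  # Var[y* | y* <= 0]
        new_beta = solve2(A, xty(X, y_fill))
        rss = sum((v - x[0] * new_beta[0] - x[1] * new_beta[1]) ** 2
                  for x, v in zip(X, y_fill))
        new_s2 = (rss + extra) / len(y)
        change = max(abs(new_beta[0] - beta[0]), abs(new_beta[1] - beta[1]),
                     abs(new_s2 - s2))
        beta, s2 = new_beta, new_s2
        if change < tol:
            break
    return beta, s2

# Simulated data; the true values (0.5, 1.0) and sigma = 1 are hypothetical.
rng = random.Random(7)
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(400)]
y = [max(0.5 + 1.0 * x[1] + rng.gauss(0.0, 1.0), 0.0) for x in X]

beta_hat, s2_hat = tobit_em(X, y)
s_hat = math.sqrt(s2_hat)

def fitted(x):
    return x[0] * beta_hat[0] + x[1] * beta_hat[1]

X1 = [x for x, v in zip(X, y) if v > 0]
Y1 = [v for v in y if v > 0]
X0 = [x for x, v in zip(X, y) if not v > 0]

# Right-hand side of (13.57) evaluated at the estimates
s2_check = sum((v - fitted(x)) * v for x, v in zip(X1, Y1)) / len(Y1)

# Right-hand side of (13.59) evaluated at the estimates
gamma0 = [pdf(fitted(x) / s_hat) / cdf(-fitted(x) / s_hat) for x in X0]
A1 = xtx(X1)
ols_part = solve2(A1, xty(X1, Y1))
correction = solve2(A1, xty(X0, gamma0))
beta_check = [o - s_hat * c for o, c in zip(ols_part, correction)]

print("EM estimates :", beta_hat, s2_hat)
print("from (13.59) :", beta_check)
print("from (13.57) :", s2_check)
```

The ratio of `beta_hat` to `s_hat` is the quantity that would be set against probit estimates in the comparison attributed to Wooldridge above.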

Maddala warns that the Tobit specification is not necessarily the right specification every time we have zero observations. It is applicable only in those cases where the latent variable can, in principle, take negative values and the observed zero values are a consequence of censoring and non-observability. In fact, one cannot have negative expenditures on a car, negative hours of work or negative wages. However, one can enter employment and earn wages when one’s observed wage is larger than the reservation wage. Let y* be the difference between observed wage and reservation wage. Only if y* is positive will wages be observed. Final warning: The Tobit specification is heavily reliant on the normality and homoskedasticity assumptions. Failure of these assumptions leads to misleading inference.
