Limited Dependent Variables

13.1 The Linear Probability Model

$y_i$    $u_i$               Prob.
1        $1 - x_i'\beta$     $\pi_i$
0        $-x_i'\beta$        $1 - \pi_i$

a. Let $\pi_i = \Pr[y_i = 1]$, then $y_i = 1$ when $u_i = 1 - x_i'\beta$ with probability $\pi_i$ as shown in the table above. Similarly, $y_i = 0$ when $u_i = -x_i'\beta$ with probability $1 - \pi_i$. Hence, $E(u_i) = \pi_i (1 - x_i'\beta) + (1 - \pi_i)(-x_i'\beta)$.

For this to equal zero, we get $\pi_i - \pi_i x_i'\beta + \pi_i x_i'\beta - x_i'\beta = 0$, which gives $\pi_i = x_i'\beta$ as required.

b. $\mathrm{var}(u_i) = E(u_i^2) = (1 - x_i'\beta)^2 \pi_i + (-x_i'\beta)^2 (1 - \pi_i)$

$= [1 - 2x_i'\beta + (x_i'\beta)^2]\pi_i + (x_i'\beta)^2 (1 - \pi_i)$

$= \pi_i - 2x_i'\beta\,\pi_i + (x_i'\beta)^2 = \pi_i - \pi_i^2 = \pi_i(1 - \pi_i) = x_i'\beta(1 - x_i'\beta)$ using the fact that $\pi_i = x_i'\beta$.
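The two results above can be checked numerically. The following is a minimal sketch, not part of the original solution: the function name `lpm_error_moments` and the trial values of $x_i'\beta$ are illustrative assumptions. It enumerates the two-point distribution of $u_i$ and compares its mean and variance with $0$ and $\pi_i(1 - \pi_i)$.

```python
def lpm_error_moments(p):
    """Mean and variance of u_i in the linear probability model when x_i'beta = pi_i = p."""
    # u_i equals 1 - p with probability p, and -p with probability 1 - p
    outcomes = [(1.0 - p, p), (-p, 1.0 - p)]
    mean = sum(u * prob for u, prob in outcomes)
    var = sum((u - mean) ** 2 * prob for u, prob in outcomes)
    return mean, var

# Hypothetical values of x_i'beta, chosen only for illustration
for p in (0.1, 0.366, 0.9):
    mean, var = lpm_error_moments(p)
    print(f"x'b = {p:.3f}: E(u) = {mean:.6f}, var(u) = {var:.6f}, p(1-p) = {p * (1 - p):.6f}")
```

The variance changes with $x_i'\beta$, which is the heteroskedasticity of the linear probability model derived in part (b).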

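13.2 a. Since there are no slopes and only a constant, $x_i'\beta = \alpha$ and (13.16) becomes

$\log \ell = \sum_{i=1}^{n} \{ y_i \log F(\alpha) + (1 - y_i) \log[1 - F(\alpha)] \}$.

Differentiating with respect to $\alpha$ we get

$\dfrac{\partial \log \ell}{\partial \alpha} = \sum_{i=1}^{n} \dfrac{y_i}{F(\alpha)} f(\alpha) + \sum_{i=1}^{n} \dfrac{(1 - y_i)}{1 - F(\alpha)} (-f(\alpha))$.

Setting this equal to zero yields $\sum_{i=1}^{n} (y_i - F(\alpha)) f(\alpha) = 0$.

Therefore, $F(\hat{\alpha}) = \sum_{i=1}^{n} y_i / n = \bar{y}$. This is the proportion of the sample with $y_i = 1$.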

B. H. Baltagi, Solutions Manual for Econometrics, Springer Texts in Business and Economics, DOI 10.1007/978-3-642-54548-1_13, © Springer-Verlag Berlin Heidelberg 2015

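b. Using $F(\hat{\alpha}) = \bar{y}$, the value of the maximized likelihood, from (13.16), is

$\log \ell_r = \sum_{i=1}^{n} \{ y_i \log \bar{y} + (1 - y_i) \log(1 - \bar{y}) \} = n\bar{y} \log \bar{y} + (n - n\bar{y}) \log(1 - \bar{y})$

$= n[\bar{y} \log \bar{y} + (1 - \bar{y}) \log(1 - \bar{y})]$ as required.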

c. For the empirical example in Sect. 13.9, we know that $\bar{y} = 218/595 = 0.366$. Substituting in (13.33) we get $\log \ell_r = n[0.366 \log 0.366 + (1 - 0.366) \log(1 - 0.366)] = -390.918$.
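The arithmetic in part (c) can be reproduced in a few lines. This is a minimal sketch using only the counts quoted above (218 union members out of 595 observations); the function name `constant_only_loglik` is an illustrative assumption.

```python
import math

def constant_only_loglik(n_ones, n):
    """Maximized log-likelihood of a binary model with only a constant: n[ybar log ybar + (1 - ybar) log(1 - ybar)]."""
    ybar = n_ones / n
    return n * (ybar * math.log(ybar) + (1 - ybar) * math.log(1 - ybar))

loglik_r = constant_only_loglik(218, 595)
print(f"ybar = {218 / 595:.3f}, restricted log-likelihood = {loglik_r:.3f}")
```

Because the result depends only on $\bar{y}$ and $n$, the same value applies whether $F$ is the logistic or the normal distribution function.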

13.3 Union participation example. See Tables 13.3-13.5. These were run using EViews.

a. OLS ESTIMATION

LS // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                1.195872      0.227010       5.267922   0.0000
EX              -0.001974      0.001726      -1.143270   0.2534
WKS             -0.017809      0.003419      -5.209226   0.0000
OCC              0.318118      0.046425       6.852287   0.0000
IND              0.030048      0.038072       0.789229   0.4303
SOUTH           -0.170130      0.039801      -4.274471   0.0000
SMSA             0.084522      0.038464       2.197419   0.0284
MS               0.098953      0.063781       1.551453   0.1213
FEM             -0.108706      0.079266      -1.371398   0.1708
ED              -0.016187      0.008592      -1.883924   0.0601
BLK              0.050197      0.071130       0.705708   0.4807

R-squared              0.233548    Mean dependent var       0.366387
Adjusted R-squared     0.220424    S.D. dependent var       0.482222
S.E. of regression     0.425771    Akaike info criterion   -1.689391
Sum squared resid      105.8682    Schwarz criterion       -1.608258
Log likelihood        -330.6745    F-statistic              17.79528
Durbin-Watson stat     1.900963    Prob(F-statistic)        0.000000

LOGIT ESTIMATION

LOGIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 4 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                4.380828      1.338629       3.272624   0.0011
EX              -0.011143      0.009691      -1.149750   0.2507
WKS             -0.108126      0.021428      -5.046037   0.0000
OCC              1.658222      0.264456       6.270325   0.0000
IND              0.181818      0.205470       0.884888   0.3766
SOUTH           -1.044332      0.241107      -4.331411   0.0000
SMSA             0.448389      0.218289       2.054110   0.0404
MS               0.604999      0.365043       1.657336   0.0980
FEM             -0.772222      0.489665      -1.577040   0.1153
ED              -0.090799      0.049227      -1.844501   0.0656
BLK              0.355706      0.394794       0.900992   0.3680

Log likelihood    -312.3367
Obs with Dep=1     218
Obs with Dep=0     377

Variable      Mean All     Mean D=1     Mean D=0
C             1.000000     1.000000     1.000000
EX            22.85378     23.83028     22.28912
WKS           46.45210     45.27982     47.12997
OCC           0.512605     0.766055     0.366048
IND           0.405042     0.513761     0.342175
SOUTH         0.292437     0.197248     0.347480
SMSA          0.642017     0.646789     0.639257
MS            0.805042     0.866972     0.769231
FEM           0.112605     0.059633     0.143236
ED            12.84538     11.84862     13.42175
BLK           0.072269     0.082569     0.066313

PROBIT ESTIMATION

PROBIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 3 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                2.516784      0.762606       3.300242   0.0010
EX              -0.006932      0.005745      -1.206501   0.2281
WKS             -0.060829      0.011785      -5.161707   0.0000
OCC              0.955490      0.152136       6.280522   0.0000
IND              0.092827      0.122773       0.756089   0.4499
SOUTH           -0.592739      0.139100      -4.261243   0.0000
SMSA             0.260701      0.128629       2.026756   0.0431
MS               0.350520      0.216282       1.620664   0.1056
FEM             -0.407026      0.277034      -1.469226   0.1423
ED              -0.057382      0.028842      -1.989533   0.0471
BLK              0.226482      0.228843       0.989683   0.3227

Log likelihood    -313.3795
Obs with Dep=1     218
Obs with Dep=0     377

d. Dropping the industry variable (IND).

OLS ESTIMATION

LS // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                1.216753      0.225390       5.398425   0.0000
EX              -0.001848      0.001718      -1.075209   0.2827
WKS             -0.017874      0.003417      -5.231558   0.0000
OCC              0.322215      0.046119       6.986568   0.0000
SOUTH           -0.173339      0.039580      -4.379418   0.0000
SMSA             0.085043      0.038446       2.212014   0.0274
MS               0.100697      0.063722       1.580267   0.1146
FEM             -0.114088      0.078947      -1.445122   0.1490
ED              -0.017021      0.008524      -1.996684   0.0463
BLK              0.048167      0.071061       0.677822   0.4982

R-squared              0.232731    Mean dependent var       0.366387
Adjusted R-squared     0.220927    S.D. dependent var       0.482222
S.E. of regression     0.425634    Akaike info criterion   -1.691687
Sum squared resid      105.9811    Schwarz criterion       -1.617929
Log likelihood        -330.9916    F-statistic              19.71604
Durbin-Watson stat     1.907714    Prob(F-statistic)        0.000000

LOGIT ESTIMATION

LOGIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 4 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                4.492957      1.333992       3.368053   0.0008
EX              -0.010454      0.009649      -1.083430   0.2791
WKS             -0.107912      0.021380      -5.047345   0.0000
OCC              1.675169      0.263654       6.353652   0.0000
SOUTH           -1.058953      0.240224      -4.408193   0.0000
SMSA             0.449003      0.217955       2.060074   0.0398
MS               0.618511      0.365637       1.691599   0.0913
FEM             -0.795607      0.489820      -1.624285   0.1049
ED              -0.096695      0.048806      -1.981194   0.0480
BLK              0.339984      0.394027       0.862845   0.3886

Log likelihood    -312.7267
Obs with Dep=1     218
Obs with Dep=0     377

Variable      Mean All     Mean D=1     Mean D=0
C             1.000000     1.000000     1.000000
EX            22.85378     23.83028     22.28912
WKS           46.45210     45.27982     47.12997
OCC           0.512605     0.766055     0.366048
SOUTH         0.292437     0.197248     0.347480
SMSA          0.642017     0.646789     0.639257
MS            0.805042     0.866972     0.769231
FEM           0.112605     0.059633     0.143236
ED            12.84538     11.84862     13.42175
BLK           0.072269     0.082569     0.066313

PROBIT ESTIMATION

PROBIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 3 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                2.570491      0.759181       3.385875   0.0008
EX              -0.006590      0.005723      -1.151333   0.2501
WKS             -0.060795      0.011777      -5.162354   0.0000
OCC              0.967972      0.151305       6.397481   0.0000
SOUTH           -0.601050      0.138528      -4.338836   0.0000
SMSA             0.261381      0.128465       2.034640   0.0423
MS               0.357808      0.216057       1.656085   0.0982
FEM             -0.417974      0.276501      -1.511657   0.1312
ED              -0.060082      0.028625      -2.098957   0.0362
BLK              0.220695      0.228363       0.966423   0.3342

Log likelihood    -313.6647
Obs with Dep=1     218
Obs with Dep=0     377

f. The restricted regressions omitting IND, FEM and BLK are given below:

LS // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                1.153900      0.218771       5.274452   0.0000
EX              -0.001840      0.001717      -1.071655   0.2843
WKS             -0.017744      0.003412      -5.200421   0.0000
OCC              0.326411      0.046051       7.088110   0.0000
SOUTH           -0.171713      0.039295      -4.369868   0.0000
SMSA             0.086076      0.038013       2.264382   0.0239
MS               0.158303      0.045433       3.484351   0.0005
ED              -0.017204      0.008507      -2.022449   0.0436

R-squared              0.229543    Mean dependent var       0.366387
Adjusted R-squared     0.220355    S.D. dependent var       0.482222
S.E. of regression     0.425790    Akaike info criterion   -1.694263
Sum squared resid      106.4215    Schwarz criterion       -1.635257
Log likelihood        -332.2252    F-statistic              24.98361
Durbin-Watson stat     1.912059    Prob(F-statistic)        0.000000

LOGIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 4 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                4.152595      1.288390       3.223088   0.0013
EX              -0.011018      0.009641      -1.142863   0.2536
WKS             -0.107116      0.021215      -5.049031   0.0000
OCC              1.684082      0.262193       6.423059   0.0000
SOUTH           -1.043629      0.237769      -4.389255   0.0000
SMSA             0.459707      0.215149       2.136687   0.0330
MS               0.975711      0.272560       3.579800   0.0004
ED              -0.100033      0.048507      -2.062229   0.0396

Log likelihood    -314.2744
Obs with Dep=1     218
Obs with Dep=0     377

Variable      Mean All     Mean D=1     Mean D=0
C             1.000000     1.000000     1.000000
EX            22.85378     23.83028     22.28912
WKS           46.45210     45.27982     47.12997
OCC           0.512605     0.766055     0.366048
SOUTH         0.292437     0.197248     0.347480
SMSA          0.642017     0.646789     0.639257
MS            0.805042     0.866972     0.769231
ED            12.84538     11.84862     13.42175

PROBIT // Dependent Variable is UNION

Sample: 1 595

Included observations: 595

Convergence achieved after 3 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                2.411706      0.741327       3.253228   0.0012
EX              -0.006986      0.005715      -1.222444   0.2220
WKS             -0.060491      0.011788      -5.131568   0.0000
OCC              0.971984      0.150538       6.456745   0.0000
SOUTH           -0.580959      0.136344      -4.260988   0.0000
SMSA             0.273201      0.126988       2.151388   0.0319
MS               0.545824      0.155812       3.503105   0.0005
ED              -0.063196      0.028464      -2.220210   0.0268

Log likelihood    -315.1770
Obs with Dep=1     218
Obs with Dep=0     377

13.4 Occupation regression.

a. OLS Estimation

LS // Dependent Variable is OCC

Sample: 1 595

Included observations: 595

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                2.111943      0.182340       11.58245   0.0000
ED              -0.111499      0.006108      -18.25569   0.0000
WKS             -0.001510      0.003044      -0.496158   0.6200
EX              -0.002870      0.001533      -1.872517   0.0616
SOUTH           -0.068631      0.035332      -1.942452   0.0526
SMSA            -0.079735      0.034096      -2.338528   0.0197
IND              0.091688      0.033693       2.721240   0.0067
MS               0.006271      0.056801       0.110402   0.9121
FEM             -0.064045      0.070543      -0.907893   0.3643
BLK              0.068514      0.063283       1.082647   0.2794

R-squared              0.434196    Mean dependent var       0.512605
Adjusted R-squared     0.425491    S.D. dependent var       0.500262
S.E. of regression     0.379180    Akaike info criterion   -1.922824
Sum squared resid      84.10987    Schwarz criterion       -1.849067
Log likelihood        -262.2283    F-statistic              49.88075
Durbin-Watson stat     1.876105    Prob(F-statistic)        0.000000

LOGIT ESTIMATION

LOGIT // Dependent Variable is OCC

Sample: 1 595

Included observations: 595

Convergence achieved after 5 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                11.62962      1.581601       7.353069   0.0000
ED              -0.806320      0.070068      -11.50773   0.0000
WKS             -0.008424      0.023511      -0.358297   0.7203
EX              -0.017610      0.011161      -1.577893   0.1151
SOUTH           -0.349960      0.260761      -1.342073   0.1801
SMSA            -0.601945      0.247206      -2.434995   0.0152
IND              0.689620      0.241028       2.861157   0.0044

PROBIT ESTIMATION

PROBIT // Dependent Variable is OCC

Sample: 1 595

Included observations: 595

Convergence achieved after 4 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                6.416131      0.847427       7.571312   0.0000
ED              -0.446740      0.034458      -12.96473   0.0000
WKS             -0.003574      0.013258      -0.269581   0.7876
EX              -0.010891      0.006336      -1.718878   0.0862
SOUTH           -0.240756      0.147920      -1.627608   0.1041
SMSA            -0.327948      0.139849      -2.345016   0.0194
IND              0.371434      0.135825       2.734658   0.0064
MS              -0.097665      0.245069      -0.398522   0.6904
FEM             -0.358948      0.296971      -1.208697   0.2273
BLK              0.215257      0.252219       0.853453   0.3938

Log likelihood    -246.6581
Obs with Dep=1     305
Obs with Dep=0     290

13.5 Truncated Uniform Density.

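With $f(x) = \tfrac{1}{2}$ for $-1 < x < 1$,

$\Pr\left[x > -\tfrac{1}{2}\right] = \int_{-1/2}^{1} \tfrac{1}{2}\,dx = \tfrac{1}{2}\left[\tfrac{3}{2}\right] = \tfrac{3}{4}$.

So that

$f\left(x \mid x > -\tfrac{1}{2}\right) = \dfrac{f(x)}{\Pr\left[x > -\tfrac{1}{2}\right]} = \dfrac{1/2}{3/4} = \tfrac{2}{3}$ for $-\tfrac{1}{2} < x < 1$.

$E\left(x \mid x > -\tfrac{1}{2}\right) = \int_{-1/2}^{1} x \cdot \tfrac{2}{3}\,dx = \tfrac{2}{3} \cdot \tfrac{1}{2}\left[x^2\right]_{-1/2}^{1} = \tfrac{1}{3}\left(1 - \tfrac{1}{4}\right) = \tfrac{1}{4}$, whereas $E(x) = 0$.

$\mathrm{var}(x) = E(x^2) - (E(x))^2 = E(x^2) = \int_{-1}^{1} \tfrac{1}{2}x^2\,dx = \tfrac{1}{3}$

$E\left(x^2 \mid x > -\tfrac{1}{2}\right) = \int_{-1/2}^{1} x^2 \cdot \tfrac{2}{3}\,dx = \tfrac{2}{3} \cdot \tfrac{1}{3}\left[x^3\right]_{-1/2}^{1} = \tfrac{2}{9}\left(1 + \tfrac{1}{8}\right) = \tfrac{1}{4}$

so that $\mathrm{var}\left(x \mid x > -\tfrac{1}{2}\right) = \tfrac{1}{4} - \left(\tfrac{1}{4}\right)^2 = \tfrac{3}{16} < \tfrac{1}{3}$.

Therefore, as expected, truncation reduces the variance.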
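The moments in this problem can be verified with exact rational arithmetic. This is a minimal sketch, assuming the density in the problem is uniform on $(-1, 1)$ and using the fact that truncating a uniform density at $-1/2$ leaves a uniform density on $(-1/2, 1)$; the helper name `uniform_mean_var` is an illustrative assumption.

```python
from fractions import Fraction

def uniform_mean_var(a, b):
    """Exact mean and variance of a uniform density on (a, b)."""
    a, b = Fraction(a), Fraction(b)
    mean = (a + b) / 2
    second_moment = (b**3 - a**3) / (3 * (b - a))  # integral of x^2 / (b - a) over (a, b)
    return mean, second_moment - mean**2

full_mean, full_var = uniform_mean_var(-1, 1)                   # untruncated density
trunc_mean, trunc_var = uniform_mean_var(Fraction(-1, 2), 1)    # density given x > -1/2
print("untruncated mean, variance:", full_mean, full_var)
print("truncated mean, variance:  ", trunc_mean, trunc_var)
```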

13.6 Truncated Normal Density.

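a. From the Appendix, Eq. (A.1), using $c = 1$, $\mu = 1$, $\sigma^2 = 1$ and $\Phi(0) = \tfrac{1}{2}$, we get

$f(x \mid x > 1) = \dfrac{\phi(x - 1)}{1 - \Phi(0)} = 2\phi(x - 1)$ for $x > 1$.

Similarly, using Eq. (A.2), for $c = 1$, $\mu = 1$ and $\sigma^2 = 1$ with $\Phi(0) = \tfrac{1}{2}$ we get

$f(x \mid x < 1) = \dfrac{\phi(x - 1)}{\Phi(0)} = 2\phi(x - 1)$ for $x < 1$.

b. The conditional mean is given in (A.3) and for this example we get

$E(x \mid x > 1) = 1 + 1 \cdot \dfrac{\phi(c^*)}{1 - \Phi(c^*)} = 1 + \dfrac{\phi(0)}{1 - \Phi(0)} = 1 + 2\phi(0) = 1 + \dfrac{2}{\sqrt{2\pi}}$

with $c^* = \dfrac{c - \mu}{\sigma} = \dfrac{1 - 1}{1} = 0$. Similarly, using (A.4) we get,

$E(x \mid x < 1) = 1 - 1 \cdot \dfrac{\phi(c^*)}{\Phi(c^*)} = 1 - \dfrac{\phi(0)}{1/2} = 1 - 2\phi(0) = 1 - \dfrac{2}{\sqrt{2\pi}}$.

c. From (A.5) we get, $\mathrm{var}(x \mid x > 1) = 1(1 - \delta(c^*)) = 1 - \delta(0)$ where

$\delta(0) = \dfrac{\phi(0)}{1 - \Phi(0)}\left[\dfrac{\phi(0)}{1 - \Phi(0)} - 0\right] = 2\phi(0)[2\phi(0)] = 4\phi^2(0) = \dfrac{4}{2\pi} = \dfrac{2}{\pi} = 0.64$ for $x > 1$.

From (A.6), we get $\mathrm{var}(x \mid x < 1) = 1 - \delta(0)$ where

$\delta(0) = \dfrac{-\phi(0)}{\Phi(0)}\left[\dfrac{-\phi(0)}{\Phi(0)} - 0\right] = 4\phi^2(0) = \dfrac{2}{\pi} = 0.64$ for $x < 1$.

Both conditional truncated variances equal $1 - 0.64 = 0.36$, which is less than the unconditional $\mathrm{var}(x) = 1$.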
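The closed-form moments above can be cross-checked by brute-force numerical integration. This is a rough sketch using only the standard library, not the method of the Appendix: the helper names, the midpoint rule, and the choice to cut the integration range ten standard deviations from the mean are all assumptions made for illustration.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def truncated_moments(lo, hi, mu=1.0, sigma=1.0, steps=200_000):
    """Mean and variance of N(mu, sigma^2) restricted to (lo, hi), by the midpoint rule."""
    h = (hi - lo) / steps
    mass = m1 = m2 = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        w = phi((x - mu) / sigma) / sigma * h   # probability mass of this small cell
        mass += w
        m1 += x * w
        m2 += x * x * w
    mean = m1 / mass
    return mean, m2 / mass - mean * mean

# The tails beyond mu +/- 10 sigma are numerically negligible
upper_mean, upper_var = truncated_moments(1.0, 11.0)    # x > 1
lower_mean, lower_var = truncated_moments(-9.0, 1.0)    # x < 1
print(f"x > 1: mean {upper_mean:.4f} vs {1 + 2 * phi(0.0):.4f}, variance {upper_var:.4f} vs {1 - 2 / math.pi:.4f}")
print(f"x < 1: mean {lower_mean:.4f} vs {1 - 2 * phi(0.0):.4f}, variance {lower_var:.4f} vs {1 - 2 / math.pi:.4f}")
```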

13.7 Censored Normal Distribution.

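a. From the Appendix we get,

$E(y) = \Pr[y = c]\,E(y \mid y = c) + \Pr[y > c]\,E(y \mid y > c)$

$= c\,\Phi(c^*) + (1 - \Phi(c^*))\,E(y^* \mid y^* > c)$

$= c\,\Phi(c^*) + (1 - \Phi(c^*))\left[\mu + \sigma\,\dfrac{\phi(c^*)}{1 - \Phi(c^*)}\right]$

where $E(y^* \mid y^* > c)$ is obtained from the mean of a truncated normal density, see (A.3).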

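b. Using the result on conditional variance given in Chap. 2 we get, var(y) = E(conditional variance) + var(conditional mean). But

$E(\text{conditional variance}) = P[y = c]\,\mathrm{var}(y \mid y = c) + P[y > c]\,\mathrm{var}(y \mid y > c)$

$= \Phi(c^*) \cdot 0 + (1 - \Phi(c^*))\,\sigma^2 (1 - \delta(c^*))$ where $\mathrm{var}(y \mid y > c)$ is given by (A.5).

$\mathrm{var}(\text{conditional mean}) = P[y = c] \cdot (c - E(y))^2 + \Pr(y > c)[E(y \mid y > c) - E(y)]^2 = \Phi(c^*)(c - E(y))^2 + [1 - \Phi(c^*)][E(y \mid y > c) - E(y)]^2$

where $E(y)$ is given by (A.7) and $E(y \mid y > c)$ is given by (A.3). Substituting $c - E(y) = (1 - \Phi(c^*))[c - E(y \mid y > c)]$ and $E(y \mid y > c) - E(y) = \Phi(c^*)[E(y \mid y > c) - c]$, this gives

$\mathrm{var}(\text{conditional mean}) = \Phi(c^*)(1 - \Phi(c^*))[E(y \mid y > c) - c]^2 = \sigma^2\,\Phi(c^*)(1 - \Phi(c^*))\left[\dfrac{\phi(c^*)}{1 - \Phi(c^*)} - c^*\right]^2$

since $E(y \mid y > c) - c = \mu - c + \sigma\phi(c^*)/(1 - \Phi(c^*))$ and $\mu - c = -\sigma c^*$. Summing the two terms,

$\mathrm{var}(y) = \sigma^2 (1 - \Phi(c^*))\left\{1 - \delta(c^*) + \left[c^* - \dfrac{\phi(c^*)}{1 - \Phi(c^*)}\right]^2 \Phi(c^*)\right\}$

as required. Similarly, from part (b), using $c^* = -\mu/\sigma$ and $\Phi(-\mu/\sigma) =$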

13.8 Fixed vs. adjustable mortgage rates. This is based on Dhillon et al. (1987).

a. The OLS regression of Y on all variables in the data set is given below. This was done using EViews. The $R^2 = 0.434$ and the F-statistic for the significance of all slopes is equal to 3.169. This is distributed as F(15,62) under the null hypothesis. This has a p-value of 0.0007. Therefore, we reject $H_0$ and we conclude that this is a significant regression. As explained in Sect. 13.6, using BRMR this also rejects the insignificance of all slopes in the logit specification.

Unrestricted Least Squares

LS // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                1.272832      1.411806       0.901563   0.3708
BA               0.000398      0.007307       0.054431   0.9568
BS               0.017084      0.020365       0.838887   0.4048
NW              -0.036932      0.025320      -1.458609   0.1497
FI              -0.221726      0.092813      -2.388949   0.0200
PTS              0.178963      0.091050       1.965544   0.0538
MAT              0.214264      0.202497       1.058108   0.2941
MOB              0.020963      0.009194       2.279984   0.0261
MC               0.189973      0.150816       1.259635   0.2125
FTB             -0.013857      0.136127      -0.101797   0.9192
SE               0.188284      0.360196       0.522728   0.6030
YLD              0.656227      0.366117       1.792399   0.0779
MARG             0.129127      0.054840       2.354621   0.0217
CB               0.172202      0.137827       1.249403   0.2162
STL             -0.001599      0.005994      -0.266823   0.7905
LA              -0.001761      0.007801      -0.225725   0.8222

R-squared              0.433996    Mean dependent var       0.589744
Adjusted R-squared     0.297059    S.D. dependent var       0.495064
S.E. of regression     0.415069    Akaike info criterion   -1.577938
Sum squared resid      10.68152    Schwarz criterion       -1.094510
Log likelihood        -33.13764    F-statistic              3.169321
Durbin-Watson stat     0.905968    Prob(F-statistic)        0.000702

Plot of Y and YHAT

b. The URSS from part (a) is 10.6815 while the RRSS by including only the cost variables is 14.0180 as shown in the enclosed output from EViews. The Chow-F statistic for insignificance of 10 personal characteristics variables is

$F = \dfrac{(14.0180 - 10.6815)/10}{10.6815/62} = 1.937$

which is distributed as F(10,62) under the null hypothesis. This has a 5% critical value of 1.99. Hence, we cannot reject $H_0$. The principal agent theory suggests that personal characteristics are important in making this mortgage choice. Briefly, this theory suggests that information is asymmetric and the borrower knows things about himself or herself that the lending institution does not. Not rejecting $H_0$ does not provide support for the principal agent theory.
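The Chow-F computation in part (b) is simple enough to script. This is a minimal sketch using the residual sums of squares quoted above; the function name `chow_f` is an illustrative assumption, and the 5% critical value of 1.99 is taken from the text rather than computed.

```python
def chow_f(rrss, urss, num_restrictions, resid_df):
    """F statistic for joint exclusion restrictions from restricted and unrestricted RSS."""
    return ((rrss - urss) / num_restrictions) / (urss / resid_df)

# 78 observations and 16 estimated coefficients leave 62 degrees of freedom
f_stat = chow_f(rrss=14.0180, urss=10.6815, num_restrictions=10, resid_df=78 - 16)
critical_5pct = 1.99  # F(10, 62) critical value quoted in the text
print(f"F = {f_stat:.4f}; reject H0 at 5%: {f_stat > critical_5pct}")
```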

TESTING THE EFFICIENT MARKET HYPOTHESIS WITH THE LINEAR PROBABILITY MODEL

Restricted Least Squares

LS // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Variable      Coefficient    Std. Error    t-Statistic    Prob.
FI              -0.237228      0.078592      -3.018479   0.0035
MARG             0.127029      0.051496       2.466784   0.0160
YLD              0.889908      0.332037       2.680151   0.0091
PTS              0.054879      0.072165       0.760465   0.4495
MAT              0.069466      0.196727       0.353108   0.7250
C                1.856435      1.289797       1.439324   0.1544

R-squared              0.257199    Mean dependent var       0.589744
Adjusted R-squared     0.205616    S.D. dependent var       0.495064
S.E. of regression     0.441242    Akaike info criterion   -1.562522
Sum squared resid      14.01798    Schwarz criterion       -1.381236
Log likelihood        -43.73886    F-statistic              4.986087
Durbin-Watson stat     0.509361    Prob(F-statistic)        0.000562

c. The logit specification output using EViews is given below. The unrestricted log-likelihood is equal to -30.8963. The restricted specification output is also given showing a restricted log-likelihood of -41.4729. Therefore, the LR test statistic is given by LR = 2(41.4729 - 30.8963) = 21.1532 which is distributed as $\chi^2_{10}$ under the null hypothesis. This is significant given that the 5% critical value of $\chi^2_{10}$ is 18.31. This means that the logit specification does not reject the principal agent theory as personal characteristics are not jointly insignificant.
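The likelihood ratio statistic in part (c) follows directly from the two log-likelihoods. This is a minimal sketch using the values quoted above; the function name `lr_statistic` is an illustrative assumption, and the critical value 18.31 is taken from the text rather than computed.

```python
def lr_statistic(loglik_unrestricted, loglik_restricted):
    """Likelihood ratio statistic: twice the gap between the two maximized log-likelihoods."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

lr_logit = lr_statistic(loglik_unrestricted=-30.8963, loglik_restricted=-41.4729)
critical_5pct = 18.31  # chi-squared(10) critical value quoted in the text
print(f"LR = {lr_logit:.4f}; reject H0 at 5%: {lr_logit > critical_5pct}")
```

The same function applies to any pair of nested binary response models estimated by maximum likelihood, with degrees of freedom equal to the number of excluded variables.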

TESTING THE EFFICIENT MARKET HYPOTHESIS WITH THE LOGIT MODEL

Unrestricted Logit Model

LOGIT // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Convergence achieved after 5 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                4.238872      10.47875       0.404521   0.6872
BA               0.010478      0.075692       0.138425   0.8904
BS               0.198251      0.172444       1.149658   0.2547
NW              -0.244064      0.185027      -1.319072   0.1920
FI              -1.717497      0.727707      -2.360149   0.0214
PTS              1.499799      0.719917       2.083294   0.0414
MAT              2.057067      1.631100       1.261153   0.2120
MOB              0.153078      0.097000       1.578129   0.1196
MC               1.922943      1.182932       1.625575   0.1091
FTB             -0.110924      0.983688      -0.112763   0.9106
SE               2.208505      2.800907       0.788496   0.4334
YLD              4.626702      2.919634       1.584686   0.1181
MARG             1.189518      0.485433       2.450426   0.0171
CB               1.759744      1.242104       1.416744   0.1616
STL             -0.031563      0.051720      -0.610265   0.5439
LA              -0.022067      0.061013      -0.361675   0.7188

Log likelihood    -30.89597
Obs with Dep=1     46
Obs with Dep=0     32

Variable      Mean All     Mean D=1     Mean D=0
C             1.000000     1.000000     1.000000
BA            36.03846     35.52174     36.78125
BS            16.44872     15.58696     17.68750
NW            3.504013     2.075261     5.557844
FI            13.24936     13.02348     13.57406
PTS           1.497949     1.505217     1.487500
MAT           1.058333     1.027609     1.102500
MOB           4.205128     4.913043     3.187500
MC            0.602564     0.695652     0.468750
FTB           0.615385     0.521739     0.750000
SE            0.102564     0.043478     0.187500
YLD           1.606410     1.633261     1.567813
MARG          2.291923     2.526304     1.955000
CB            0.358974     0.478261     0.187500
STL           13.42218     11.72304     15.86469
LA            5.682692     4.792174     6.962812

Restricted Logit Model

LOGIT // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Convergence achieved after 4 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
FI              -1.264608      0.454050      -2.785172   0.0068
MARG             0.717847      0.313845       2.287265   0.0251
YLD              4.827537      1.958833       2.464497   0.0161
PTS              0.359033      0.423378       0.848019   0.3992
MAT              0.550320      1.036613       0.530883   0.5971
C                6.731755      7.059485       0.953576   0.3435

Log likelihood    -41.47292
Obs with Dep=1     46
Obs with Dep=0     32

Variable      Mean All     Mean D=1     Mean D=0
FI            13.24936     13.02348     13.57406
MARG          2.291923     2.526304     1.955000
YLD           1.606410     1.633261     1.567813
PTS           1.497949     1.505217     1.487500
MAT           1.058333     1.027609     1.102500
C             1.000000     1.000000     1.000000

d. Similarly, the probit specification output using EViews is given below. The unrestricted log-likelihood is equal to -30.7294. The restricted log-likelihood is -41.7649. Therefore, the LR test statistic is given by LR = 2(41.7649 - 30.7294) = 22.0710 which is distributed as $\chi^2_{10}$ under the null hypothesis. This is significant given that the 5% critical value of $\chi^2_{10}$ is 18.31. This means that the probit specification does not reject the principal agent theory as personal characteristics are not jointly insignificant.

TESTING THE EFFICIENT MARKET HYPOTHESIS WITH THE PROBIT MODEL

Unrestricted Probit Model

PROBIT // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Convergence achieved after 5 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C                3.107820      5.954673       0.521913   0.6036
BA               0.003978      0.044546       0.089293   0.9291
BS               0.108267      0.099172       1.091704   0.2792
NW              -0.128775      0.103438      -1.244943   0.2178
FI              -1.008080      0.418160      -2.410750   0.0189
PTS              0.830273      0.379895       2.185533   0.0326
MAT              1.164384      0.924018       1.260131   0.2123
MOB              0.093034      0.056047       1.659924   0.1020
MC               1.058577      0.653234       1.620518   0.1102
FTB             -0.143447      0.550471      -0.260589   0.7953
SE               1.127523      1.565488       0.720237   0.4741
YLD              2.525122      1.590796       1.587332   0.1175
MARG             0.705238      0.276340       2.552069   0.0132
CB               1.066589      0.721403       1.478493   0.1443
STL             -0.016130      0.029303      -0.550446   0.5840
LA              -0.014615      0.035920      -0.406871   0.6855

Log likelihood    -30.72937
Obs with Dep=1     46
Obs with Dep=0     32

Restricted Probit Model

PROBIT // Dependent Variable is Y

Sample: 1 78

Included observations: 78

Convergence achieved after 3 iterations

Variable      Coefficient    Std. Error    t-Statistic    Prob.
FI              -0.693584      0.244631      -2.835225   0.0059
MARG             0.419997      0.175012       2.399811   0.0190
YLD              2.730187      1.099487       2.483146   0.0154
PTS              0.235534      0.247390       0.952076   0.3442
MAT              0.221568      0.610572       0.362886   0.7178
C                3.536657      4.030251       0.877528   0.3831

Log likelihood    -41.76443
Obs with Dep=1     46
Obs with Dep=0     32

13.13 Problem Drinking and Employment. The following Stata output replicates the OLS results given in Table 5 of Mullahy and Sindelar (1996, p. 428) for males. The first regression is for employment, given in column 1 of Table 5 of the paper, and the second regression is for unemployment, given in column 3 of Table 5 of the paper. Robust standard errors are reported.

. reg emp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Regression with robust standard errors          Number of obs =    9822
                                                F(20, 9801)   =   46.15
                                                Prob > F      =  0.0000
                                                R-squared     =  0.1563
                                                Root MSE      =  .27807

         |               Robust
     emp |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
hvdrnk90 |  -.0155071   .0101891    -1.52   0.128   -.0354798    .0044657
    ue88 |  -.0090938   .0022494    -4.04   0.000    -.013503   -.0046846
     age |   .0162668   .0029248     5.56   0.000    .0105336    .0220001
   agesq |  -.0002164   .0000362    -5.98   0.000   -.0002873   -.0001455
    educ |   .0078258   .0011271     6.94   0.000    .0056166    .0100351
 married |   .0505682   .0098396     5.14   0.000    .0312805    .0698558
 famsize |   .0020612   .0021796     0.95   0.344   -.0022113    .0063336
   white |   .0773332   .0104289     7.42   0.000    .0568905     .097776
 hlstat1 |   .5751898   .0306635    18.76   0.000    .5150831    .6352965
 hlstat2 |      .5728   .0306427    18.69   0.000     .512734     .632866
 hlstat3 |    .537617   .0308845    17.41   0.000    .4770769     .598157
 hlstat4 |   .3947391   .0354291    11.14   0.000    .3252908    .4641874
 region1 |  -.0013608   .0094193    -0.14   0.885   -.0198247     .017103
 region2 |   .0050446   .0084215     0.60   0.549   -.0114633    .0215526
 region3 |   .0254332   .0081999     3.10   0.002    .0093596    .0415067
    msa1 |  -.0159492   .0083578    -1.91   0.056   -.0323322    .0004337
    msa2 |   .0073081   .0072395     1.01   0.313   -.0068827    .0214989
      q1 |  -.0155891   .0079415    -1.96   0.050   -.0311561    -.000022
      q2 |  -.0068915   .0077786    -0.89   0.376   -.0221392    .0083561
      q3 |  -.0035867   .0078474    -0.46   0.648   -.0189692    .0117957
   _cons |  -.0957667   .0623045    -1.54   0.124   -.2178964    .0263631

. reg unemp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Regression with robust standard errors          Number of obs =    9822
                                                F(20, 9801)   =    3.37
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0099
                                                Root MSE      =  .17577

         |               Robust
   unemp |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
hvdrnk90 |   .0100022   .0066807     1.50   0.134   -.0030934    .0230977
    ue88 |   .0045029   .0014666     3.07   0.002    .0016281    .0073776
     age |  -.0014753   .0017288    -0.85   0.393   -.0048641    .0019134
   agesq |   .0000123   .0000206     0.60   0.551   -.0000281    .0000527
    educ |  -.0028141   .0006307    -4.46   0.000   -.0040504   -.0015777
 married |  -.0092854   .0060161    -1.54   0.123   -.0210782    .0025073
 famsize |   .0003859   .0013719     0.28   0.778   -.0023033    .0030751
   white |  -.0246801   .0063618    -3.88   0.000   -.0371506   -.0122096
 hlstat1 |   .0150194   .0113968     1.32   0.188   -.0073206    .0373594
 hlstat2 |   .0178594   .0114626     1.56   0.119   -.0046097    .0403285
 hlstat3 |   .0225153   .0116518     1.93   0.053   -.0003245    .0453552
 hlstat4 |   .0178865   .0136228     1.31   0.189   -.0088171    .0445901
 region1 |   .0007911    .005861     0.13   0.893   -.0106977      .01228
 region2 |  -.0029056   .0053543    -0.54   0.587   -.0134011    .0075898
 region3 |  -.0065005    .005095    -1.28   0.202   -.0164877    .0034868
    msa1 |  -.0008801   .0052004    -0.17   0.866    -.011074    .0093139
    msa2 |  -.0055184   .0047189    -1.17   0.242   -.0147685    .0037317
      q1 |   .0145704   .0051986     2.80   0.005      .00438    .0247607
      q2 |   .0022831   .0047579     0.48   0.631   -.0070434    .0116096
      q3 |    .000043   .0047504     0.01   0.993   -.0092687    .0093547
   _cons |   .0927746   .0364578     2.54   0.011    .0213098    .1642394

The following Stata output replicates the OLS results given in Table 6 of Mullahy and Sindelar (1996, p. 429) for females. The first regression is for employment, given in column 1 of Table 6 of the paper, and the second regression is for unemployment, given in column 3 of Table 6 of the paper. Robust standard errors are reported.

. reg emp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Regression with robust standard errors          Number of obs =   12534
                                                F(20, 12513)  =  117.99
                                                Prob > F      =  0.0000
                                                R-squared     =  0.1358
                                                Root MSE      =  .42932

         |               Robust
     emp |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
hvdrnk90 |   .0059878   .0120102     0.50   0.618    -.017554    .0295296
    ue88 |  -.0168969    .002911    -5.80   0.000   -.0226028    -.011191
     age |     .04635   .0036794    12.60   0.000    .0391378    .0535622
   agesq |  -.0005898   .0000449   -13.13   0.000   -.0006778   -.0005018
    educ |   .0227162   .0015509    14.65   0.000    .0196762    .0257563
 married |   .0105416   .0111463     0.95   0.344   -.0113068    .0323901
 famsize |  -.0662794   .0030445   -21.77   0.000    -.072247   -.0603118
   white |  -.0077594   .0104111    -0.75   0.456   -.0281668     .012648
 hlstat1 |   .4601695   .0253797    18.13   0.000    .4104214    .5099177
 hlstat2 |   .4583823   .0252973    18.12   0.000    .4087957    .5079689
 hlstat3 |   .4096243   .0251983    16.26   0.000    .3602317    .4590169
 hlstat4 |   .2494427    .027846     8.96   0.000    .1948602    .3040251
 region1 |  -.0180596   .0129489    -1.39   0.163   -.0434415    .0073223
 region2 |   .0095951   .0114397     0.84   0.402   -.0128285    .0320186
 region3 |   .0465464   .0108841     4.28   0.000    .0252119     .067881
    msa1 |  -.0256183   .0109856    -2.33   0.020   -.0471518   -.0040848
    msa2 |   .0051885   .0103385     0.50   0.616   -.0150765    .0254534
      q1 |  -.0058134   .0107234    -0.54   0.588   -.0268329    .0152061
      q2 |  -.0061301   .0109033    -0.56   0.574   -.0275022    .0152421
      q3 |  -.0168673   .0109023    -1.55   0.122   -.0382376    .0045029
   _cons |  -.5882924   .0782545    -7.52   0.000   -.7416831   -.4349017

. reg unemp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Regression with robust standard errors          Number of obs =   12534
                                                F(20, 12513)  =    5.99
                                                Prob > F      =  0.0000
                                                R-squared     =  0.0141
                                                Root MSE      =  .18409

         |               Robust
   unemp |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
hvdrnk90 |   .0149286   .0059782     2.50   0.013    .0032104    .0266468
    ue88 |   .0038119   .0013782     2.77   0.006    .0011105    .0065133
     age |  -.0013974   .0015439    -0.91   0.365   -.0044237    .0016289
   agesq |   4.43e-06   .0000181     0.24   0.807   -.0000311      .00004
    educ |  -.0011631   .0006751    -1.72   0.085   -.0024865    .0001602
 married |  -.0066296   .0058847    -1.13   0.260   -.0181645    .0049053
 famsize |   .0013304   .0013075     1.02   0.309   -.0012325    .0038933
   white |  -.0308826   .0051866    -5.95   0.000   -.0410493    -.020716
 hlstat1 |    .008861   .0092209     0.96   0.337   -.0092135    .0269354
 hlstat2 |   .0079536   .0091305     0.87   0.384   -.0099435    .0258507
 hlstat3 |   .0224927   .0093356     2.41   0.016    .0041934    .0407919
 hlstat4 |   .0193116   .0106953     1.81   0.071   -.0016528     .040276
 region1 |   .0020325   .0055618     0.37   0.715   -.0088694    .0129344
 region2 |  -.0005405   .0049211    -0.11   0.913   -.0101866    .0091057
 region3 |  -.0079708   .0046818    -1.70   0.089   -.0171479    .0012063
    msa1 |   -.002055   .0049721    -0.41   0.679   -.0118011     .007691
    msa2 |  -.0130041   .0041938    -3.10   0.002   -.0212246   -.0047835
      q1 |   .0025441   .0043698     0.58   0.560   -.0060214    .0111095
      q2 |   .0080984   .0046198     1.75   0.080   -.0009571    .0171539
      q3 |   .0102601   .0046839     2.19   0.029     .001079    .0194413
   _cons |   .0922081   .0350856     2.63   0.009     .023435    .1609813

The corresponding probit equation for employment for males is given by the following Stata output (this replicates Table 13.6 in the text):

. probit emp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Probit regression                               Number of obs =    9822
                                                Wald chi2(20) =  928.34
                                                Prob > chi2   =  0.0000
Log pseudolikelihood = -2698.1797               Pseudo R2     =  0.1651

         |               Robust
     emp |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
---------+-----------------------------------------------------------------
hvdrnk90 |  -.1049465   .0589878    -1.78   0.075   -.2205606    .0106675
    ue88 |  -.0532774   .0142024    -3.75   0.000   -.0811135   -.0254413
     age |   .0996338   .0171184     5.82   0.000    .0660824    .1331853
   agesq |  -.0013043   .0002051    -6.36   0.000   -.0017062   -.0009023
    educ |   .0471834   .0066738     7.07   0.000     .034103    .0602638
 married |   .2952921   .0540855     5.46   0.000    .1892866    .4012976
 famsize |   .0188906   .0140462     1.34   0.179   -.0086395    .0464206
   white |   .3945226   .0483378     8.16   0.000    .2997822     .489263
 hlstat1 |   1.816306   .0983443    18.47   0.000    1.623554    2.009057
 hlstat2 |   1.778434   .0991528    17.94   0.000    1.584098     1.97277
 hlstat3 |   1.547836   .0982635    15.75   0.000    1.355244    1.740429
 hlstat4 |   1.043363   .1077276     9.69   0.000    .8322209    1.254505
 region1 |   .0343123   .0620016     0.55   0.580   -.0872085    .1558331
 region2 |   .0604907   .0537881     1.12   0.261    -.044932    .1659135
 region3 |   .1821206   .0542342     3.36   0.001    .0758236    .2884176
    msa1 |  -.0730529   .0518715    -1.41   0.159   -.1747192    .0286134
    msa2 |   .0759533   .0513087     1.48   0.139     -.02461    .1765166
      q1 |  -.1054844   .0527723    -2.00   0.046   -.2089162   -.0020525
      q2 |  -.0513229    .052818    -0.97   0.331   -.1548444    .0521985
      q3 |  -.0293419   .0543746    -0.54   0.589   -.1359142    .0772303
   _cons |  -3.017454   .3592294    -8.40   0.000    -3.72153   -2.313377

We can see how the probit model fits by looking at its predictions.

. estat classification

Probit model for emp

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |      8743           826  |       9569
     -     |        79           174  |        253
-----------+--------------------------+-----------
   Total   |      8822          1000  |       9822

Classified + if predicted Pr(D) >= .5
True D defined as emp != 0

Sensitivity                     Pr( +| D)   99.10%
Specificity                     Pr( -|~D)   17.40%
Positive predictive value       Pr( D| +)   91.37%
Negative predictive value       Pr(~D| -)   68.77%

False + rate for true ~D        Pr( +|~D)   82.60%
False - rate for true D         Pr( -| D)    0.90%
False + rate for classified +   Pr(~D| +)    8.63%
False - rate for classified -   Pr( D| -)   31.23%

Correctly classified                        90.79%
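The summary rates in the classification output follow directly from the four cell counts of the table. As a quick check, the minimal Python sketch below recomputes them from the probit counts reported above; the function name `classification_rates` is illustrative and not part of Stata.

```python
def classification_rates(tp, fp, fn, tn):
    """Return the rates reported by a classification table, in percent."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),   # Pr(+ | D)
        "specificity": 100 * tn / (tn + fp),   # Pr(- | ~D)
        "ppv": 100 * tp / (tp + fp),           # Pr(D | +)
        "npv": 100 * tn / (tn + fn),           # Pr(~D | -)
        "correct": 100 * (tp + tn) / total,    # correctly classified
    }

# Cell counts from the probit classification table for emp
probit_rates = classification_rates(tp=8743, fp=826, fn=79, tn=174)
for name, value in probit_rates.items():
    print(f"{name:12s} {value:6.2f}%")
```

The low specificity shows that the 0.5 cutoff almost never predicts non-employment, so the high overall hit rate is driven by the employed majority.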

Alternatively, we could have run a logit regression on employment for males:

. logit emp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Logistic regression                             Number of obs   =       9822
                                                Wald chi2(20)   =     900.15
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -2700.0567               Pseudo R2       =     0.1646

             |               Robust
         emp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    hvdrnk90 |  -.1960754   .1114946    -1.76   0.079    -.4146008      .02245
        ue88 |  -.1131074   .0273316    -4.14   0.000    -.1666764   -.0595384
         age |   .1884486   .0332284     5.67   0.000      .123322    .2535751
       agesq |  -.0024584   .0003965    -6.20   0.000    -.0032356   -.0016813
        educ |   .0913569   .0127978     7.14   0.000     .0662738    .1164401
     married |   .5534291   .1057963     5.23   0.000     .3460721     .760786
     famsize |   .0365059   .0276468     1.32   0.187    -.0176808    .0906927
       white |   .7224036   .0912559     7.92   0.000     .5435454    .9012619
     hlstat1 |   3.145481   .1721925    18.27   0.000      2.80799    3.482972
     hlstat2 |   3.067279   .1741295    17.61   0.000     2.725992    3.408567
     hlstat3 |   2.613691   .1707421    15.31   0.000     2.279042    2.948339
     hlstat4 |   1.725571   .1844904     9.35   0.000     1.363976    2.087166
     region1 |   .0493715   .1220065     0.40   0.686    -.1897568    .2884999
     region2 |   .1146108    .105504     1.09   0.277    -.0921733    .3213948
     region3 |   .3738274   .1066491     3.51   0.000     .1647991    .5828558
        msa1 |  -.1690904   .1016459    -1.66   0.096    -.3683127    .0301319
        msa2 |   .1345974   .1021625     1.32   0.188    -.0656374    .3348323
          q1 |  -.1954528   .1034703    -1.89   0.059    -.3982508    .0073453
          q2 |  -.1052494   .1033014    -1.02   0.308    -.3077163    .0972176
          q3 |  -.0418287   .1074896    -0.39   0.697    -.2525045     .168847
       _cons |  -5.538271   .6935383    -7.99   0.000    -6.897581   -4.178961

And the corresponding predictions for the logit model are given by

. estat classification

Logistic model for emp

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |      8740           822  |       9562
     -     |        82           178  |        260
-----------+--------------------------+-----------
   Total   |      8822          1000  |       9822

Classified + if predicted Pr(D) >= .5
True D defined as emp != 0

Sensitivity                     Pr( +| D)   99.07%
Specificity                     Pr( -|~D)   17.80%
Positive predictive value       Pr( D| +)   91.40%
Negative predictive value       Pr(~D| -)   68.46%

False + rate for true ~D        Pr( +|~D)   82.20%
False - rate for true D         Pr( -| D)    0.93%
False + rate for classified +   Pr(~D| +)    8.60%
False - rate for classified -   Pr( D| -)   31.54%

Correctly classified                        90.80%
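Because 8822 of the 9822 men are employed, the overall hit rate is best judged against a naive rule that classifies everyone as employed. The short Python sketch below makes that comparison using only the counts reported above; the variable names are illustrative.

```python
# Counts from the logit classification table for emp
n_total = 9822
n_employed = 8822            # true D
logit_correct = 8740 + 178   # correctly classified + and - cases

naive_accuracy = 100 * n_employed / n_total      # always predict emp = 1
logit_accuracy = 100 * logit_correct / n_total   # rule: predicted Pr(D) >= .5
gain = logit_accuracy - naive_accuracy

print(f"naive rule : {naive_accuracy:.2f}%")
print(f"logit model: {logit_accuracy:.2f}%")
print(f"gain       : {gain:.2f} percentage points")
```

The gain is under one percentage point, which is consistent with the low specificity reported in the table.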

The marginal effects for the probit model can be obtained as follows:

. dprobit emp hvdrnk90 ue88 age agesq educ married famsize white hlstat1 hlstat2 hlstat3 hlstat4 region1 region2 region3 msa1 msa2 q1 q2 q3, robust

Iteration 0: log pseudolikelihood =-3231.8973

Iteration 1: log pseudolikelihood =-2707.0435

Iteration 2: log pseudolikelihood =-2698.2015

Iteration 3: log pseudolikelihood =-2698.1797

Probit regression, reporting marginal effects   Number of obs   =       9822
                                                Wald chi2(20)   =     928.34
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -2698.1797               Pseudo R2       =     0.1651

             |               Robust
         emp |      dF/dx   Std. Err.      z    P>|z|     x-bar  [95% Conf. Interval]
-------------+-----------------------------------------------------------------------
   hvdrnk90* |  -.0161704   .0096242    -1.78   0.075   .099165   -.035034    .002693
        ue88 |  -.0077362   .0020463    -3.75   0.000   5.56921   -.011747   -.003725
         age |   .0144674   .0024796     5.82   0.000   39.1757    .009607    .019327
       agesq |  -.0001894   .0000297    -6.36   0.000   1627.61   -.000248   -.000131
        educ |   .0068513   .0009621     7.07   0.000   13.3096    .004966    .008737
    married* |   .0488911    .010088     5.46   0.000   .816432    .029119    .068663
     famsize |    .002743    .002039     1.34   0.179    2.7415   -.001253    .006739
      white* |    .069445   .0100697     8.16   0.000   .853085    .049709    .089181
    hlstat1* |   .2460794   .0148411    18.47   0.000   .415903    .216991    .275167
    hlstat2* |   .1842432   .0099207    17.94   0.000   .301873    .164799    .203687
    hlstat3* |    .130786   .0066051    15.75   0.000   .205254     .11784    .143732
    hlstat4* |   .0779836   .0041542     9.69   0.000   .053451    .069841    .086126
    region1* |   .0049107   .0087468     0.55   0.580   .203014   -.012233    .022054
    region2* |   .0086088   .0075003     1.12   0.261   .265628   -.006092    .023309
    region3* |   .0252543   .0071469     3.36   0.001   .318265    .011247    .039262
       msa1* |  -.0107946   .0077889    -1.41   0.159   .333232   -.026061    .004471
       msa2* |   .0109542   .0073524     1.48   0.139   .434942   -.003456    .025365
         q1* |  -.0158927   .0082451    -2.00   0.046   .254632   -.032053    .000267
         q2* |  -.0075883   .0079484    -0.97   0.331   .252698   -.023167     .00799
         q3* |  -.0043066   .0080689    -0.54   0.589   .242822   -.020121    .011508
-------------+-----------------------------------------------------------------------
      obs. P |   .8981877
     pred. P |   .9224487  (at x-bar)
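For a continuous regressor, the dF/dx reported by dprobit is the probit coefficient scaled by the standard normal density evaluated at the sample means. The Python sketch below checks this for three continuous regressors, under the assumption that the index at the means can be recovered from the reported pred. P as the inverse normal CDF of that value; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, sd 1

pred_p = 0.9224487                           # "pred. P (at x-bar)" from dprobit
index_at_mean = std_normal.inv_cdf(pred_p)   # x-bar'b implied by pred. P
scale = std_normal.pdf(index_at_mean)        # phi(x-bar'b)

# Probit coefficients of three continuous regressors (from the probit output)
probit_coef = {"ue88": -0.0532774, "age": 0.0996338, "educ": 0.0471834}

# Marginal effect at the means: phi(x-bar'b) * b
marginal_effects = {name: scale * b for name, b in probit_coef.items()}
for name, effect in marginal_effects.items():
    print(f"{name:5s} dF/dx = {effect:.7f}")
```

The same scaling does not apply to the dummy variables, for which dprobit reports a discrete change from 0 to 1 rather than a derivative.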

13.15 Fertility and Female Labor Supply

a. Carrasco (2001, p. 391) Table 4, column 1, ran a fertility probit equation, which we replicate below using Stata:

probit f dsex ags26l educ_2 educ_3 age drace inc

Probit regression                               Number of obs   =       5768
                                                LR chi2(7)      =     964.31
                                                Prob > chi2     =     0.0000
Log likelihood = -1561.1312                     Pseudo R2       =     0.2360

           f |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        dsex |   .3250503   .0602214     5.40   0.000     .2070184    .4430822
      ags26l |  -2.135365   .1614783   -13.22   0.000    -2.451857   -1.818873
      educ_2 |   .0278467   .1145118     0.24   0.808    -.1965922    .2522856
      educ_3 |   .3071582   .1255317     2.45   0.014     .0611207    .5531958
         age |  -.0808522   .0048563   -16.65   0.000    -.0903703    -.071334
       drace |  -.0916409   .0629859    -1.45   0.146     -.215091    .0318093
         inc |    .003161   .0029803     1.06   0.289    -.0026803    .0090022
       _cons |   1.526893   .1856654     8.22   0.000     1.162996    1.890791
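The pseudo R2 in the header is McFadden's measure, one minus the ratio of the fitted to the constant-only log likelihood. The constant-only value is not printed, but it can be backed out from the LR statistic, which equals twice the difference of the two log likelihoods. A short Python sketch of that arithmetic, using the values reported above:

```python
log_lik = -1561.1312   # log likelihood of the fitted probit
lr_chi2 = 964.31       # LR chi2(7) against the constant-only model

# LR = 2 * (lnL - lnL0)  =>  lnL0 = lnL - LR / 2
log_lik_null = log_lik - lr_chi2 / 2

# McFadden's pseudo R2 = 1 - lnL / lnL0
pseudo_r2 = 1 - log_lik / log_lik_null

print(f"constant-only log likelihood: {log_lik_null:.4f}")
print(f"pseudo R2: {pseudo_r2:.4f}")
```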

For part (b) the predicted probabilities are obtained as follows:

. lstat

Probit model for f

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         2             3  |          5
     -     |       654          5109  |       5763
-----------+--------------------------+-----------
   Total   |       656          5112  |       5768

Classified + if predicted Pr(D) >= .5
True D defined as f != 0

Sensitivity                     Pr( +| D)    0.30%
Specificity                     Pr( -|~D)   99.94%
Positive predictive value       Pr( D| +)   40.00%
Negative predictive value       Pr(~D| -)   88.65%

False + rate for true ~D        Pr( +|~D)    0.06%
False - rate for true D         Pr( -| D)   99.70%
False + rate for classified +   Pr(~D| +)   60.00%
False - rate for classified -   Pr( D| -)   11.35%

Correctly classified                        88.61%

The estimates reveal that having children of the same sex has a significant and positive effect on the probability of having an additional child. The marginal effects are given by dprobit in Stata:

. dprobit f dsex ags26l educ_2 educ_3 age drace inc

Probit regression, reporting marginal effects   Number of obs   =       5768
                                                LR chi2(7)      =     964.31
                                                Prob > chi2     =     0.0000
Log likelihood = -1561.1312                     Pseudo R2       =     0.2360

           f |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
-------------+-----------------------------------------------------------------------
       dsex* |   .0302835   .0069532     5.40   0.000   .256415    .016655    .043912
     ags26l* |  -.1618148   .0066629   -13.22   0.000   .377601   -.174874   -.148756
     educ_2* |   .0022157   .0090239     0.24   0.808   .717753   -.015471    .019902
     educ_3* |   .0288636   .0140083     2.45   0.014   .223994    .001408    .056319
         age |  -.0065031   .0007644   -16.65   0.000   32.8024   -.008001   -.005005
      drace* |  -.0077119   .0055649    -1.45   0.146   .773232   -.018619    .003195
         inc |   .0002542    .000241     1.06   0.289   12.8582   -.000218    .000727
-------------+-----------------------------------------------------------------------
      obs. P |   .1137309
     pred. P |   .0367557  (at x-bar)

(*) dF/dx is for discrete change of dummy variable from 0 to 1
    z and P>|z| correspond to the test of the underlying coefficient being 0

If we replace the same-sex variable by its two components, the same-sex-male and same-sex-female dummies, the results do not change. This indicates that it does not matter whether the children are both boys or both girls; see Carrasco (2001, p. 391) Table 4, column 2.

. probit f dsexm dsexf ags26l educ_2 educ_3 age drace inc

Probit regression                               Number of obs   =       5768
                                                LR chi2(8)      =     964.32
                                                Prob > chi2     =     0.0000
Log likelihood = -1561.1284                     Pseudo R2       =     0.2360

           f |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       dsexm |    .328542   .0764336     4.30   0.000     .1787349    .4783491
       dsexf |   .3209239   .0820417     3.91   0.000     .1601252    .4817226
      ags26l |  -2.135421   .1614518   -13.23   0.000    -2.451861   -1.818981
      educ_2 |    .027657   .1145384     0.24   0.809    -.1968342    .2521482
      educ_3 |   .3068706   .1255904     2.44   0.015     .0607179    .5530233
         age |  -.0808669   .0048605   -16.64   0.000    -.0903934   -.0713404
       drace |  -.0918074   .0630233    -1.46   0.145    -.2153308     .031716
         inc |   .0031709   .0029829     1.06   0.288    -.0026754    .0090173
       _cons |   1.527551   .1858818     8.22   0.000     1.163229    1.891872

Probit model for f

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |         2             3  |          5
     -     |       654          5109  |       5763
-----------+--------------------------+-----------
   Total   |       656          5112  |       5768

Classified + if predicted Pr(D) >= .5
True D defined as f != 0

Sensitivity                     Pr( +| D)    0.30%
Specificity                     Pr( -|~D)   99.94%
Positive predictive value       Pr( D| +)   40.00%
Negative predictive value       Pr(~D| -)   88.65%

False + rate for true ~D        Pr( +|~D)    0.06%
False - rate for true D         Pr( -| D)   99.70%
False + rate for classified +   Pr(~D| +)   60.00%
False - rate for classified -   Pr( D| -)   11.35%

Correctly classified                        88.61%

. dprobit f dsexm dsexf ags26l educ_2 educ_3 age drace inc

Probit regression, reporting marginal effects   Number of obs   =       5768
                                                LR chi2(8)      =     964.32
                                                Prob > chi2     =     0.0000
Log likelihood = -1561.1284                     Pseudo R2       =     0.2360

           f |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
-------------+-----------------------------------------------------------------------
      dsexm* |   .0325965   .0095475     4.30   0.000   .145111    .013884    .051309
      dsexf* |    .032261   .0103983     3.91   0.000   .111304    .011881    .052641
     ags26l* |    -.16182   .0066634   -13.23   0.000   .377601    -.17488    -.14876
     educ_2* |   .0022008   .0090273     0.24   0.809   .717753   -.015492    .019894
     educ_3* |   .0288323     .01401     2.44   0.015   .223994    .001373    .056291
         age |  -.0065042   .0007645   -16.64   0.000   32.8024   -.008003   -.005006
      drace* |  -.0077266   .0055692    -1.46   0.145   .773232   -.018642    .003189
         inc |    .000255   .0002412     1.06   0.288   12.8582   -.000218    .000728
-------------+-----------------------------------------------------------------------
      obs. P |   .1137309
     pred. P |   .0367556  (at x-bar)

c. Carrasco (2001, p. 392) Table 5, column 4, ran a female labor force participation OLS equation, which we replicate below using Stata 10:

. reg dhw f ags26l fxag26l educ_2 educ_3 age drace inc dhwl

                                                Number of obs   =       5768
                                                F(9, 5758)      =     445.42
                                                Prob > F        =     0.0000
                                                R-squared       =     0.4104
                                                Adj R-squared   =     0.4095
                                                Root MSE        =     .32361

         dhw |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |  -.0888995   .0144912    -6.13   0.000    -.1173077   -.0604912
      ags26l |  -.0194454   .0093334    -2.08   0.037    -.0377424   -.0011484
     fxag26l |  -.0581458   .1629414    -0.36   0.721    -.3775723    .2612806
      educ_2 |   .0491989   .0186018     2.64   0.008     .0127323    .0856655
      educ_3 |   .0725501   .0207404     3.50   0.000     .0318912    .1132091
         age |   .0014193   .0007854     1.81   0.071    -.0001203     .002959
       drace |  -.0098333    .010379    -0.95   0.343      -.03018    .0105134
         inc |  -.0018149   .0004887    -3.71   0.000     -.002773   -.0008568
        dhwl |   .6253973   .0103188    60.61   0.000     .6051686     .645626
       _cons |   .2373022    .032744     7.25   0.000     .1731117    .3014927

Carrasco (2001, p. 392) Table 5, column 1, ran a female labor force participation probit equation, which we replicate below using Stata:

. probit dhw f ags26l fxag26l educ_2 educ_3 age drace inc dhwl

Probit regression                               Number of obs   =       5768
                                                LR chi2(9)      =    2153.17
                                                Prob > chi2     =     0.0000
Log likelihood = -2036.8086                     Pseudo R2       =     0.3458

         dhw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |  -.4103849   .0690538    -5.94   0.000    -.5457279   -.2750419
      ags26l |  -.1064159   .0480907    -2.21   0.027     -.200672   -.0121598
     fxag26l |  -.1886427   .7087803    -0.27   0.790    -1.577827    1.200541
      educ_2 |   .2338264   .0858408     2.72   0.006     .0655816    .4020713
      educ_3 |   .3773278   .1001949     3.77   0.000     .1809494    .5737062
         age |   .0091203   .0041132     2.22   0.027     .0010586     .017182
       drace |  -.0577508   .0542972    -1.06   0.288    -.1641714    .0486699
         inc |  -.0088483   .0024217    -3.65   0.000    -.0135948   -.0041019
        dhwl |   1.932025   .0462191    41.80   0.000     1.841438    2.022613
       _cons |  -.8540838   .1638299    -5.21   0.000    -1.175184   -.5329831

Probit model for dhw

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |      4073           378  |       4451
     -     |       366           951  |       1317
-----------+--------------------------+-----------
   Total   |      4439          1329  |       5768

Classified + if predicted Pr(D) >= .5
True D defined as dhw != 0

Sensitivity                     Pr( +| D)   91.75%
Specificity                     Pr( -|~D)   71.56%
Positive predictive value       Pr( D| +)   91.51%
Negative predictive value       Pr(~D| -)   72.21%

False + rate for true ~D        Pr( +|~D)   28.44%
False - rate for true D         Pr( -| D)    8.25%
False + rate for classified +   Pr(~D| +)    8.49%
False - rate for classified -   Pr( D| -)   27.79%

Correctly classified                        87.10%

The marginal effects are given by dprobit in Stata:

. dprobit dhw f ags26l fxag26l educ_2 educ_3 age drace inc dhwl

Probit regression, reporting marginal effects   Number of obs   =       5768
                                                LR chi2(9)      =    2153.17
                                                Prob > chi2     =     0.0000
                                                Pseudo R2       =     0.3458

         dhw |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
-------------+-----------------------------------------------------------------------
          f* |  -.1200392   .0224936    -5.94   0.000   .113731   -.164126   -.075953
     ags26l* |  -.0275503   .0125892    -2.21   0.027   .377601   -.052225   -.002876
    fxag26l* |  -.0524753   .2127127    -0.27   0.790   .000693   -.469385    .364434
     educ_2* |   .0626367   .0239923     2.72   0.006   .717753    .015613    .109661
     educ_3* |   .0870573   .0206089     3.77   0.000   .223994    .046665     .12745
         age |   .0023327   .0010504     2.22   0.027   32.8024    .000274    .004391
      drace* |  -.0145508   .0134701    -1.06   0.288   .773232   -.040952     .01185
         inc |  -.0022631   .0006189    -3.65   0.000   12.8582   -.003476    -.00105
       dhwl* |   .6249756   .0134883    41.80   0.000   .771671    .598539    .651412
-------------+-----------------------------------------------------------------------
      obs. P |   .7695908
     pred. P |   .8271351  (at x-bar)
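For a dummy regressor, dprobit reports the change in the predicted probability when the dummy moves from 0 to 1 with the other regressors held at their sample means. The Python sketch below reproduces the fertility effect this way. It assumes that the index at the means can be recovered from pred. P and that the interaction fxag26l is held at its own sample mean, as it would be when treated as a separate regressor; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

pred_p = 0.8271351     # "pred. P (at x-bar)" from dprobit
b_f = -0.4103849       # probit coefficient on f
mean_f = 0.113731      # x-bar of f

index_at_mean = std_normal.inv_cdf(pred_p)   # x-bar'b
index_f0 = index_at_mean - b_f * mean_f      # f = 0, other regressors at means
index_f1 = index_f0 + b_f                    # f = 1, other regressors at means

# Discrete change in the predicted probability of participation
discrete_change = std_normal.cdf(index_f1) - std_normal.cdf(index_f0)
print(f"dF/dx for f (0 -> 1): {discrete_change:.7f}")
```

Evaluated this way, fertility lowers the predicted participation probability by about 12 percentage points, in line with the dF/dx reported for f.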

d. The 2SLS estimates in Table 5, column 5, of Carrasco (2001, p. 392), using as instruments the same-sex variables and their interactions with ags26l, are given below, along with the over-identification test and the first-stage diagnostics:

. ivregress 2sls dhw (f fxag26l = dsexm dsexf sexm_26l sexf_26l) ags26l educ_2 educ_3 age drace inc dhwl

Instrumental variables (2SLS) regression        Number of obs   =       5768
                                                Wald chi2(9)    =    3645.96
                                                Prob > chi2     =     0.0000
                                                R-squared       =     0.3565
                                                Root MSE        =      .3378

         dhw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |  -.2164685   .2246665    -0.96   0.335    -.6568067    .2238697
     fxag26l |  -3.366305   3.512783    -0.96   0.338    -10.25123    3.518623
      ags26l |  -.0385731   .0467522    -0.83   0.409    -.1302058    .0530596
      educ_2 |   .0331807   .0288653     1.15   0.250    -.0233943    .0897557
      educ_3 |    .064607   .0348694     1.85   0.064    -.0037357    .1329497
         age |  -.0001934   .0030344    -0.06   0.949    -.0061407    .0057539
       drace |  -.0163251    .012366    -1.32   0.187    -.0405621    .0079118
         inc |  -.0017194   .0005162    -3.33   0.001    -.0027312   -.0007076
        dhwl |   .6230639    .017256    36.11   0.000     .5892427    .6568851
       _cons |   .3330965    .141537     2.35   0.019     .0556891     .610504

Instrumented:  f fxag26l
Instruments:   ags26l educ_2 educ_3 age drace inc dhwl dsexm dsexf sexm_26l sexf_26l

. estat overid

Tests of overidentifying restrictions:

  Sargan (score) chi2(2) =  .332468  (p = 0.8468)
  Basmann chi2(2)        =  .331796  (p = 0.8471)

. estat firststage

Shea's partial R-squared

             |     Shea's             Shea's
    Variable |  Partial R-sq.   Adj. Partial R-sq.
-------------+------------------------------------
           f |     0.0045             0.0028
     fxag26l |     0.0023             0.0006

Minimum eigenvalue statistic = 3.36217

Critical Values                      # of endogenous regressors:    2
Ho: Instruments are weak             # of excluded instruments:     4

                                   |     5%     10%     20%     30%
2SLS relative bias                 |  11.04    7.56    5.57    4.73

                                   |    10%     15%     20%     25%
2SLS Size of nominal 5% Wald test  |  16.87    9.93    7.54    6.28
LIML Size of nominal 5% Wald test  |   4.72    3.39    2.99    2.79
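Both diagnostics above can be read off with simple arithmetic. With two degrees of freedom, the chi-squared upper-tail probability has the closed form exp(-x/2), and the weak-instrument test only requires comparing the minimum eigenvalue statistic with the tabulated critical values. A minimal Python sketch using the reported numbers (the function and variable names are illustrative):

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-squared(2) variate: exp(-x / 2)."""
    return math.exp(-x / 2)

sargan_p = chi2_sf_2df(0.332468)
basmann_p = chi2_sf_2df(0.331796)
print(f"Sargan p = {sargan_p:.4f}, Basmann p = {basmann_p:.4f}")

# Critical values for 2SLS relative bias, as reported by estat firststage
min_eigenvalue = 3.36217
relative_bias_cv = {"5%": 11.04, "10%": 7.56, "20%": 5.57, "30%": 4.73}

# Weak instruments are rejected at a given bias level only if the
# statistic exceeds the corresponding critical value.
rejected = {level: min_eigenvalue > cv for level, cv in relative_bias_cv.items()}
print(rejected)
```

The over-identifying restrictions are not rejected, but the minimum eigenvalue statistic lies below every critical value, so the null of weak instruments cannot be rejected even at the 30% relative bias level.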

e. So far, heterogeneity across the individuals is not taken into account. Carrasco (2001, p. 393) Table 7, column 4, ran a female labor force participation fixed effects equation with robust standard errors, which we replicate below using Stata:

. xtreg dhw f ags26l fxag26l dhwl, fe r

Fixed-effects (within) regression               Number of obs      =      5768
Group variable: ident                           Number of groups   =      1442

R-sq:  within  = 0.0059                         Obs per group: min =         4
       between = 0.6185                                        avg =       4.0
       overall = 0.2046                                        max =         4

                                                F(4,4322)          =      4.64
corr(u_i, Xb)  = 0.4991                         Prob > F           =    0.0010

                               (Std. Err. adjusted for clustering on ident)

             |               Robust
         dhw |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |  -.0547777   .0155326    -3.53   0.000    -.0852296   -.0243257
      ags26l |   .0012836   .0126213     0.10   0.919    -.0234607    .0260279
     fxag26l |  -.2204885   .2013721    -1.09   0.274     -.615281    .1743041
        dhwl |   .0356233   .0236582     1.51   0.132    -.0107588    .0820055
       _cons |   .7479995   .0193259    38.70   0.000     .7101108    .7858881
-------------+----------------------------------------------------------------
     sigma_u |  .33260036
     sigma_e |  .27830212
         rho |  .58818535   (fraction of variance due to u_i)

Note that only fertility is significant in this equation.
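The rho reported at the bottom of the fixed effects output is the share of the total error variance attributable to the individual effects. A short Python sketch of that calculation from the reported sigma_u and sigma_e (the function name is illustrative):

```python
def variance_share(sigma_u, sigma_e):
    """Fraction of the error variance due to the individual effect u_i."""
    return sigma_u**2 / (sigma_u**2 + sigma_e**2)

# sigma_u and sigma_e from the fixed effects output above
rho_fe = variance_share(sigma_u=0.33260036, sigma_e=0.27830212)
print(f"rho = {rho_fe:.8f}")
```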

Fixed effects 2SLS, using as instruments the same-sex variables and their interactions with ags26l, is given below:

. xtivreg dhw (f fxag26l = dsexm dsexf sexm_26l sexf_26l) ags26l age inc dhwl, fe

Fixed-effects (within) IV regression            Number of obs      =      5768
Group variable: ident                           Number of groups   =      1442

R-sq:  within  = .                              Obs per group: min =         4
       between = 0.1125                                        avg =       4.0
       overall = 0.0332                                        max =         4

                                                Wald chi2(6)       =  39710.29
corr(u_i, Xb)  = 0.0882                         Prob > chi2        =    0.0000

         dhw |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           f |  -.2970225    .156909    -1.89   0.058    -.6045584    .0105134
     fxag26l |    -2.1887   2.433852    -0.90   0.369    -6.958963    2.581562
      ags26l |  -.0584866   .0467667    -1.25   0.211    -.1501476    .0331744
         age |    .000651   .0043265     0.15   0.880    -.0078287    .0091307
         inc |  -.0011213   .0010482    -1.07   0.285    -.0031758    .0009331
        dhwl |   .0362943   .0160524     2.26   0.024     .0048322    .0677565
       _cons |   .7920305   .1623108     4.88   0.000     .4739071    1.110154
-------------+----------------------------------------------------------------
     sigma_u |   .3293446
     sigma_e |  .29336161
         rho |  .55759255   (fraction of variance due to u_i)

F test that all u_i=0:     F(1441,4320) =     2.16      Prob > F    = 0.0000

Instrumented:  f fxag26l
Instruments:   ags26l age inc dhwl dsexm dsexf sexm_26l sexf_26l

13.16 Multinomial Logit Model

a. Table II of Terza (2002, p. 399), columns 3, 4, 9 and 10, are replicated below for the male data using Stata:

. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1)

y

Coef.

Std. Err.

z

P>|z|

[95% Conf. Interval]

2

alc90th

.1270931

.21395

0.59

0.552

-.2922412

.5464274

ue88

.0458099

.051355

0.89

0.372

-.0548441

.1464639

age

.1617634

.0663205

2.44

0.015

.0317776

.2917492

agesq

-.0024377

.0007991

-3.05

0.002

-.004004

-.0008714

schooling

-.0092135

.0245172

-0.38

0.707

-.0572664

.0388393

married

.4004928

.1927458

2.08

0.038

.022718

.7782677

famsize

.0622453

.0503686

1.24

0.217

-.0364753

.1609659

white

.0391309

.1705625

0.23

0.819

-.2951653

.3734272

excellent

2.91833

.4486757

6.50

0.000

2.038942

3.797719

verygood

2.978336

.4505932

6.61

0.000

2.09519

3.861483

good

2.493939

.4446815

5.61

0.000

1.622379

3.365499

fair

1.460263

.4817231

3.03

0.002

.5161027

2.404422

northeast

.0849125

.2374365

0.36

0.721

-.3804545

.5502796

midwest

.0158816

.2037486

0.08

0.938

-.3834583

.4152215

south

.1750244

.2027444

0.86

0.388

-.2223474

.5723962

centercity

-.2717445

.1911074

-1.42

0.155

-.6463081

.1028192

othermsa

-.0921566

.1929076

-0.48

0.633

-.4702486

.2859354

q1

.422405

.1978767

2.13

0.033

.0345738

.8102362

q2

-.0219499

.2056751

-0.11

0.915

-.4250657

.3811659

q3

-.0365295

.2109049

-0.17

0.862

-.4498954

.3768364

_cons

-6.113244

1.427325

-4.28

0.000

-8.910749

-3.315739

3

alc90th

-.1534987

.1395003

-1.10

0.271

-.4269144

.1199169

ue88

-.0954848

.033631

-2.84

0.005

-.1614004

-.0295693

age

.227164

.0409884

5.54

0.000

.1468282

.3074999

agesq

-.0030796

.0004813

-6.40

0.000

-.0040228

-.0021363

schooling

.0890537

.0152314

5.85

0.000

.0592008

.1189067

married

.7085708

.1219565

5.81

0.000

.4695405

.9476012

famsize

.0622447

.0332365

1.87

0.061

-.0028975

.127387

white

.7380044

.1083131

6.81

0.000

.5257147

.9502941

excellent

3.702792

.1852415

19.99

0.000

3.339725

4.065858

verygood

3.653313

.1894137

19.29

0.000

3.282069

4.024557

good

2.99946

.1786747

16.79

0.000

2.649264

3.349656

fair

1.876172

.1885159

9.95

0.000

1.506688

2.245657

northeast

.088966

.1491191

0.60

0.551

-.203302

.3812341

midwest

.1230169

.1294376

0.95

0.342

-.130676

.3767099

south

.4393047

.1298054

3.38

0.001

.1848908

.6937185

centercity

-.2689532

.1231083

-2.18

0.029

-.510241

-.0276654

othermsa

.0978701

.1257623

0.78

0.436

-.1486195

.3443598

q1

-.0274086

.1286695

-0.21

0.831

-.2795961

.224779

q2

-.110751

.126176

-0.88

0.380

-.3580514

.1365494

q3

-.0530835

.1296053

-0.41

0.682

-.3071052

.2009382

_cons

-6.237275

.8886698

-7.02

0.000

-7.979036

-4.495515

(y==1 is the base outcome)
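With outcome 1 as the base, each block of coefficients above gives the log-odds of that outcome relative to outcome 1, and the base outcome's coefficients are normalized to zero. The Python sketch below shows how the three outcome probabilities follow from the two linear indices. The index values used are hypothetical and not taken from the data, and the function name is illustrative.

```python
import math

def mlogit_probabilities(indices):
    """Outcome probabilities from the linear indices x'b_j of the non-base outcomes.

    `indices` maps outcome label -> x'b_j. The base outcome (label 1 here)
    has its coefficients normalized to zero, so its index is 0.
    """
    exp_terms = {1: 1.0}  # exp(0) for the base outcome
    exp_terms.update({j: math.exp(v) for j, v in indices.items()})
    denom = sum(exp_terms.values())
    return {j: e / denom for j, e in exp_terms.items()}

# Hypothetical index values for one individual (not taken from the data)
probs = mlogit_probabilities({2: -1.0, 3: 1.5})
print(probs)

# The index of outcome j is the log-odds of j against the base outcome
log_odds_3_vs_1 = math.log(probs[3] / probs[1])
print(log_odds_3_vs_1)
```

A positive coefficient in block 3 therefore raises the odds of outcome 3 relative to outcome 1, but it need not raise the probability of outcome 3 itself, since that also depends on the other block.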

**using bootstrap for the var-cov matrix

. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1) vce(bootstrap)

(running mlogit on estimation sample)

Bootstrap replications (50)

----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Multinomial logistic regression

Number of obs =

9822

Replications =

50

Wald chi2 (40) =

7442.69

Prob > chi2 =

0.0000

Log likelihood = -3217.481

Pseudo R2 =

0.1655

Observed

Bootstrap

Normal-based

y

Coef.

Std. Err.

z

P>|z|

[95% Conf. Interval]

2

alc90th

.1270931

.1933016

0.66

0.511

-.251771

.5059573

ue88

.0458099

.0566344

0.81

0.419

-.0651914

.1568112

age

.1617634

.0615543

2.63

0.009

.0411192

.2824076

agesq

-.0024377

.0007266

-3.35

0.001

-.0038619

-.0010135

schooling

-.0092135

.0249799

-0.37

0.712

-.0581732

.0397462

married

.4004928

.2069088

1.94

0.053

-.0050409

.8060266

famsize

.0622453

.0534164

1.17

0.244

-.0424489

.1669395

white

.0391309

.1817052

0.22

0.829

-.3170046

.3952665

excellent

2.91833

.5134264

5.68

0.000

1.912033

3.924628

verygood

2.978336

.5473854

5.44

0.000

1.905481

4.051192

good

2.493939

.4904972

5.08

0.000

1.532582

3.455296

fair

1.460263

.5156181

2.83

0.005

.4496697

2.470855

northeast

.0849125

.2058457

0.41

0.680

-.3185377

.4883627

midwest

.0158816

.197601

0.08

0.936

-.3714093

.4031725

south

.1750244

.2211406

0.79

0.429

-.2584032

.608452

centercity

-.2717445

.1708023

-1.59

0.112

-.6065108

.0630218

othermsa

-.0921566

.191577

-0.48

0.630

-.4676406

.2833275

q1

.422405

.2392306

1.77

0.077

-.0464783

.8912883

q2

-.0219499

.2404712

-0.09

0.927

-.4932649

.4493651

q3

-.0365295

.2500046

-0.15

0.884

-.5265295

.4534704

_cons

-6.113244

1.259449

-4.85

0.000

-8.581719

-3.644769

3

alc90th

-.1534987

.1129983

-1.36

0.174

-.3749714

.0679739

ue88

-.0954848

.0349536

-2.73

0.006

-.1639927

-.026977

age

.227164

.0431

5.27

0.000

.1426896

.3116385

agesq

-.0030796

.0005224

-5.90

0.000

-.0041034

-.0020558

schooling

.0890537

.0173814

5.12

0.000

.0549868

.1231207

married

.7085708

.1286085

5.51

0.000

.4565028

.9606389

famsize

.0622447

.0361903

1.72

0.085

-.0086869

.1331764

white

.7380044

.1320206

5.59

0.000

.4792488

.9967599

excellent

3.702792

.2019607

18.33

0.000

3.306956

4.098627

verygood

3.653313

.2090086

17.48

0.000

3.243663

4.062962

good

2.99946

.2053791

14.60

0.000

2.596925

3.401996

fair

1.876172

.2063004

9.09

0.000

1.471831

2.280514

northeast

.088966

.1624429

0.55

0.584

-.2294162

.4073482

midwest

.1230169

.1410455

0.87

0.383

-.1534272

.399461

south

.4393047

.1340076

3.28

0.001

.1766547

.7019547

centercity

-.2689532

.098325

-2.74

0.006

-.4616666

-.0762398

othermsa

.0978701

.1067784

0.92

0.359

-.1114117

.307152

q1

-.0274086

.1206965

-0.23

0.820

-.2639694

.2091523

q2

-.110751

.1303469

-0.85

0.396

-.3662263

.1447243

q3

-.0530835

.1329726

-0.40

0.690

-.313705

.2075381

_cons

-6.237275

.8224026

-7.58

0.000

-7.849155

-4.625396

(y==1 is the base outcome)

**using robust for the var-cov matrix

. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3, baseoutcome(1) vce(robust)

Iteration 0: log pseudolikelihood = -3855.7148
Iteration 1: log pseudolikelihood = -3692.5753
Iteration 2: log pseudolikelihood = -3526.5092
Iteration 3: log pseudolikelihood = -3236.3918
Iteration 4: log pseudolikelihood = -3219.1826
Iteration 5: log pseudolikelihood = -3217.5569
Iteration 6: log pseudolikelihood = -3217.4813
Iteration 7: log pseudolikelihood = -3217.481

Multinomial logistic regression                 Number of obs   =       9822
                                                Wald chi2(40)   =    1075.69
                                                Prob > chi2     =     0.0000
Log pseudolikelihood = -3217.481                Pseudo R2       =     0.1655

Robust

y

Coef.

Std. Err.

z

P>|z|

[95% Conf. Interval]

2

alc90th

.1270931

.2152878

0.59

0.555

-.2948632

.5490494

ue88

.0458099

.0500181

0.92

0.360

-.0522238

.1438436

age

.1617634

.0668732

2.42

0.016

.0306944

.2928324

agesq

-.0024377

.0008087

-3.01

0.003

-.0040227

-.0008527

schooling

-.0092135

.0234188

-0.39

0.694

-.0551135

.0366864

married

.4004928

.204195

1.96

0.050

.0002779

.8007078

famsize

.0622453

.0517847

1.20

0.229

-.0392509

.1637416

white

.0391309

.1711588

0.23

0.819

-.2963342

.3745961

excellent

2.91833

.4548999

6.42

0.000

2.026743

3.809918

verygood

2.978336

.4566665

6.52

0.000

2.083286

3.873386

good

2.493939

.4507366

5.53

0.000

1.610511

3.377366

fair

1.460263

.48807

2.99

0.003

.5036629

2.416862

northeast

.0849125

.23845

0.36

0.722

-.382441

.552266

midwest

.0158816

.2044175

0.08

0.938

-.3847694

.4165326

south

.1750244

.2022599

0.87

0.387

-.2213977

.5714466

centercity

-.2717445

.1911311

-1.42

0.155

-.6463546

.1028656

othermsa

-.0921566

.1955115

-0.47

0.637

-.475352

.2910389

q1

.422405

.1970871

2.14

0.032

.0361213

.8086887

q2

-.0219499

.2049964

-0.11

0.915

-.4237355

.3798357

q3

-.0365295

.2109886

-0.17

0.863

-.4500595

.3770005

_cons

-6.113244

1.412512

-4.33

0.000

-8.881717

-3.344771

3

alc90th

-.1534987

.1392906

-1.10

0.270

-.4265033

.1195059

ue88

-.0954848

.0335442

-2.85

0.004

-.1612303

-.0297394

age

.227164

.0411389

5.52

0.000

.1465333

.3077948

agesq

-.0030796

.000487

-6.32

0.000

-.004034

-.0021251

schooling

.0890537

.0160584

5.55

0.000

.0575798

.1205276

married

.7085708

.1315325

5.39

0.000

.4507719

.9663698

famsize

.0622447

.035511

1.75

0.080

-.0073556

.1318451

white

.7380044

.1139831

6.47

0.000

.5146017

.9614071

excellent

3.702792

.190178

19.47

0.000

3.33005

4.075534

verygood

3.653313

.1929514

18.93

0.000

3.275135

4.03149

good

2.99946

.1849776

16.22

0.000

2.636911

3.36201

fair

1.876172

.1956878

9.59

0.000

1.492631

2.259713

northeast

.088966

.1505301

0.59

0.555

-.2060675

.3839996

midwest

.1230169

.1302651

0.94

0.345

-.1322981

.3783319

south

.4393047

.1341061

3.28

0.001

.1764616

.7021478

centercity

-.2689532

.1266976

-2.12

0.034

-.5172758

-.0206306

othermsa

.0978701

.1275274

0.77

0.443

-.152079

.3478193

q1

-.0274086

.1288453

-0.21

0.832

-.2799406

.2251235

q2

-.110751

.12602

-0.88

0.379

-.3577457

.1362437

q3

-.0530835

.1321321

-0.40

0.688

-.3120576

.2058907

_cons

-6.237275

.8601993

-7.25

0.000

-7.923235

-4.551316

(y==1 is the base outcome)

b. For the female data, the multinomial logit estimates yield:

. mlogit y alc90th ue88 age agesq schooling married famsize white excellent verygood good fair northeast midwest south centercity othermsa q1 q2 q3,

Coef.

Std. Err.

z

P>|z|

[95% Conf. Interval]

2

alc90th

-.1241993

.2365754

-0.52

0.600

-.5878785

.3394799

ue88

-.001862

.0514214

-0.04

0.971

-.1026462

.0989221

age

-.0392239

.0612728

-0.64

0.522

-.1593164

.0808687

agesq

.0004834

.0007411

0.65

0.514

-.0009691

.0019359

schooling

-.0121174

.0254645

-0.48

0.634

-.0620269

.037792

married

.0117958

.2220045

0.05

0.958

-.423325

.4469167

famsize

.0092434

.0495871

0.19

0.852

-.0879456

.1064324

white

.2817941

.1935931

1.46

0.146

-.0976414

.6612296

excellent

.0420423

.4579618

0.09

0.927

-.8555463

.939631

verygood

.0449091

.4574373

0.10

0.922

-.8516516

.9414698

good

.0182444

.4544742

0.04

0.968

-.8725086

.9089974

fair

.2925131

.4839658

0.60

0.546

-.6560424

1.241069

northeast

-.1721726

.2163151

-0.80

0.426

-.5961425

.2517973

midwest

-.2643294

.1944624

-1.36

0.174

-.6454687

.11681

south

-.0161982

.1814209

-0.09

0.929

-.3717766

.3393803

centercity

-.0812978

.1869101

-0.43

0.664

-.447635

.2850393

othermsa

.044578

.1738872

0.26

0.798

-.2962347

.3853908

q1

22.30515

1.328553

16.79

0.000

19.70123

24.90906

q2

22.24068

1.32893

16.74

0.000

19.63603

24.84534

q3

18.65596

1.360765

13.71

0.000

15.98891

21.32301

_cons

-23.50938

3

alc90th

.1288509

.0812475

1.59

0.113

-.0303912

.2880931

ue88

.0148758

.0188237

0.79

0.429

-.022018

.0517696

age

.0175243

.0230613

0.76

0.447

-.0276751

.0627236

agesq

-.0002381

.00028

-0.85

0.395

-.0007868

.0003106

schooling

.0035127

.0095824

0.37

0.714

-.0152685

.0222939

married

-.0997914

.078327

-1.27

0.203

-.2533095

.0537266

famsize

.0027002

.0184619

0.15

0.884

-.0334844

.0388849

white

-.0277798

.066196

-0.42

0.675

-.1575217

.1019621

excellent

-.1178398

.1636194

-0.72

0.471

-.4385278

.2028483

verygood

-.1170045

.1633395

-0.72

0.474

-.437144

.203135

good

-.1144024

.1622966

-0.70

0.481

-.4324979

.2036931

fair

-.0344312

.1775054

-0.19

0.846

-.3823353

.3134729

northeast

-.0548967

.0819514

-0.67

0.503

-.2155184

.105725

midwest

.0572296

.0720545

0.79

0.427

-.0839946

.1984538

(y==1 is the base outcome)

13.17 Tobit estimation of Married Women Labor Supply

a. A detailed summary of hours of work shows that the mean is 741 hours, the median is 288, the minimum is zero and the maximum is 4950.

. sum hours, detail

                     hours worked, 1975
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs                 753
25%            0              0       Sum of Wgt.         753

50%          288                      Mean           740.5764
                        Largest       Std. Dev.      871.3142
75%         1516           3640
90%         1984           3686       Variance       759188.5
95%         2100           4210       Skewness       .9225315
99%         3087           4950       Kurtosis       3.193949

b. Using the notation of solution 11.31, OLS on this model yields

. reg hours nwifeinc kidslt6 kidsge6 `control' `E', r

Linear regression                               Number of obs   =        753
                                                F(7, 745)       =      45.81
                                                Prob > F        =     0.0000
                                                R-squared       =     0.2656
                                                Root MSE        =     750.18

             |               Robust
       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -3.446636   2.240662    -1.54   0.124    -7.845398    .9521268
     kidslt6 |  -442.0899   57.46384    -7.69   0.000    -554.9002   -329.2796
     kidsge6 |  -32.77923   22.80238    -1.44   0.151     -77.5438    11.98535
         age |  -30.51163   4.244791    -7.19   0.000    -38.84481   -22.17846
        educ |   28.76112   13.03905     2.21   0.028     3.163468    54.35878
       exper |   65.67251   10.79419     6.08   0.000     44.48186    86.86316
     expersq |  -.7004939   .3720129    -1.88   0.060    -1.430812    .0298245
       _cons |   1330.482   274.8776     4.84   0.000     790.8556    1870.109

Tobit estimation with left censoring at zero is specified with the option ll(0):

. tobit hours nwifeinc kidslt6 kidsge6 `control' `E', ll(0)

Tobit regression                                Number of obs   =        753
                                                LR chi2(7)      =     271.59
                                                Prob > chi2     =     0.0000
Log likelihood = -3819.0946                     Pseudo R2       =     0.0343

       hours |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -8.814243   4.459096    -1.98   0.048    -17.56811   -.0603724
     kidslt6 |  -894.0217   111.8779    -7.99   0.000    -1113.655   -674.3887
     kidsge6 |    -16.218   38.64136    -0.42   0.675    -92.07675    59.64075
         age |  -54.40501   7.418496    -7.33   0.000    -68.96862    -39.8414
        educ |   80.64561   21.58322     3.74   0.000     38.27453    123.0167
       exper |   131.5643   17.27938     7.61   0.000     97.64231    165.4863
     expersq |  -1.864158   .5376615    -3.47   0.001    -2.919667   -.8086479
       _cons |   965.3053   446.4358     2.16   0.031     88.88528    1841.725
-------------+----------------------------------------------------------------
      /sigma |   1122.022   41.57903                      1040.396    1203.647

Obs. summary:        325  left-censored observations at hours<=0
                     428     uncensored observations
                       0 right-censored observations

c. This replicates Table 17.1 of Wooldridge (2009, p. 585) using Stata

. reg inlf nwifeinc kidslt6 kidsge6 `control' `E', r

Linear regression                                      Number of obs =     753
                                                       F(7, 745)     =   62.48
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2642
                                                       Root MSE      =  .42713

------------------------------------------------------------------------------
             |               Robust
        inlf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0034052   .0015249    -2.23   0.026    -.0063988   -.0004115
     kidslt6 |  -.2618105   .0317832    -8.24   0.000    -.3242058   -.1994152
     kidsge6 |   .0130122   .0135329     0.96   0.337     -.013555    .0395795
         age |  -.0160908    .002399    -6.71   0.000    -.0208004   -.0113812
        educ |   .0379953    .007266     5.23   0.000      .023731    .0522596
       exper |   .0394924     .00581     6.80   0.000     .0280864    .0508983
     expersq |  -.0005963     .00019    -3.14   0.002    -.0009693   -.0002233
       _cons |   .5855192   .1522599     3.85   0.000     .2866098    .8844287
------------------------------------------------------------------------------

The Logit estimates yield:

. logit inlf nwifeinc kidslt6 kidsge6 `control' `E', r

Iteration 0:   log pseudolikelihood =  -514.8732
Iteration 1:   log pseudolikelihood = -402.38502
Iteration 2:   log pseudolikelihood = -401.76569
Iteration 3:   log pseudolikelihood = -401.76515
Iteration 4:   log pseudolikelihood = -401.76515

Logistic regression                                    Number of obs =     753
                                                       Wald chi2(7)  =  158.48
                                                       Prob > chi2   =  0.0000
Log pseudolikelihood = -401.76515                      Pseudo R2     =  0.2197

------------------------------------------------------------------------------
             |               Robust
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0213452   .0090782    -2.35   0.019     -.039138   -.0035523
     kidslt6 |  -1.443354   .2031615    -7.10   0.000    -1.841543   -1.045165
     kidsge6 |   .0601122   .0798825     0.75   0.452    -.0964546    .2166791
         age |  -.0880244   .0144393    -6.10   0.000    -.1163248   -.0597239
        educ |   .2211704   .0444509     4.98   0.000     .1340482    .3082925
       exper |   .2058695   .0322914     6.38   0.000     .1425796    .2691594
     expersq |  -.0031541   .0010124    -3.12   0.002    -.0051384   -.0011698
       _cons |   .4254524   .8597308     0.49   0.621    -1.259589    2.110494
------------------------------------------------------------------------------

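. estat classification

Logistic model for inlf

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       347           118  |        465
     -     |        81           207  |        288
-----------+--------------------------+-----------
   Total   |       428           325  |        753

Classified + if predicted Pr(D) >= .5
True D defined as inlf != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   81.07%
Specificity                     Pr( -|~D)   63.69%
Positive predictive value       Pr( D| +)   74.62%
Negative predictive value       Pr(~D| -)   71.88%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   36.31%
False - rate for true D         Pr( -| D)   18.93%
False + rate for classified +   Pr(~D| +)   25.38%
False - rate for classified -   Pr( D| -)   28.13%
--------------------------------------------------
Correctly classified                        73.57%
--------------------------------------------------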
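The rates in the classification table follow directly from its four cell counts. As a quick check outside Stata, here is a minimal Python sketch; the variable names are illustrative and only the counts come from the output above.

```python
# Cell counts copied from the logit classification table above.
true_pos, false_pos = 347, 118   # classified +: true D, true ~D
false_neg, true_neg = 81, 207    # classified -: true D, true ~D

total = true_pos + false_pos + false_neg + true_neg

rates = {
    "Sensitivity Pr(+|D)": true_pos / (true_pos + false_neg),
    "Specificity Pr(-|~D)": true_neg / (true_neg + false_pos),
    "Positive predictive value Pr(D|+)": true_pos / (true_pos + false_pos),
    "Negative predictive value Pr(~D|-)": true_neg / (true_neg + false_neg),
    "Correctly classified": (true_pos + true_neg) / total,
}

for label, rate in rates.items():
    print(f"{label:<36s}{100 * rate:6.2f}%")
```

The same calculation applies to the probit classification table by swapping in its counts.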

. mfx

Marginal effects after logit
      y  = Pr(inlf) (predict)
         =  .58277201
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0051901      .00221   -2.35   0.019  -.009523 -.000857    20.129
 kidslt6 |  -.3509498      .04988   -7.04   0.000  -.448718 -.253182   .237716
 kidsge6 |   .0146162      .01941    0.75   0.451  -.023428   .05266   1.35325
     age |   -.021403      .00353   -6.07   0.000  -.028317 -.014489   42.5378
    educ |   .0537773      .01086    4.95   0.000   .032498  .075057   12.2869
   exper |   .0500569      .00788    6.35   0.000   .034604   .06551   10.6308
 expersq |  -.0007669      .00025   -3.11   0.002  -.001251 -.000283   178.039
------------------------------------------------------------------------------
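For a logit, the marginal effect of a continuous regressor evaluated at the sample means is the coefficient multiplied by p(1 - p), where p is the fitted probability at the means (reported by mfx as .58277201). The sketch below, with illustrative variable names, reproduces the dy/dx column from the logit coefficients under that formula.

```python
# Fitted probability at the sample means, as reported by mfx after logit.
p_bar = 0.58277201

# Logistic density at the means: Lambda * (1 - Lambda).
scale = p_bar * (1.0 - p_bar)

# Logit coefficients copied from the estimation output.
logit_coef = {
    "nwifeinc": -0.0213452,
    "kidslt6": -1.443354,
    "kidsge6": 0.0601122,
    "age": -0.0880244,
    "educ": 0.2211704,
    "exper": 0.2058695,
    "expersq": -0.0031541,
}

# Marginal effect at the means for each regressor.
marginal_effects = {name: b * scale for name, b in logit_coef.items()}

for name, effect in marginal_effects.items():
    print(f"{name:<10s}{effect:12.7f}")
```

Up to rounding of the inputs, these values agree with the dy/dx column reported by mfx.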

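. margeff

Average partial effects after logit
      y = Pr(inlf)

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0038118   .0015923    -2.39   0.017    -.0069327   -.0006909
     kidslt6 |   -.240805   .0262576    -9.17   0.000     -.292269    -.189341
     kidsge6 |   .0107335   .0142337     0.75   0.451     -.017164     .038631
         age |  -.0157153   .0023842    -6.59   0.000    -.0203883   -.0110423
        educ |   .0394323   .0074566     5.29   0.000     .0248176    .0540471
       exper |   .0367123   .0051935     7.07   0.000     .0265332    .0468914
     expersq |  -.0005633   .0001767    -3.19   0.001    -.0009096   -.0002169
------------------------------------------------------------------------------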

. probit inlf nwifeinc kidslt6 kidsge6 `control' `E', r

Iteration 0:   log pseudolikelihood =  -514.8732
Iteration 1:   log pseudolikelihood = -402.06651
Iteration 2:   log pseudolikelihood = -401.30273
Iteration 3:   log pseudolikelihood = -401.30219
Iteration 4:   log pseudolikelihood = -401.30219

Probit regression                                      Number of obs =     753
                                                       Wald chi2(7)  =  185.10
                                                       Prob > chi2   =  0.0000
Log pseudolikelihood = -401.30219                      Pseudo R2     =  0.2206

------------------------------------------------------------------------------
             |               Robust
        inlf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0120237   .0053106    -2.26   0.024    -.0224323   -.0016152
     kidslt6 |  -.8683285   .1162037    -7.47   0.000    -1.096084   -.6405735
     kidsge6 |    .036005   .0452958     0.79   0.427    -.0527731     .124783
         age |  -.0528527   .0083532    -6.33   0.000    -.0692246   -.0364807
        educ |   .1309047   .0258192     5.07   0.000        .0803    .1815095
       exper |   .1233476   .0188537     6.54   0.000      .086395    .1603002
     expersq |  -.0018871   .0006007    -3.14   0.002    -.0030645   -.0007097
       _cons |   .2700768    .505175     0.53   0.593    -.7200481    1.260202
------------------------------------------------------------------------------

. mfx

Marginal effects after probit
      y  = Pr(inlf) (predict)
         =  .58154201
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
nwifeinc |  -.0046962      .00208   -2.26   0.024  -.008766 -.000626    20.129
 kidslt6 |  -.3391514      .04565   -7.43   0.000  -.428628 -.249675   .237716
 kidsge6 |   .0140628      .01769    0.80   0.427  -.020603  .048729   1.35325
     age |  -.0206432      .00327   -6.31   0.000  -.027056 -.014231   42.5378
    educ |   .0511287      .01011    5.06   0.000   .031308   .07095   12.2869
   exper |   .0481771      .00739    6.52   0.000   .033694   .06266   10.6308
 expersq |  -.0007371      .00024   -3.14   0.002  -.001198 -.000276   178.039
------------------------------------------------------------------------------

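. margeff

Average partial effects after probit
      y = Pr(inlf)

------------------------------------------------------------------------------
    variable |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    nwifeinc |  -.0036162   .0015759    -2.29   0.022    -.0067049   -.0005275
     kidslt6 |  -.2441788   .0257356    -9.49   0.000    -.2946198   -.1937379
     kidsge6 |   .0108274   .0135967     0.80   0.426    -.0158217    .0374765
         age |  -.0158917   .0023447    -6.78   0.000    -.0204873     -.011296
        educ |   .0393088   .0073669     5.34   0.000       .02487    .0537476
       exper |    .037046   .0051959     7.13   0.000     .0268621    .0472299
     expersq |  -.0005675   .0001775    -3.20   0.001    -.0009154   -.0002197
------------------------------------------------------------------------------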

. dprobit inlf nwifeinc kidslt6 kidsge6 `control' `E', r

Iteration 0:   log pseudolikelihood =  -514.8732
Iteration 1:   log pseudolikelihood = -405.78215
Iteration 2:   log pseudolikelihood = -401.32924
Iteration 3:   log pseudolikelihood = -401.30219
Iteration 4:   log pseudolikelihood = -401.30219

Probit regression, reporting marginal effects          Number of obs =     753
                                                       Wald chi2(7)  =  185.10
                                                       Prob > chi2   =  0.0000
Log pseudolikelihood = -401.30219                      Pseudo R2     =  0.2206

------------------------------------------------------------------------------
         |               Robust
    inlf |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
---------+--------------------------------------------------------------------
nwifeinc |  -.0046962   .0020767    -2.26   0.024    20.129  -.008766 -.000626
 kidslt6 |  -.3391514    .045652    -7.47   0.000   .237716  -.428628 -.249675
 kidsge6 |   .0140628   .0176869     0.79   0.427   1.35325  -.020603  .048729
     age |  -.0206432   .0032717    -6.33   0.000   42.5378  -.027056 -.014231
    educ |   .0511287    .010113     5.07   0.000   12.2869   .031308   .07095
   exper |   .0481771   .0073896     6.54   0.000   10.6308   .033694   .06266
 expersq |  -.0007371    .000235    -3.14   0.002   178.039  -.001198 -.000276
---------+--------------------------------------------------------------------
  obs. P |   .5683931
 pred. P |    .581542  (at x-bar)
------------------------------------------------------------------------------
    z and P>|z| correspond to the test of the underlying coefficient being 0

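. estat classification

Probit model for inlf

              -------- True --------
Classified |         D            ~D  |      Total
-----------+--------------------------+-----------
     +     |       348           120  |        468
     -     |        80           205  |        285
-----------+--------------------------+-----------
   Total   |       428           325  |        753

Classified + if predicted Pr(D) >= .5
True D defined as inlf != 0
--------------------------------------------------
Sensitivity                     Pr( +| D)   81.31%
Specificity                     Pr( -|~D)   63.08%
Positive predictive value       Pr( D| +)   74.36%
Negative predictive value       Pr(~D| -)   71.93%
--------------------------------------------------
False + rate for true ~D        Pr( +|~D)   36.92%
False - rate for true D         Pr( -| D)   18.69%
False + rate for classified +   Pr(~D| +)   25.64%
False - rate for classified -   Pr( D| -)   28.07%
--------------------------------------------------
Correctly classified                        73.44%
--------------------------------------------------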

d. Wooldridge (2009, Chapter 17) recommends obtaining estimates of β/σ from a probit that uses an indicator of labor force participation as the dependent variable, and comparing them with the Tobit estimates of β divided by the Tobit estimate of σ. If the two sets of estimates differ markedly or have different signs, the Tobit specification may not be appropriate. Part (c) gave such probit estimates. For kidslt6, the probit estimate is -0.868. From part (b), the Tobit estimate of β for kidslt6 is -894 and the estimate of σ is 1122, so the resulting estimate of β/σ is -0.797. These have the same sign but different magnitudes.
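The comparison described for kidslt6 can be repeated for every regressor. The following is a minimal Python sketch, assuming only the Tobit coefficients and /sigma from part (b) and the probit coefficients from part (c); the dictionary and variable names are illustrative.

```python
# Tobit coefficients and sigma from part (b).
tobit_coef = {
    "nwifeinc": -8.814243,
    "kidslt6": -894.0217,
    "kidsge6": -16.218,
    "age": -54.40501,
    "educ": 80.64561,
    "exper": 131.5643,
    "expersq": -1.864158,
}
tobit_sigma = 1122.022

# Probit coefficients from part (c); these estimate beta/sigma directly.
probit_coef = {
    "nwifeinc": -0.0120237,
    "kidslt6": -0.8683285,
    "kidsge6": 0.036005,
    "age": -0.0528527,
    "educ": 0.1309047,
    "exper": 0.1233476,
    "expersq": -0.0018871,
}

# Scale each Tobit coefficient by sigma and compare with the probit estimate.
scaled = {name: b / tobit_sigma for name, b in tobit_coef.items()}
same_sign = {name: (scaled[name] > 0) == (probit_coef[name] > 0) for name in scaled}

print(f"{'variable':<10s}{'tobit b/sigma':>15s}{'probit':>12s}{'same sign':>11s}")
for name in scaled:
    print(f"{name:<10s}{scaled[name]:15.6f}{probit_coef[name]:12.6f}{str(same_sign[name]):>11s}")
```

With these inputs, kidsge6 is the only regressor whose two estimates differ in sign, and its coefficient is statistically insignificant in both the Tobit and the probit output.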

13.18 Heckit Estimation of Married Women’s Earnings

a. OLS on this model yields

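. reg lwage educ exper expersq

      Source |       SS       df       MS              Number of obs =     428
-------------+------------------------------           F(  3,   424) =   26.29
       Model |  35.0222967     3  11.6740989           Prob > F      =  0.0000
    Residual |  188.305144   424  .444115906           R-squared     =  0.1568
-------------+------------------------------           Adj R-squared =  0.1509
       Total |  223.327441   427  .523015084           Root MSE      =  .66642

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1074896   .0141465     7.60   0.000     .0796837    .1352956
       exper |   .0415665   .0131752     3.15   0.002     .0156697    .0674633
     expersq |  -.0008112   .0003932    -2.06   0.040    -.0015841   -.0000382
       _cons |  -.5220406   .1986321    -2.63   0.009    -.9124667   -.1316144
------------------------------------------------------------------------------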

Heckman two-step estimates

. heckman lwage educ exper expersq, select(educ exper expersq age kidslt6 kidsge6 nwifeinc) twostep

Heckman selection model -- two-step estimates          Number of obs =     753
(regression model with sample selection)               Censored obs  =     325
                                                       Uncensored obs =    428

                                                       Wald chi2(3)  =   51.53
                                                       Prob > chi2   =  0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage        |
        educ |   .1090655    .015523     7.03   0.000     .0786411      .13949
       exper |   .0438873   .0162611     2.70   0.007     .0120163    .0757584
     expersq |  -.0008591   .0004389    -1.96   0.050    -.0017194    1.15e-06
       _cons |  -.5781032   .3050062    -1.90   0.058    -1.175904     .019698
-------------+----------------------------------------------------------------
select       |
        educ |   .1309047   .0252542     5.18   0.000     .0814074     .180402
       exper |   .1233476   .0187164     6.59   0.000     .0866641    .1600311
     expersq |  -.0018871      .0006    -3.15   0.002     -.003063   -.0007111
         age |  -.0528527   .0084772    -6.23   0.000    -.0694678   -.0362376
     kidslt6 |  -.8683285   .1185223    -7.33   0.000    -1.100628    -.636029
     kidsge6 |    .036005   .0434768     0.83   0.408     -.049208    .1212179
    nwifeinc |  -.0120237   .0048398    -2.48   0.013    -.0215096   -.0025378
       _cons |   .2700768    .508593     0.53   0.595    -.7267473    1.266901
-------------+----------------------------------------------------------------
mills        |
      lambda |   .0322619   .1336246     0.24   0.809    -.2296376    .2941613
-------------+----------------------------------------------------------------
         rho |    0.04861
       sigma |  .66362875
      lambda |  .03226186   .1336246
------------------------------------------------------------------------------

b. The coefficient on the inverse Mills ratio, lambda, is estimated to be 0.032 with a standard error of 0.134 (z = 0.24, p-value = 0.809), which is not significant. Hence, we do not reject the null hypothesis of no sample selection.
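The test in part (b) is a simple z-test on the lambda coefficient. A minimal Python sketch of the arithmetic, using only the estimate and standard error from the two-step output and the standard normal distribution (variable names are illustrative):

```python
import math

# Two-step estimate of the inverse Mills ratio coefficient and its standard error.
lambda_hat = 0.0322619
lambda_se = 0.1336246

# z-statistic for H0: lambda = 0 (no sample selection).
z_stat = lambda_hat / lambda_se

# Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))

print(f"z = {z_stat:.2f}, two-sided p-value = {p_value:.3f}")
```

The result matches the z and P>|z| entries on the lambda row of the two-step output up to rounding.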

c. The MLE of this Heckman (1976) sample selection model is given by:

. heckman lwage educ exper expersq, select(educ exper expersq age kidslt6 kidsge6 nwifeinc)

Iteration 0:   log likelihood = -832.89776
Iteration 1:   log likelihood = -832.88509
Iteration 2:   log likelihood = -832.88508

Heckman selection model                                Number of obs =     753
(regression model with sample selection)               Censored obs  =     325
                                                       Uncensored obs =    428

                                                       Wald chi2(3)  =   59.67
Log likelihood = -832.8851                             Prob > chi2   =  0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
lwage        |
        educ |   .1083502   .0148607     7.29   0.000     .0792238    .1374767
       exper |   .0428369   .0148785     2.88   0.004     .0136755    .0719983
     expersq |  -.0008374   .0004175    -2.01   0.045    -.0016556   -.0000192
       _cons |  -.5526973   .2603784    -2.12   0.034     -1.06303   -.0423651
-------------+----------------------------------------------------------------
select       |
        educ |   .1313415   .0253823     5.17   0.000     .0815931    .1810899
       exper |   .1232818   .0187242     6.58   0.000     .0865831    .1599806
     expersq |  -.0018863   .0006004    -3.14   0.002     -.003063   -.0007095
         age |  -.0528287   .0084792    -6.23   0.000    -.0694476   -.0362098
     kidslt6 |  -.8673988   .1186509    -7.31   0.000     -1.09995   -.6348472
     kidsge6 |   .0358723   .0434753     0.83   0.409    -.0493377    .1210824
    nwifeinc |  -.0121321   .0048767    -2.49   0.013    -.0216903    -.002574
       _cons |   .2664491   .5089578     0.52   0.601    -.7310898    1.263988
-------------+----------------------------------------------------------------
     /athrho |    .026614    .147182     0.18   0.857    -.2618573    .3150854
    /lnsigma |  -.4103809   .0342291   -11.99   0.000    -.4774687   -.3432931
-------------+----------------------------------------------------------------
         rho |   .0266078   .1470778                     -.2560319    .3050564
       sigma |   .6633975   .0227075                      .6203517    .7094303
      lambda |   .0176515   .0976057                     -.1736521    .2089552
------------------------------------------------------------------------------
LR test of indep. eqns. (rho = 0):   chi2(1) =     0.03   Prob > chi2 = 0.8577

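This yields results very close to those of the two-step Heckman procedure, and the LR test of rho = 0 is not significant (chi2(1) = 0.03, p-value = 0.8577), so independence of the selection and wage equations is not rejected.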
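Stata estimates the ancillary parameters /athrho and /lnsigma and then transforms them: rho = tanh(athrho), sigma = exp(lnsigma) and lambda = rho * sigma. A short Python sketch of these transformations, using the reported point estimates (variable names are illustrative):

```python
import math

# Ancillary parameter estimates from the MLE output.
athrho = 0.026614
lnsigma = -0.4103809

# Back-transform to the parameters of interest.
rho = math.tanh(athrho)    # correlation between the two error terms
sigma = math.exp(lnsigma)  # standard deviation of the wage-equation error
lam = rho * sigma          # coefficient on the inverse Mills ratio

print(f"rho = {rho:.7f}, sigma = {sigma:.7f}, lambda = {lam:.7f}")
```

Up to rounding of the inputs, these reproduce the rho, sigma and lambda rows of the output.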

References

Carrasco, R. (2001), “Binary Choice with Binary Endogenous Regressors in Panel Data: Estimating the Effect of Fertility on Female Labor Participation,” Journal of Business & Economic Statistics, 19: 385-394.

Dhillon, U. S., J. D. Shilling and C. F. Sirmans (1987), “Choosing Between Fixed and Adjustable Rate Mortgages,” Journal of Money, Credit and Banking, 19: 260-267.

Heckman, J. (1976), “The Common Structure of Statistical Models of Truncation, Sample Selection, and Limited Dependent Variables and a Simple Estimator for Such Models,” Annals of Economic and Social Measurement, 5: 475-492.

Mullahy, J. and J. Sindelar (1996), “Employment, Unemployment, and Problem Drinking,” Journal of Health Economics, 15: 409-434.

Terza, J. (2002), “Alcohol Abuse and Employment: A Second Look,” Journal of Applied Econometrics, 17: 393-404.

Wooldridge, J. M. (2009), Introductory Econometrics: A Modern Approach (South-Western: Ohio).

