# Regression Diagnostics and Specification Tests

8.1 Since $H = P_X$ is idempotent, it is positive semi-definite, so $b'Hb \ge 0$ for any arbitrary vector $b$. In particular, for $b' = (1, 0, \ldots, 0)$ we get $h_{11} \ge 0$. Also, $H^2 = H$. Hence,

$$h_{11} = \sum_{j=1}^{n} h_{1j}^2 \ge h_{11}^2 \ge 0.$$

From this inequality, we deduce that $h_{11}^2 - h_{11} \le 0$, or that $h_{11}(h_{11} - 1) \le 0$. But $h_{11} \ge 0$, hence $0 \le h_{11} \le 1$. There is nothing particular about our choice of $h_{11}$; the same proof holds for $h_{22}$, $h_{33}$, or in general $h_{ii}$. Hence, $0 \le h_{ii} \le 1$ for $i = 1, 2, \ldots, n$.
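The argument above can be checked numerically. The following is a sketch in Python on arbitrary simulated data (not part of the original solution): it verifies that $H$ is idempotent and that its diagonal elements lie in $[0,1]$.

```python
import numpy as np

# Arbitrary illustrative design matrix with an intercept column.
rng = np.random.default_rng(0)
n, k = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

# Hat matrix H = X (X'X)^{-1} X' and its diagonal (the leverages h_ii).
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

assert np.allclose(H @ H, H)                  # idempotency: H^2 = H
assert np.all(h >= 0) and np.all(h <= 1)      # 0 <= h_ii <= 1
# h_ii equals the sum of squared elements of its own row, as in the proof.
assert np.allclose(h, (H ** 2).sum(axis=1))
print(h.min(), h.max())
```

The leverages also sum to $\mathrm{tr}(H) = k$, which is a useful sanity check on any hat-matrix computation.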

8.2 A Simple Regression With No Intercept. Consider $y_i = x_i\beta + u_i$ for $i = 1, 2, \ldots, n$.

a. $H = P_x = x(x'x)^{-1}x' = xx'/x'x$ since $x'x$ is a scalar. Therefore, $h_{ii} = x_i^2 \big/ \sum_{i=1}^{n} x_i^2$ for $i = 1, 2, \ldots, n$. Note that the $x_i$'s are not in deviation form as in the case of a simple regression with an intercept. In this case, $\mathrm{tr}(H) = \mathrm{tr}(P_x) = \mathrm{tr}(xx')/x'x = \mathrm{tr}(x'x)/x'x = x'x/x'x = 1$. Hence, $\sum_{i=1}^{n} h_{ii} = 1$.

b. Note that
$$1 - h_{ii} = \left(\sum_{j=1}^{n} x_j^2 - x_i^2\right)\Big/\sum_{j=1}^{n} x_j^2 = \sum_{j \ne i} x_j^2 \Big/ \sum_{j=1}^{n} x_j^2.$$
From (8.18), with $k = 1$,
$$(n-2)s_{(i)}^2 = (n-1)s^2 - \frac{e_i^2}{1-h_{ii}} = (n-1)s^2 - e_i^2\left(\sum_{j=1}^{n} x_j^2 \Big/ \sum_{j \ne i} x_j^2\right).$$
From (8.19),
$$\mathrm{DFBETAS}_i = \frac{\hat{\beta} - \hat{\beta}_{(i)}}{s_{(i)}\sqrt{(x'x)^{-1}}} = \frac{x_i e_i}{(1-h_{ii})\sum_{j=1}^{n} x_j^2} \cdot \frac{\left(\sum_{j=1}^{n} x_j^2\right)^{1/2}}{s_{(i)}} = \frac{x_i e_i}{(1-h_{ii})\, s_{(i)}\left(\sum_{j=1}^{n} x_j^2\right)^{1/2}}.$$

c. From (8.21), $\mathrm{DFFIT}_i = \hat{y}_i - \hat{y}_{i(i)} = x_i'\left[\hat{\beta} - \hat{\beta}_{(i)}\right] = \dfrac{h_{ii}e_i}{1-h_{ii}} = \dfrac{x_i^2 e_i}{\sum_{j \ne i} x_j^2}.$

B. H. Baltagi, Solutions Manual for Econometrics, Springer Texts in Business and Economics, DOI 10.1007/978-3-642-54548-1_8, © Springer-Verlag Berlin Heidelberg 2015

$\det[X'X] = x'x = \sum_{i=1}^{n} x_i^2$ and $(1 - h_{ii}) = 1 - \left(x_i^2 \big/ \sum_{i=1}^{n} x_i^2\right) = \sum_{j \ne i} x_j^2 \big/ \sum_{i=1}^{n} x_i^2$. The last term is the ratio of the two determinants $\det\left[X_{(i)}'X_{(i)}\right]$ and $\det[X'X]$. Rearranging terms, one can easily verify (8.27) from (8.26).
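The leverage algebra of this problem can be checked numerically. A sketch in Python, with arbitrary illustrative $x$ values (not the chapter's data):

```python
import numpy as np

# Simple regression with no intercept: h_ii = x_i^2 / sum_j x_j^2,
# and the leverages sum to tr(H) = 1 since k = 1.
x = np.array([1.2, -0.7, 2.5, 0.4, -1.9, 3.1])  # arbitrary values
h = x**2 / np.sum(x**2)

H = np.outer(x, x) / (x @ x)                    # H = x x' / x'x
assert np.allclose(np.diag(H), h)
assert np.isclose(h.sum(), 1.0)                 # tr(H) = 1
print(h)
```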

8.3 From (8.17), $(n-k-1)s_{(i)}^2 = \sum_{t \ne i}\left(y_t - x_t'\hat{\beta}_{(i)}\right)^2$. Substituting (8.13), one gets
$$(n-k-1)s_{(i)}^2 = \sum_{t \ne i}\left(e_t + \frac{h_{it}e_i}{1-h_{ii}}\right)^2,$$
where $h_{it} = x_t'(X'X)^{-1}x_i$. Adding and subtracting the $i$-th term of this summation yields
$$(n-k-1)s_{(i)}^2 = \sum_{t=1}^{n}\left(e_t + \frac{h_{it}e_i}{1-h_{ii}}\right)^2 - \left(e_i + \frac{h_{ii}e_i}{1-h_{ii}}\right)^2 = \sum_{t=1}^{n} e_t^2 + \frac{h_{ii}e_i^2}{(1-h_{ii})^2} - \frac{e_i^2}{(1-h_{ii})^2},$$
using the facts that $\sum_{t=1}^{n} h_{it}e_t = 0$ and $\sum_{t=1}^{n} h_{it}^2 = h_{ii}$, which follow from $He = 0$ and $H^2 = H$. Hence, $(n-k-1)s_{(i)}^2 = (n-k)s^2 - e_i^2/(1-h_{ii})$, which verifies (8.18).

8.4 Obtaining $e_i^*$ from an Augmented Regression.

a. In order to get $\hat{\beta}^*$ from the augmented regression given in (8.5), $y = X\beta^* + d_i\varphi + u$, one can premultiply by $\bar{P}_{d_i} = I_n - d_i d_i'$ as described in (8.14) and perform OLS. The Frisch-Waugh-Lovell Theorem guarantees that the resulting estimate of $\beta^*$ is the same as that from (8.5). The effect of $\bar{P}_{d_i}$ is to wipe out the $i$-th observation from the regression, and hence $\hat{\beta}^* = \hat{\beta}_{(i)}$ as required.

b. In order to get $\hat{\varphi}$ from (8.5), one can premultiply by $\bar{P}_X$ and perform OLS on the transformed equation $\bar{P}_X y = \bar{P}_X d_i\varphi + \bar{P}_X u$. The Frisch-Waugh-Lovell Theorem guarantees that the resulting estimate of $\varphi$ is the same as that from (8.5). OLS yields $\hat{\varphi} = \left(d_i'\bar{P}_X d_i\right)^{-1} d_i'\bar{P}_X y$. Using the facts that $e = \bar{P}_X y$, $d_i'e = e_i$, and $d_i'P_X d_i = d_i'H d_i = h_{ii}$ so that $d_i'\bar{P}_X d_i = 1 - h_{ii}$, one gets $\hat{\varphi} = e_i/(1 - h_{ii})$ as required.

c. The Frisch-Waugh-Lovell Theorem also states that the residuals from (8.5) are the same as those obtained from (8.14). This means that the $i$-th observation residual is zero. This also gives us the fact that $\hat{\varphi} = y_i - x_i'\hat{\beta}_{(i)}$, the forecasted residual for the $i$-th observation; see below (8.14). Hence the RSS from (8.5) equals $\sum_{t \ne i}\left(y_t - x_t'\hat{\beta}_{(i)}\right)^2$, since the $i$-th observation contributes a zero residual. As in (8.17) and (8.18) and problem 8.3, one can substitute (8.13) to get $(n-k)s^2 - e_i^2/(1-h_{ii})$.

8.5 a. Consider the augmented regression $y = X\beta^* + \bar{P}_X D_p \varphi^* + u$. Since $X'\bar{P}_X D_p = 0$, using the fact that $\bar{P}_X X = 0$, the two sets of regressors are orthogonal. Hence $\hat{\beta}^* = (X'X)^{-1}X'y = \hat{\beta}_{OLS}$, and

b. $\hat{\varphi}^* = \left(D_p'\bar{P}_X D_p\right)^{-1} D_p'\bar{P}_X y = \left(D_p'\bar{P}_X D_p\right)^{-1} D_p'e = \left(D_p'\bar{P}_X D_p\right)^{-1} e_p$, since $e = \bar{P}_X y$ and $D_p'e = e_p$.

c. The residuals are given by
$$y - X\hat{\beta}_{OLS} - \bar{P}_X D_p \left(D_p'\bar{P}_X D_p\right)^{-1} e_p = e - \bar{P}_X D_p \left(D_p'\bar{P}_X D_p\right)^{-1} e_p,$$
so that the residual sum of squares is
$$e'e + e_p'\left(D_p'\bar{P}_X D_p\right)^{-1} D_p'\bar{P}_X D_p \left(D_p'\bar{P}_X D_p\right)^{-1} e_p - 2 e'\bar{P}_X D_p \left(D_p'\bar{P}_X D_p\right)^{-1} e_p = e'e - e_p'\left(D_p'\bar{P}_X D_p\right)^{-1} e_p,$$
since $\bar{P}_X e = e$ and $e'D_p = e_p'$. From the Frisch-Waugh-Lovell Theorem, (8.6) has the same residual sum of squares as (8.6) premultiplied by $\bar{P}_X$, i.e., $\bar{P}_X y = \bar{P}_X D_p \varphi^* + \bar{P}_X u$. This in turn has the same residual sum of squares as the augmented regression in problem 8.5, since that regression also yields the above regression when premultiplied by $\bar{P}_X$.

d. From (8.7), the denominator residual sum of squares is given in part (c). The numerator residual sum of squares is $e'e$ minus the RSS obtained in part (c). This yields $e_p'\left(D_p'\bar{P}_X D_p\right)^{-1} e_p$ as required.

e. For problem 8.4, consider the augmented regression $y = X\beta^* + \bar{P}_X d_i \varphi + u$. By the Frisch-Waugh-Lovell Theorem, this has the same residual sum of squares as the following regression:
$$\bar{P}_X y = \bar{P}_X d_i \varphi + \bar{P}_X u.$$
This last regression also has the same residual sum of squares as $y = X\beta^* + d_i\varphi + u$. Hence, using the first regression and the results in part (c), we get

Residual Sum of Squares = (Residual Sum of Squares with $d_i$ deleted) $-\; e'd_i\left(d_i'\bar{P}_X d_i\right)^{-1} d_i'e$

when $D_p$ is replaced by $d_i$. But $d_i'e = e_i$ and $d_i'\bar{P}_X d_i = 1 - h_{ii}$, hence this last term is $e_i^2/(1-h_{ii})$, which is exactly what we proved in problem 8.4, part (c). The $F$-statistic for $\varphi = 0$ is exactly the square of the $t$-statistic in problem 8.4, part (d).
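The dummy-variable result can be illustrated numerically. A sketch in Python with arbitrary simulated data (not the chapter's data sets): it checks that adding an observation-specific dummy $d_i$ yields $\hat\varphi = e_i/(1-h_{ii})$ and reproduces the delete-one slope estimates.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, i = 15, 2, 4
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

d = np.zeros(n); d[i] = 1.0                    # dummy for observation i
Z = np.column_stack([X, d])
coef = np.linalg.lstsq(Z, y, rcond=None)[0]
phi_hat = coef[-1]

assert np.isclose(phi_hat, e[i] / (1 - h[i]))  # phi_hat = e_i/(1 - h_ii)

# beta* from the augmented regression equals the delete-one estimator
keep = np.arange(n) != i
beta_del = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
assert np.allclose(coef[:k], beta_del)
```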

8.6 Let $A = X'X$ and $a = b = x_i$. Using the updating formula given in (8.11), we get
$$\left(X_{(i)}'X_{(i)}\right)^{-1} = \left(X'X - x_i x_i'\right)^{-1} = (X'X)^{-1} + \frac{(X'X)^{-1} x_i x_i' (X'X)^{-1}}{1 - h_{ii}},$$
where $h_{ii} = x_i'(X'X)^{-1}x_i$. This verifies (8.12). Note that $X' = [x_1, \ldots, x_n]$, where $x_i$ is $k \times 1$, so that $X'X = \sum_{j=1}^{n} x_j x_j'$ and $X_{(i)}'X_{(i)} = X'X - x_i x_i'$. Also, $X'y = \sum_{j=1}^{n} x_j y_j$ and
$$X_{(i)}'y_{(i)} = \sum_{j \ne i} x_j y_j = X'y - x_i y_i.$$

Therefore, postmultiplying $\left(X_{(i)}'X_{(i)}\right)^{-1}$ by $X_{(i)}'y_{(i)}$, we get
$$\hat{\beta}_{(i)} = \left(X_{(i)}'X_{(i)}\right)^{-1} X_{(i)}'y_{(i)} = \hat{\beta} - (X'X)^{-1}x_i y_i + \frac{(X'X)^{-1}x_i x_i'\hat{\beta}}{1-h_{ii}} - \frac{(X'X)^{-1}x_i x_i'(X'X)^{-1}x_i y_i}{1-h_{ii}},$$
and
$$\hat{\beta} - \hat{\beta}_{(i)} = -\frac{(X'X)^{-1}x_i x_i'\hat{\beta}}{1-h_{ii}} + \left(\frac{1-h_{ii}+h_{ii}}{1-h_{ii}}\right)(X'X)^{-1}x_i y_i = \frac{(X'X)^{-1}x_i\left(y_i - x_i'\hat{\beta}\right)}{1-h_{ii}} = \frac{(X'X)^{-1}x_i e_i}{1-h_{ii}},$$
which verifies (8.13).
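Equation (8.13) is easy to verify numerically. A sketch in Python with arbitrary illustrative data (not part of the original solution): it compares the closed-form coefficient change with an explicit delete-one refit.

```python
import numpy as np

rng = np.random.default_rng(2)
n, i = 12, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
e = y - X @ beta
h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)    # leverages h_ii

keep = np.arange(n) != i
beta_del = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]

# (8.13): beta_hat - beta_hat_(i) = (X'X)^{-1} x_i e_i / (1 - h_ii)
assert np.allclose(beta - beta_del, XtX_inv @ X[i] * e[i] / (1 - h[i]))
```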

8.8 $\det\left[X_{(i)}'X_{(i)}\right] = \det\left[X'X - x_i x_i'\right] = \det\left[\left\{I_k - x_i x_i'(X'X)^{-1}\right\}X'X\right]$. Let $a = x_i$ and $b' = x_i'(X'X)^{-1}$. Using $\det[I_k - ab'] = 1 - b'a$, one gets $\det\left[I_k - x_i x_i'(X'X)^{-1}\right] = 1 - x_i'(X'X)^{-1}x_i = 1 - h_{ii}$. Using $\det(AB) = \det(A)\det(B)$, we get $\det\left[X_{(i)}'X_{(i)}\right] = (1 - h_{ii})\det[X'X]$ as required.
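The determinant identity can be checked on arbitrary data. A sketch in Python (illustrative values, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
n, i = 10, 6
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
XtX = X.T @ X
h_ii = X[i] @ np.linalg.inv(XtX) @ X[i]

# det[X_(i)'X_(i)] = (1 - h_ii) det[X'X]
X_del = np.delete(X, i, axis=0)
lhs = np.linalg.det(X_del.T @ X_del)
rhs = (1 - h_ii) * np.linalg.det(XtX)
assert np.isclose(lhs, rhs)
```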

8.9 The cigarette data example given in Table 3.2.

a. Tables 8.1 and 8.2 were generated by SAS with PROC REG asking for all the diagnostic options allowed by this procedure.

b. For the New Hampshire (NH) observation, number 27 in Table 8.2, the leverage $h_{NH} = 0.13081$, which is larger than $2\bar{h} = 2k/n = 0.13043$. The internally studentized residual $\tilde{e}_{NH}$ is computed from (8.1) as follows:
$$\tilde{e}_{NH} = \frac{e_{NH}}{s\sqrt{1-h_{NH}}} = \frac{0.15991}{0.16343\sqrt{1-0.13081}} \approx 1.049.$$
The externally studentized residual $e_{NH}^*$ is computed from (8.3) as follows:
$$e_{NH}^* = \frac{e_{NH}}{s_{(NH)}\sqrt{1-h_{NH}}} = \frac{0.15991}{0.16323\sqrt{1-0.13081}} \approx 1.051,$$
where $s_{(NH)}^2$ is obtained from (8.2) as follows:
$$s_{(NH)}^2 = \frac{(n-k)s^2 - e_{NH}^2/(1-h_{NH})}{n-k-1} = \frac{43(0.16343)^2 - (0.15991)^2/(1-0.13081)}{42}.$$
Both $\tilde{e}_{NH}$ and $e_{NH}^*$ are less than 2 in absolute value. From (8.13), the change in the regression coefficients due to the omission of the NH observation is given by
$$\hat{\beta} - \hat{\beta}_{(NH)} = (X'X)^{-1}x_{NH}e_{NH}/(1-h_{NH}),$$
where $(X'X)^{-1}$ is given in the empirical example, $x_{NH}' = (1, 0.15852, 5.00319)$, $e_{NH} = 0.15991$ and $h_{NH} = 0.13081$. This gives $\left(\hat{\beta} - \hat{\beta}_{(NH)}\right)' = (-0.3174, -0.0834, 0.0709)$. In order to assess whether this change is large or small, we compute the DFBETAS given in (8.19). For the NH observation, these are given by
$$\mathrm{DFBETAS}_{NH,1} = \frac{\hat{\beta}_1 - \hat{\beta}_{1,(NH)}}{s_{(NH)}\sqrt{(X'X)^{-1}_{11}}} = \frac{-0.3174}{0.163235\sqrt{30.929869}} \approx -0.3496.$$
Similarly, $\mathrm{DFBETAS}_{NH,2} = -0.2573$ and $\mathrm{DFBETAS}_{NH,3} = 0.3608$. These are not larger than 2 in absolute value. However, $\mathrm{DFBETAS}_{NH,1}$ and $\mathrm{DFBETAS}_{NH,3}$ are both larger than $2/\sqrt{n} = 2/\sqrt{46} = 0.2949$ in absolute value.

The change in the fit due to omission of the NH observation is given by (8.21). In fact,
$$\mathrm{DFFIT}_{NH} = \hat{y}_{NH} - \hat{y}_{(NH)} = x_{NH}'\left[\hat{\beta} - \hat{\beta}_{(NH)}\right] = (1,\; 0.15852,\; 5.00319)\begin{pmatrix} -0.3174 \\ -0.0834 \\ \phantom{-}0.0709 \end{pmatrix},$$
or simply
$$\frac{h_{NH}\,e_{NH}}{1-h_{NH}} = \frac{(0.13081)(0.15991)}{1-0.13081} \approx 0.0241.$$
Scaling it by the variance of $\hat{y}_{(NH)}$, we get from (8.22)
$$\mathrm{DFFITS}_{NH} = \sqrt{\frac{h_{NH}}{1-h_{NH}}}\; e_{NH}^* = \sqrt{\frac{0.13081}{1-0.13081}}\,(1.051) \approx 0.408.$$
This is not larger than the size-adjusted cutoff of $2\sqrt{k/n} = 0.511$. Cook's distance measure is given by (8.25), and for NH it can be computed as follows:
$$D_{NH} = \frac{\tilde{e}_{NH}^2\, h_{NH}}{k(1-h_{NH})} = \frac{(1.049)^2(0.13081)}{3(1-0.13081)} \approx 0.0553.$$
COVRATIO omitting the NH observation can be computed from (8.28) as
$$\mathrm{COVRATIO}_{NH} = \frac{\left(s_{(NH)}^2/s^2\right)^k}{1-h_{NH}} = 1.1422,$$
which means that $|\mathrm{COVRATIO}_{NH} - 1| = 0.1422$ is less than $3k/n = 0.1956$. Finally, FVARATIO omitting the NH observation can be computed from (8.29) as
$$\mathrm{FVARATIO}_{NH} = \frac{s_{(NH)}^2}{s^2(1-h_{NH})} \approx 1.148.$$

c. Similarly, the same calculations can be obtained for the observations of the states of AR, CT, NJ and UT.

d. Also, the states of NV, ME, NM and ND.
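These diagnostics are straightforward to compute from OLS quantities alone. The following Python sketch uses simulated data (not the cigarette data) and the cutoffs quoted in the text: $2k/n$ for leverage, $2\sqrt{k/n}$ for DFFITS, and $3k/n$ for $|\mathrm{COVRATIO}-1|$. The variable names are mine, not the text's.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 46, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.2, -0.5]) + rng.normal(scale=0.2, size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
e = y - X @ beta
h = np.einsum('ij,jk,ik->i', X, XtX_inv, X)     # leverages
s2 = e @ e / (n - k)

s2_del = ((n - k) * s2 - e**2 / (1 - h)) / (n - k - 1)   # (8.2)
e_int = e / np.sqrt(s2 * (1 - h))                        # internal, (8.1)
e_ext = e / np.sqrt(s2_del * (1 - h))                    # external, (8.3)
dffits = np.sqrt(h / (1 - h)) * e_ext                    # (8.22)
cooks_d = e_int**2 * h / (k * (1 - h))                   # (8.25)
covratio = (s2_del / s2)**k / (1 - h)                    # (8.28)

print("high leverage:", np.where(h > 2 * k / n)[0])
print("large DFFITS:", np.where(np.abs(dffits) > 2 * np.sqrt(k / n))[0])
print("large |COVRATIO-1|:", np.where(np.abs(covratio - 1) > 3 * k / n)[0])
```

Checking one observation against an explicit delete-one refit confirms that $s_{(i)}^2$ from (8.2) matches the variance estimate from the regression with that observation deleted.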

8.10 The Consumption-Income data given in Table 5.3.

The following Stata output runs the OLS regression of C on Y and generates the influence diagnostics one by one. In addition, it highlights the observations with diagnostics higher than the cutoff values recommended in the text of the chapter.

. reg c y

      Source |       SS           df       MS      Number of obs   =        49
-------------+----------------------------------   F(1, 47)        =   7389.28
       Model |  1.4152e+09         1  1.4152e+09   Prob > F        =    0.0000
    Residual |  9001347.76        47  191518.037   R-squared       =    0.9937
-------------+----------------------------------   Adj R-squared   =    0.9935
       Total |  1.4242e+09        48  29670457.8   Root MSE        =    437.63

------------------------------------------------------------------------------
           c |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           y |    .979228   .0113915    85.96   0.000     .9563111    1.002145
       _cons |  -1343.314   219.5614    -6.12   0.000    -1785.014   -901.6131
------------------------------------------------------------------------------

. predict h, hat

. predict e if e(sample), residual
. predict est1 if e(sample), rstandard
. predict est2 if e(sample), rstudent
. predict dfits, dfits
. predict covr, covratio
. predict cook if e(sample), cooksd
. predict dfbet, dfbeta(y)

. list e est1 est2 cook h dfits covr, divider

             e       est1        est2       cook         h       dfits      covr
  1.  635.4909   1.508036    1.529366   .0892458   .0727745    .4284583  1.019567
  2.  647.5295   1.536112    1.55933    .0917852   .0721806    .4349271  1.015024
  3.  520.9777   1.234602    1.241698   .0575692   .0702329    .3412708  1.051163
  4.  498.7493   1.179571    1.184622   .0495726   .0665165    .3162213  1.053104
  5.  517.4854   1.222239    1.228853   .0510749   .064003     .3213385  1.045562
  6.  351.0732   .8263996    .8235661   .0208956   .0576647    .2037279  1.075873
  7.  321.1447   .7538875    .7503751   .0157461   .0525012    .1766337  1.075311
  8.  321.9283   .7540409    .7505296   .0144151   .0482588    .1690038  1.070507
  9.  139.071    .3251773    .3220618   .0024888   .0449572    .0698759  1.08818
 10.  229.1068   .5347434    .5306408   .0061962   .0415371    .1104667  1.07598
 11.  273.736    .6382437    .6341716   .0083841   .0395361    .1286659  1.068164
 12.  17.04481   .0396845    .0392607   .0000301   .0367645    .0076702  1.083723
 13. -110.857   -.25773     -.2551538   .0011682   .0339782   -.047853   1.077618
 14.  .7574701   .0017584    .0017396   4.95e-08   .0310562    .0003114  1.077411
 15. -311.9394  -.7226375   -.7189135   .0072596   .0270514   -.1198744  1.049266
 16. -289.1532  -.6702327   -.6662558   .006508    .0281591   -.1134104  1.053764
 17. -310.0611  -.7183715   -.7146223   .0072371   .0272825   -.119681   1.049793
 18. -148.776   -.3443774   -.3411247   .0015509   .0254884   -.0551685  1.065856
 19. -85.67493  -.1981783   -.1961407   .0004859   .0241443   -.0308519  1.067993
 20. -176.7102  -.4084195   -.4047702   .0019229   .0225362   -.0614609  1.060452
 21. -205.995   -.4759794   -.4720276   .0025513   .022026    -.0708388  1.057196
 22. -428.808   -.9908097   -.9906127   .0110453   .0220072   -.1485998  1.023316
 23. -637.0542  -1.471591   -1.490597   .0237717   .0214825   -.2208608  .9708202
 24. -768.879   -1.77582    -1.818907   .034097    .0211669   -.267476   .928207
 25. -458.3625  -1.058388   -1.059773   .0118348   .0206929   -.1540507  1.015801
 26. -929.7892  -2.146842   -2.23636    .0484752   .020602    -.3243518  .8671094
 27. -688.1302  -1.589254   -1.616285   .0272016   .0210855   -.2372122  .9548985
 28. -579.1982  -1.338157   -1.349808   .019947    .0217933   -.2014737  .9874383
 29. -317.75    -.734245    -.7305942   .0061013   .0221336   -.1099165  1.043229
 30. -411.8743  -.9525952   -.9516382   .0111001   .0238806   -.1488479  1.028592
 31. -439.981   -1.01826    -1.018669   .0133716   .0251442   -.1635992  1.02415
 32. -428.6367  -.9923085   -.9921432   .0130068   .0257385   -.1612607  1.027102
 33. -479.2094  -1.109026   -1.111808   .0158363   .0251048   -.1784142  1.015522
 34. -549.0905  -1.271857   -1.280483   .0222745   .0268017   -.2124977  1.000132
 35. -110.233   -.2553027   -.2527474   .0008897   .0265748   -.041761   1.069479
 36.  66.3933    .1538773    .1522699   .0003404   .0279479    .0258193  1.072884
 37.  32.47656   .0753326    .0745313   .0000865   .0295681    .0130097  1.075499
 38.  100.64     .2336927    .2313277   .0008919   .031631     .0418084  1.075547
 39.  122.4207   .2847169    .2819149   .001456    .0346758    .0534312  1.077724
 40. -103.4364  -.2414919   -.2390574   .0012808   .0420747   -.0501011  1.087101
      year       c        est2       dfbet          h       dfits
  1.  1959    8776    1.529366   -.3634503   .0727745    .4284583
  2.  1960    8837    1.55933    -.3683456   .0721806    .4349271
 47.  2005   26290    1.955164    .4722213   .0744023    .5543262
 48.  2006   26835    1.611164    .4214261   .0831371    .4851599
 49.  2007   27319    1.562965    .4323753   .0900456    .4916666

. list year c est2 dfbet h covr if abs(covr-1) > (3*(2/49))

      year       c        est2       dfbet          h        covr
 26.  1984   16343   -2.23636   -.0314578    .020602    .8671094

The Gasoline data used in Chap. 10. The following SAS output gives the diagnostics for two countries: Austria and Belgium.

a. AUSTRIA:

       Dep Var   Predict   Std Err   Lower95%   Upper95%   Lower95%   Upper95%
  Obs        Y     Value   Predict       Mean       Mean    Predict    Predict
    1   4.1732    4.1443     0.026     4.0891     4.1996     4.0442     4.2445
    2   4.1010    4.1121     0.021     4.0680     4.1563     4.0176     4.2066
    3   4.0732    4.0700     0.016     4.0359     4.1041     3.9798     4.1602
    4   4.0595    4.0661     0.014     4.0362     4.0960     3.9774     4.1548
    5   4.0377    4.0728     0.012     4.0469     4.0987     3.9854     4.1603
    6   4.0340    4.0756     0.013     4.0480     4.1032     3.9877     4.1636
    7   4.0475    4.0296     0.014     3.9991     4.0601     3.9407     4.1185
    8   4.0529    4.0298     0.016     3.9947     4.0650     3.9392     4.1205
    9   4.0455    4.0191     0.019     3.9794     4.0587     3.9266     4.1115
   10   4.0464    4.0583     0.015     4.0267     4.0898     3.969      4.1475
   11   4.0809    4.1108     0.016     4.0767     4.145      4.0206     4.2011
   12   4.1067    4.1378     0.022     4.0909     4.1847     4.042      4.2336
   13   4.128     4.0885     0.015     4.0561     4.121      3.9989     4.1782
   14   4.1994    4.1258     0.022     4.0796     4.1721     4.0303     4.2214
   15   4.0185    4.027      0.017     3.9907     4.0632     3.9359     4.118
   16   4.029     3.9799     0.016     3.9449     4.0149     3.8893     4.0704
   17   3.9854    4.0287     0.017     3.9931     4.0642     3.9379     4.1195
   18   3.9317    3.9125     0.024     3.8606     3.9643     3.8141     4.0108
   19   3.9227    3.9845     0.019     3.9439     4.025      3.8916     4.0773

                          X3
  Obs   Rstudent     Dfbetas
    1     0.9821     -0.5587
    2    -0.3246      0.1009
    3     0.0853      0.0008
    4    -0.1746     -0.0118
    5    -0.9387     -0.0387
    6    -1.1370     -0.1219
    7     0.4787      0.1286
    8     0.6360      0.2176
    9     0.7554      0.3220
   10    -0.3178     -0.0607
   11    -0.8287      0.1027
   12    -0.9541      0.2111
   13     1.1008     -0.1143
   14     2.6772     -1.2772
   15    -0.2318      0.0407
   16     1.4281      0.1521
   17    -1.2415      0.1923
   18     0.6118      0.2859
   19    -1.9663      0.0226

b. BELGIUM:

       Dep Var   Predict   Std Err   Lower95%   Upper95%   Lower95%   Upper95%
  Obs        Y     Value   Predict       Mean       Mean    Predict    Predict
    1   4.1640    4.1311     0.019     4.0907     4.1715     4.0477     4.2144
    2   4.1244    4.0947     0.017     4.0593     4.1301     4.0137     4.1757
    3   4.0760    4.0794     0.015     4.0471     4.1117     3.9997     4.1591
    4   4.0013    4.0412     0.013     4.0136     4.0688     3.9632     4.1192
    5   3.9944    4.0172     0.012     3.9924     4.0420     3.9402     4.0942
    6   3.9515    3.9485     0.015     3.9156     3.9814     3.8685     4.0285
    7   3.8205    3.8823     0.017     3.8458     3.9189     3.8008     3.9639
    8   3.9069    3.9045     0.012     3.8782     3.9309     3.8270     3.9820
    9   3.8287    3.8242     0.019     3.7842     3.8643     3.7410     3.9074
   10   3.8546    3.8457     0.014     3.8166     3.8748     3.7672     3.9242
   11   3.8704    3.8516     0.011     3.8273     3.8759     3.7747     3.9284
   12   3.8722    3.8537     0.011     3.8297     3.8776     3.7769     3.9304
   13   3.9054    3.8614     0.014     3.8305     3.8922     3.7822     3.9405
   14   3.8960    3.8874     0.013     3.8606     3.9142     3.8097     3.9651
   15   3.8182    3.8941     0.016     3.8592     3.9289     3.8133     3.9749
   16   3.8778    3.8472     0.013     3.8185     3.8760     3.7689     3.9256
   17   3.8641    3.8649     0.016     3.8310     3.8988     3.7845     3.9453
   18   3.8543    3.8452     0.014     3.8163     3.8742     3.7668     3.9237
   19   3.8427    3.8492     0.028     3.7897     3.9086     3.7551     3.9433

                     Std Err    Student   Cook's
  Obs   Residual   Residual   Residual        D   Rstudent
    1     0.0329      0.028      1.157    0.148     1.1715
    2     0.0297      0.030      0.992    0.076     0.9911
    3   -0.00343      0.031     -0.112    0.001    -0.1080
    4    -0.0399      0.032     -1.261    0.067    -1.2890
    5    -0.0228      0.032     -0.710    0.016    -0.6973
    6    0.00302      0.031      0.099    0.001     0.0956
    7    -0.0618      0.030     -2.088    0.366    -2.3952
    8    0.00236      0.032      0.074    0.000     0.0715
    9    0.00445      0.029      0.156    0.003     0.1506
   10    0.00890      0.031      0.284    0.004     0.2748
   11     0.0188      0.032      0.583    0.011     0.5700
   12     0.0186      0.032      0.575    0.010     0.5620
   13     0.0440      0.031      1.421    0.110     1.4762
   14    0.00862      0.032      0.271    0.003     0.2624
   15    -0.0759      0.030     -2.525    0.472    -3.2161
   16     0.0305      0.031      0.972    0.043     0.9698
   17   -0.00074      0.030     -0.025    0.000    -0.0237
   18    0.00907      0.031      0.289    0.004     0.2798
   19   -0.00645      0.020     -0.326    0.053    -0.3160
        Hat Diag      Cov             INTERCEP        X1        X2        X3
  Obs          H    Ratio    Dffits    Dfbetas   Dfbetas   Dfbetas   Dfbetas
    1     0.3070   1.3082    0.7797    -0.0068    0.3518    0.0665   -0.4465
    2     0.2352   1.3138    0.5497     0.0273    0.2039    0.1169   -0.2469
    3     0.1959   1.6334   -0.0533     0.0091   -0.0199    0.0058    0.0296
    4     0.1433   0.9822   -0.5272     0.1909   -0.0889    0.1344    0.2118
    5     0.1158   1.3002   -0.2524     0.0948   -0.0430    0.0826    0.1029
    6     0.2039   1.6509    0.0484    -0.0411   -0.0219   -0.0356    0.0061
    7     0.2516   0.4458   -1.3888     0.0166    0.9139   -0.6530   -1.0734
    8     0.1307   1.5138    0.0277    -0.0028   -0.0169    0.0093    0.0186
    9     0.3019   1.8754    0.0990    -0.0335   -0.0873    0.0070    0.0884
   10     0.1596   1.5347    0.1198    -0.0551   -0.0959   -0.0243    0.0886
   11     0.1110   1.3524    0.2014    -0.0807   -0.1254   -0.0633    0.1126
   12     0.1080   1.3513    0.1956    -0.0788   -0.0905   -0.0902    0.0721
   13     0.1792   0.9001    0.6897     0.5169    0.0787    0.5245    0.1559
   14     0.1353   1.4943    0.1038     0.0707    0.0596    0.0313   -0.0366
   15     0.2284   0.1868   -1.7498    -1.4346   -1.1919   -0.7987    0.7277
   16     0.1552   1.2027    0.4157     0.3104    0.1147    0.2246    0.0158
   17     0.2158   1.6802   -0.0124    -0.0101   -0.0071   -0.0057    0.0035
   18     0.1573   1.5292    0.1209     0.0564    0.0499    0.0005   -0.0308
   19     0.6649   3.8223   -0.4451     0.1947   -0.0442    0.3807    0.1410

Sum of Residuals 0

Sum of Squared Residuals 0.0175

Predicted Resid SS (Press) 0.0289

SAS PROGRAM

Data GASOLINE;

Input COUNTRY $ YEAR Y X1 X2 X3;
CARDS;

DATA AUSTRIA; SET GASOLINE; IF COUNTRY='AUSTRIA';

Proc reg data=AUSTRIA;

Model Y=X1 X2 X3 / influence p r cli clm;

RUN;

DATA BELGIUM; SET GASOLINE; IF COUNTRY='BELGIUM';

Proc reg data=BELGIUM;

Model Y=X1 X2 X3 / influence p r cli clm;

RUN;

8.13 Independence of Recursive Residuals. This is based on Johnston (1984, p. 386).

a. Using the updating formula given in (8.11) with $A = (X_t'X_t)$ and $a = -b = x_{t+1}$, we get
$$\left(X_t'X_t + x_{t+1}x_{t+1}'\right)^{-1} = \left(X_t'X_t\right)^{-1} - \frac{\left(X_t'X_t\right)^{-1}x_{t+1}x_{t+1}'\left(X_t'X_t\right)^{-1}}{1 + x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}},$$
which is exactly (8.31).

b. $\hat{\beta}_{t+1} = \left(X_{t+1}'X_{t+1}\right)^{-1}X_{t+1}'Y_{t+1}$. Replacing $X_{t+1}'$ by $[X_t', x_{t+1}]$ and $Y_{t+1}$ by $\begin{pmatrix} Y_t \\ y_{t+1} \end{pmatrix}$ yields
$$\hat{\beta}_{t+1} = \left(X_{t+1}'X_{t+1}\right)^{-1}\left(X_t'Y_t + x_{t+1}y_{t+1}\right).$$
Replacing $\left(X_{t+1}'X_{t+1}\right)^{-1}$ by its expression in (8.31), we get
$$\hat{\beta}_{t+1} = \hat{\beta}_t + \left(X_t'X_t\right)^{-1}x_{t+1}y_{t+1} - \frac{\left(X_t'X_t\right)^{-1}x_{t+1}x_{t+1}'\hat{\beta}_t}{f_{t+1}} - \frac{\left(X_t'X_t\right)^{-1}x_{t+1}x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}y_{t+1}}{f_{t+1}},$$
where $f_{t+1} = 1 + x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}$. Using the fact that $x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1} = f_{t+1} - 1$, the two terms in $y_{t+1}$ combine:
$$\left(X_t'X_t\right)^{-1}x_{t+1}y_{t+1}\left(1 - \frac{f_{t+1}-1}{f_{t+1}}\right) = \frac{\left(X_t'X_t\right)^{-1}x_{t+1}y_{t+1}}{f_{t+1}}.$$
Hence,
$$\hat{\beta}_{t+1} = \hat{\beta}_t + \left(X_t'X_t\right)^{-1}x_{t+1}\left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right)/f_{t+1},$$
which is (8.32).

c. Using (8.30), $w_{t+1} = \left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right)/\sqrt{f_{t+1}}$, where $f_{t+1} = 1 + x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}$. Defining $v_{t+1} = \sqrt{f_{t+1}}\,w_{t+1}$, we get $v_{t+1} = y_{t+1} - x_{t+1}'\hat{\beta}_t = x_{t+1}'\left(\beta - \hat{\beta}_t\right) + u_{t+1}$ for $t = k, \ldots, T-1$. Since $u_t \sim \mathrm{IIN}(0, \sigma^2)$, $w_{t+1}$ has zero mean and $\mathrm{var}(w_{t+1}) = \sigma^2$. Furthermore, the $w_{t+1}$ are linear in the $y$'s; therefore, they are themselves normally distributed. Given normality of the $w_{t+1}$'s, it is sufficient to show that $\mathrm{cov}(w_{t+1}, w_{s+1}) = 0$ for $t \ne s$; $t, s = k, \ldots, T-1$. But $f_{t+1}$ is fixed; therefore, it suffices to show that $\mathrm{cov}(v_{t+1}, v_{s+1}) = 0$ for $t \ne s$.

Using the fact that $\hat{\beta}_t = \left(X_t'X_t\right)^{-1}X_t'Y_t = \beta + \left(X_t'X_t\right)^{-1}X_t'u_t$, where $u_t' = (u_1, \ldots, u_t)$, we get $v_{t+1} = u_{t+1} - x_{t+1}'\left(X_t'X_t\right)^{-1}X_t'u_t$. Therefore,
$$E(v_{t+1}v_{s+1}) = E\left\{\left[u_{t+1} - x_{t+1}'\left(X_t'X_t\right)^{-1}X_t'u_t\right]\left[u_{s+1} - x_{s+1}'\left(X_s'X_s\right)^{-1}X_s'u_s\right]\right\}$$
$$= E(u_{t+1}u_{s+1}) - x_{t+1}'\left(X_t'X_t\right)^{-1}X_t'E(u_t u_{s+1}) - E(u_{t+1}u_s')X_s\left(X_s'X_s\right)^{-1}x_{s+1} + x_{t+1}'\left(X_t'X_t\right)^{-1}X_t'E(u_t u_s')X_s\left(X_s'X_s\right)^{-1}x_{s+1}.$$
Assuming, without loss of generality, that $t < s$, we get $E(u_{t+1}u_{s+1}) = 0$ and $E(u_t u_{s+1}) = 0$ since $t < s < s+1$, while $E(u_{t+1}u_s') = (0, \ldots, \sigma^2, \ldots, 0)$ with the $\sigma^2$ in the $(t+1)$-th position, and $E(u_t u_s') = \sigma^2[I_t, 0]$. Substituting these covariances in $E(v_{t+1}v_{s+1})$ yields zero for the first two terms, and what remains is
$$E(v_{t+1}v_{s+1}) = -\sigma^2 x_{t+1}'\left(X_s'X_s\right)^{-1}x_{s+1} + \sigma^2 x_{t+1}'\left(X_t'X_t\right)^{-1}X_t'[I_t, 0]X_s\left(X_s'X_s\right)^{-1}x_{s+1}.$$
But $[I_t, 0]X_s = X_t$ for $t < s$. Hence,
$$E(v_{t+1}v_{s+1}) = -\sigma^2 x_{t+1}'\left(X_s'X_s\right)^{-1}x_{s+1} + \sigma^2 x_{t+1}'\left(X_s'X_s\right)^{-1}x_{s+1} = 0 \quad \text{for } t \ne s.$$

A simpler proof using the C matrix defined in (8.34) is given in problem 8.14.

8.14 Recursive Residuals are Linear Unbiased With Scalar Covariance Matrix (LUS).

a. The first element of $w = Cy$, where $C$ is defined in (8.34), is
$$w_{k+1} = -x_{k+1}'\left(X_k'X_k\right)^{-1}X_k'Y_k/\sqrt{f_{k+1}} + y_{k+1}/\sqrt{f_{k+1}},$$
where $Y_k' = (y_1, \ldots, y_k)$. Hence, using the fact that $\hat{\beta}_k = \left(X_k'X_k\right)^{-1}X_k'Y_k$, we get $w_{k+1} = \left(y_{k+1} - x_{k+1}'\hat{\beta}_k\right)/\sqrt{f_{k+1}}$, which is (8.30) for $t = k$. Similarly, the $t$-th element of $w = Cy$ from (8.34) is
$$w_t = -x_t'\left(X_{t-1}'X_{t-1}\right)^{-1}X_{t-1}'Y_{t-1}/\sqrt{f_t} + y_t/\sqrt{f_t},$$
where $Y_{t-1}' = (y_1, \ldots, y_{t-1})$. Hence, using the fact that $\hat{\beta}_{t-1} = \left(X_{t-1}'X_{t-1}\right)^{-1}X_{t-1}'Y_{t-1}$, we get $w_t = \left(y_t - x_t'\hat{\beta}_{t-1}\right)/\sqrt{f_t}$, which is (8.30) with $t+1$ replaced by $t$. Also, the last element can be expressed in the same way:
$$w_T = -x_T'\left(X_{T-1}'X_{T-1}\right)^{-1}X_{T-1}'Y_{T-1}/\sqrt{f_T} + y_T/\sqrt{f_T} = \left(y_T - x_T'\hat{\beta}_{T-1}\right)/\sqrt{f_T},$$
which is (8.30) for $t = T-1$.

b. The first row of $CX$ is obtained by multiplying the first row of $C$ by
$$X = \begin{bmatrix} X_k \\ x_{k+1}' \\ \vdots \\ x_T' \end{bmatrix}.$$
This yields $-x_{k+1}'\left(X_k'X_k\right)^{-1}X_k'X_k/\sqrt{f_{k+1}} + x_{k+1}'/\sqrt{f_{k+1}} = 0$. Similarly, the $t$-th row of $CX$ is obtained by multiplying the $t$-th row of $C$ by $X$, written as $X' = \left[X_{t-1}', x_t, \ldots, x_T\right]$. This yields $-x_t'\left(X_{t-1}'X_{t-1}\right)^{-1}X_{t-1}'X_{t-1}/\sqrt{f_t} + x_t'/\sqrt{f_t} = 0$. Also, the last row of $CX$ is obtained by multiplying the $T$-th row of $C$ by $X$, written as $X' = \left[X_{T-1}', x_T\right]$. This yields $-x_T'\left(X_{T-1}'X_{T-1}\right)^{-1}X_{T-1}'X_{T-1}/\sqrt{f_T} + x_T'/\sqrt{f_T} = 0$.

This proves that $CX = 0$, which means that $w = Cy = CX\beta + Cu = Cu$ since $CX = 0$. Hence, $E(w) = CE(u) = 0$.

The first diagonal element of $CC'$ is
$$\frac{x_{k+1}'\left(X_k'X_k\right)^{-1}X_k'X_k\left(X_k'X_k\right)^{-1}x_{k+1}}{f_{k+1}} + \frac{1}{f_{k+1}} = \frac{1 + x_{k+1}'\left(X_k'X_k\right)^{-1}x_{k+1}}{f_{k+1}} = 1.$$
Similarly, the $t$-th diagonal element of $CC'$ is
$$\frac{x_t'\left(X_{t-1}'X_{t-1}\right)^{-1}X_{t-1}'X_{t-1}\left(X_{t-1}'X_{t-1}\right)^{-1}x_t}{f_t} + \frac{1}{f_t} = \frac{f_t}{f_t} = 1,$$
and the $T$-th diagonal element of $CC'$ is
$$\frac{x_T'\left(X_{T-1}'X_{T-1}\right)^{-1}X_{T-1}'X_{T-1}\left(X_{T-1}'X_{T-1}\right)^{-1}x_T}{f_T} + \frac{1}{f_T} = \frac{f_T}{f_T} = 1.$$

Using the fact that
$$X_{t-1} = \begin{bmatrix} X_k \\ x_{k+1}' \\ \vdots \\ x_{t-1}' \end{bmatrix},$$
one gets the $(1, t)$ element of $CC'$ by multiplying the first row of $C$ by the $t$-th column of $C'$. This yields
$$\frac{x_{k+1}'\left(X_k'X_k\right)^{-1}X_k'X_k\left(X_{t-1}'X_{t-1}\right)^{-1}x_t}{\sqrt{f_{k+1}}\sqrt{f_t}} - \frac{x_{k+1}'\left(X_{t-1}'X_{t-1}\right)^{-1}x_t}{\sqrt{f_{k+1}}\sqrt{f_t}} = 0.$$
Similarly, the $(1, T)$ element of $CC'$ is obtained by multiplying the first row of $C$ by the $T$-th column of $C'$. This yields
$$\frac{x_{k+1}'\left(X_k'X_k\right)^{-1}X_k'X_k\left(X_{T-1}'X_{T-1}\right)^{-1}x_T}{\sqrt{f_{k+1}}\sqrt{f_T}} - \frac{x_{k+1}'\left(X_{T-1}'X_{T-1}\right)^{-1}x_T}{\sqrt{f_{k+1}}\sqrt{f_T}} = 0.$$

Similarly, one can show that the $(t,1)$, $(t,T)$, $(T,1)$ and $(T,t)$ elements of $CC'$ are all zero. This proves that $CC' = I_{T-k}$. Hence, $w$ is linear in $y$ and unbiased with mean zero, and $\mathrm{var}(w) = \mathrm{var}(Cu) = CE(uu')C' = \sigma^2 CC' = \sigma^2 I_{T-k}$, i.e., the variance-covariance matrix is a scalar times an identity matrix. Using Theil's (1971) terminology, these recursive residuals are LUS: linear unbiased with a scalar covariance matrix. A further result holds for all LUS residuals: in fact, $C'C = \bar{P}_X$; see Theil (1971, p. 208). Hence,
$$\sum_{t=k+1}^{T} w_t^2 = w'w = y'C'Cy = y'\bar{P}_X y = \sum_{t=1}^{T} e_t^2.$$
Therefore, the sum of squares of the $(T-k)$ recursive residuals is equal to the sum of squares of the $T$ least squares residuals.

c. If $u \sim N(0, \sigma^2 I_T)$, then $w = Cu$ is also normal since it is a linear function of normal random variables. From part (b), $w$ has mean zero and variance $\sigma^2 I_{T-k}$. Hence, $w \sim N(0, \sigma^2 I_{T-k})$ as required.

d. Let us express the $(t+1)$ residuals as follows:
$$Y_{t+1} - X_{t+1}\hat{\beta}_{t+1} = Y_{t+1} - X_{t+1}\hat{\beta}_t - X_{t+1}\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right),$$
so that
$$\mathrm{RSS}_{t+1} = \left(Y_{t+1} - X_{t+1}\hat{\beta}_{t+1}\right)'\left(Y_{t+1} - X_{t+1}\hat{\beta}_{t+1}\right)$$
$$= \left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right)'\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right) + \left(\hat{\beta}_{t+1} - \hat{\beta}_t\right)'X_{t+1}'X_{t+1}\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right) - 2\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right)'X_{t+1}'\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right).$$
The first term gives
$$\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right)'\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right) = \mathrm{RSS}_t + \left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right)^2 = \mathrm{RSS}_t + f_{t+1}w_{t+1}^2,$$
where $f_{t+1} = 1 + x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}$ is defined in (8.32) and $w_{t+1}$ is the recursive residual defined in (8.30). Next, we substitute for $\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right)$ from (8.32) into the second term of $\mathrm{RSS}_{t+1}$ to get
$$\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right)'X_{t+1}'X_{t+1}\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right) = \left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right)x_{t+1}'\left(X_t'X_t\right)^{-1}\left(X_{t+1}'X_{t+1}\right)\left(X_t'X_t\right)^{-1}x_{t+1}\left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right)\big/f_{t+1}^2.$$
Postmultiplying (8.31) by $x_{t+1}$, one gets
$$\left(X_{t+1}'X_{t+1}\right)^{-1}x_{t+1} = \left(X_t'X_t\right)^{-1}x_{t+1} - \frac{\left(X_t'X_t\right)^{-1}x_{t+1}\,c}{1+c},$$
where $c = x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}$. Collecting terms, one gets
$$\left(X_{t+1}'X_{t+1}\right)^{-1}x_{t+1} = \left(X_t'X_t\right)^{-1}x_{t+1}/f_{t+1},$$
where $f_{t+1} = 1 + c$. Substituting this above, the second term of $\mathrm{RSS}_{t+1}$ reduces to
$$x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}\,w_{t+1}^2.$$
For the third term of $\mathrm{RSS}_{t+1}$, we observe that
$$X_{t+1}'\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right) = [X_t', x_{t+1}]\begin{pmatrix} Y_t - X_t\hat{\beta}_t \\ y_{t+1} - x_{t+1}'\hat{\beta}_t \end{pmatrix} = X_t'\left(Y_t - X_t\hat{\beta}_t\right) + x_{t+1}\left(y_{t+1} - x_{t+1}'\hat{\beta}_t\right).$$
The first term is zero since $X_t$ and its least squares residuals are orthogonal. Hence, $X_{t+1}'\left(Y_{t+1} - X_{t+1}\hat{\beta}_t\right) = x_{t+1}w_{t+1}\sqrt{f_{t+1}}$, and the third term of $\mathrm{RSS}_{t+1}$ becomes $-2\left(\hat{\beta}_{t+1} - \hat{\beta}_t\right)'x_{t+1}w_{t+1}\sqrt{f_{t+1}}$; using (8.32), this reduces to $-2x_{t+1}'\left(X_t'X_t\right)^{-1}x_{t+1}w_{t+1}^2$. Adding all three terms yields
$$\mathrm{RSS}_{t+1} = \mathrm{RSS}_t + f_{t+1}w_{t+1}^2 + cw_{t+1}^2 - 2cw_{t+1}^2 = \mathrm{RSS}_t + f_{t+1}w_{t+1}^2 - (f_{t+1}-1)w_{t+1}^2 = \mathrm{RSS}_t + w_{t+1}^2,$$
as required.
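Both the LUS property and the RSS recursion can be verified numerically. A Python sketch on arbitrary simulated data (not part of the original solution): it builds the recursive residuals from (8.30) and (8.32) and checks $\mathrm{RSS}_{t+1} = \mathrm{RSS}_t + w_{t+1}^2$, hence $\sum_t w_t^2 = e'e$.

```python
import numpy as np

rng = np.random.default_rng(5)
T, k = 25, 3
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])
y = X @ np.array([1.0, -0.3, 0.6]) + rng.normal(size=T)

# Recursive residuals w_{t+1} = (y_{t+1} - x_{t+1}' beta_t) / sqrt(f_{t+1})
w = []
for t in range(k, T):
    Xt, yt = X[:t], y[:t]
    beta_t = np.linalg.lstsq(Xt, yt, rcond=None)[0]
    f = 1 + X[t] @ np.linalg.inv(Xt.T @ Xt) @ X[t]
    w.append((y[t] - X[t] @ beta_t) / np.sqrt(f))
w = np.array(w)

def rss(t):
    """Residual sum of squares using the first t observations."""
    b = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]
    r = y[:t] - X[:t] @ b
    return r @ r

for j, t in enumerate(range(k, T)):
    assert np.isclose(rss(t + 1), rss(t) + w[j] ** 2)  # RSS recursion
assert np.isclose(w @ w, rss(T))                       # sum w^2 = e'e
```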

8.15 The Harvey and Collier (1977) Misspecification $t$-test as a Variable Additions Test. This is based on Wu (1993).

a. The Chow $F$-test for $H_0$: $\gamma = 0$ in (8.44) is given by
$$F = \frac{(\mathrm{RRSS} - \mathrm{URSS})/1}{\mathrm{URSS}/(T-k-1)}, \qquad \mathrm{URSS} = y'\bar{P}_{[X,z]}y.$$
Use the fact that $z = C'\iota_{T-k}$, where $C$ is defined in (8.34) and $\iota_{T-k}$ is a vector of ones of dimension $(T-k)$. From (8.35) we know that $CX = 0$, $CC' = I_{T-k}$ and $C'C = \bar{P}_X$. Hence, $z'X = \iota_{T-k}'CX = 0$, so that $P_{[X,z]} = P_X + P_z$. Therefore, $\mathrm{URSS} = y'\bar{P}_{[X,z]}y = y'y - y'P_Xy - y'P_zy$ and $\mathrm{RRSS} = y'\bar{P}_Xy = y'y - y'P_Xy$, and the $F$-statistic reduces to
$$F = \frac{y'P_zy}{y'\left(\bar{P}_X - P_z\right)y/(T-k-1)},$$
which is distributed as $F(1, T-k-1)$ under the null hypothesis $H_0$: $\gamma = 0$.

b. The numerator of the $F$-statistic is

$$y'P_zy = y'z(z'z)^{-1}z'y = y'C'\iota_{T-k}\left(\iota_{T-k}'CC'\iota_{T-k}\right)^{-1}\iota_{T-k}'Cy.$$
But $CC' = I_{T-k}$ and $\iota_{T-k}'CC'\iota_{T-k} = \iota_{T-k}'\iota_{T-k} = (T-k)$. Therefore, $y'P_zy = y'C'\iota_{T-k}\iota_{T-k}'Cy/(T-k)$. The recursive residuals are constructed as $w = Cy$; see below (8.34). Hence,
$$y'P_zy = \left(\sum_{t=k+1}^{T} w_t\right)^2\Big/(T-k) = (T-k)\bar{w}^2,$$
where $\bar{w} = \sum_{t=k+1}^{T} w_t/(T-k)$. Using $\bar{P}_X = C'C$, we can write
$$y'\bar{P}_Xy = y'C'Cy = w'w = \sum_{t=k+1}^{T} w_t^2.$$
Hence, the denominator of the $F$-statistic is
$$y'\left(\bar{P}_X - P_z\right)y/(T-k-1) = \left[\sum_{t=k+1}^{T} w_t^2 - (T-k)\bar{w}^2\right]\Big/(T-k-1) = s_w^2,$$
where $s_w^2 = \sum_{t=k+1}^{T} (w_t - \bar{w})^2/(T-k-1)$ was given below (8.43). Therefore,
$$F = \frac{(T-k)\bar{w}^2}{s_w^2},$$
which is the square of the Harvey and Collier (1977) $t$-statistic given in (8.43).
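The equivalence $F = t^2$ can be demonstrated numerically. A Python sketch on arbitrary simulated data (not part of the original solution): it builds the $C$ matrix of (8.34) row by row, forms $z = C'\iota_{T-k}$, and compares the Chow $F$-statistic with the squared Harvey-Collier $t$-statistic.

```python
import numpy as np

rng = np.random.default_rng(6)
T, k = 30, 2
X = np.column_stack([np.ones(T), rng.normal(size=T)])
y = X @ np.array([0.5, 1.0]) + rng.normal(size=T)

# Build the C matrix of (8.34): row for t has [-x_t'(X'X)^{-1}X', 1]/sqrt(f)
C = np.zeros((T - k, T))
for j, t in enumerate(range(k, T)):
    Xt = X[:t]
    A_inv = np.linalg.inv(Xt.T @ Xt)
    f = 1 + X[t] @ A_inv @ X[t]
    C[j, :t] = -X[t] @ A_inv @ Xt.T / np.sqrt(f)
    C[j, t] = 1 / np.sqrt(f)

w = C @ y                                        # recursive residuals
wbar = w.mean()
s_w2 = np.sum((w - wbar) ** 2) / (T - k - 1)
t_hc = np.sqrt(T - k) * wbar / np.sqrt(s_w2)     # Harvey-Collier t

# Chow F-test for gamma = 0 in y = X beta + z gamma + u, with z = C' iota
z = C.T @ np.ones(T - k)
def rss(M):
    b = np.linalg.lstsq(M, y, rcond=None)[0]
    r = y - M @ b
    return r @ r
Z = np.column_stack([X, z])
F = (rss(X) - rss(Z)) / (rss(Z) / (T - k - 1))

assert np.allclose(C @ X, 0, atol=1e-10)         # CX = 0
assert np.allclose(C @ C.T, np.eye(T - k))       # CC' = I_{T-k}
assert np.isclose(F, t_hc ** 2)                  # F = t^2
```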

8.16 The Gasoline data model given in Chap. 10.

a. Using EViews and the data for Austria, one can generate the regression of Y on X1, X2 and X3 and the corresponding recursive residuals.

LS // Dependent Variable is Y
Sample: 1960 1978
Included observations: 19

Variable    Coefficient   Std. Error   t-Statistic    Prob.
C              3.726605     0.373018      9.990422   0.0000
X1             0.760721     0.211471      3.597289   0.0026
X2            -0.793199     0.150086     -5.284956   0.0001
X3            -0.519871     0.113130     -4.595336   0.0004

The recursive residuals are:

 0.001047   -0.010798    0.042785    0.035901    0.024071
-0.009944   -0.007583    0.018521    0.006881    0.003415
-0.092971    0.013540   -0.069668    0.000706   -0.070614

b. EViews also gives as an option a plot of the recursive residuals plus or minus twice their standard error. Next, the CUSUM plot is given along with 5% upper and lower lines.

Test Equation:

LS // Dependent Variable is Y
Sample: 1960 1977
Included observations: 18

Variable    Coefficient   Std. Error   t-Statistic    Prob.
C              4.000172     0.369022     10.83991    0.0000
X1             0.803752     0.194999      4.121830   0.0010
X2            -0.738174     0.140340     -5.259896   0.0001
X3            -0.522216     0.103666     -5.037483   0.0002

R-squared            0.732757    Mean dependent var      4.063917
Adjusted R-squared   0.675490    S.D. dependent var      0.063042
S.E. of regression   0.035913    Akaike info criterion  -6.460203
Sum squared resid    0.018056    Schwarz criterion      -6.262343
Log likelihood       36.60094    F-statistic            12.79557
Durbin-Watson stat   2.028488    Prob(F-statistic)       0.000268

8.17 The Differencing Test in a Regression with Equicorrelated Disturbances. This is based on Baltagi (1990).

a. For the equicorrelated case, $\Omega$ can be written as $\Omega = \sigma^2(1-\rho)\left[E_T + \theta\bar{J}_T\right]$, where $E_T = I_T - \bar{J}_T$, $\bar{J}_T = J_T/T$ and $\theta = [1 + (T-1)\rho]/(1-\rho)$. Hence,
$$\Omega^{-1} = \left[E_T + (1/\theta)\bar{J}_T\right]/\sigma^2(1-\rho), \qquad \iota_T'\Omega^{-1} = \iota_T'/\theta\sigma^2(1-\rho),$$
and $L = E_T/\sigma^2(1-\rho)$. Hence, $\tilde{\beta} = (X'E_TX)^{-1}X'E_TY$, which is the OLS estimator of $\beta$, since $E_T$ is a matrix that transforms each variable into deviations from its mean. That GLS is equivalent to OLS in the equicorrelated case is a standard result in the literature.

Also, $D\Omega = \sigma^2(1-\rho)D$ since $D\iota_T = 0$, and $D\Omega D' = \sigma^2(1-\rho)DD'$. Therefore, $M = P_{D'}/\sigma^2(1-\rho)$, where $P_{D'} = D'(DD')^{-1}D$. In order to show that $M = L$, it remains to show that $P_{D'} = E_T$, or equivalently, from the definition of $E_T$, that $P_{D'} + \bar{J}_T = I_T$. Note that both $P_{D'}$ and $\bar{J}_T$ are symmetric idempotent matrices which are orthogonal to each other ($D\bar{J}_T = 0$ since $D\iota_T = 0$). Using Graybill's (1961) Theorem 1.69, these two properties of $P_{D'}$ and $\bar{J}_T$ imply a third: their sum is idempotent, with rank equal to the sum of the ranks. But the rank of $P_{D'}$ is $(T-1)$ and the rank of $\bar{J}_T$ is 1, hence their sum is idempotent of rank $T$, which can only be $I_T$. This proves the Maeshiro and Wichers (1989) result, i.e., $\tilde{\beta} = \hat{\beta}$, which happens to be the OLS estimator from (1) in this particular case.

b. The Plosser, Schwert and White differencing test is based on the difference between the OLS estimator from (1) and the OLS estimator from (2). But OLS on (1) is equivalent to GLS on (1). Also, part (a) proved that GLS on (1) is in fact GLS from (2). Hence, the differencing test can be based upon the difference between the OLS and GLS estimators from the differenced equation. An alternative solution is given by Koning (1992).
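The key projection identity $P_{D'} = E_T$ can be checked numerically. A Python sketch (small arbitrary $T$, not from the text):

```python
import numpy as np

# For the (T-1) x T first-differencing matrix D (so D iota = 0),
# the projection D'(DD')^{-1}D equals E_T = I_T - J_T/T, which is why
# GLS under equicorrelation coincides with OLS on deviations from means.
T = 7
D = np.eye(T - 1, T, k=1) - np.eye(T - 1, T)    # row t: y_{t+1} - y_t
P_D = D.T @ np.linalg.inv(D @ D.T) @ D
E_T = np.eye(T) - np.ones((T, T)) / T

assert np.allclose(D @ np.ones(T), 0)           # D iota = 0
assert np.allclose(P_D, E_T)                    # P_{D'} = E_T
```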

8.18 a. Stata performs Ramsey’s RESET test by issuing the command estat ovtest after running the regression as follows:

. reg lwage wks south smsa ms exp exp2 occ ind union fem blk ed

(We suppress the regression output since it is the same as Table 4.1 in the text and solution 4.13).

. estat ovtest

Ramsey RESET test using powers of the fitted values of lwage
    Ho: model has no omitted variables
    F(3, 579) = 0.79
    Prob > F = 0.5006

This does not reject the null of no regression misspecification.
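The mechanics behind `estat ovtest` are simple to reproduce. A Python sketch on simulated data (not the wage data used in the text): it augments the regression with powers 2 through 4 of the fitted values, as Stata's default RESET does, and computes the joint $F$-statistic.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

def rss(M):
    b = np.linalg.lstsq(M, y, rcond=None)[0]
    r = y - M @ b
    return r @ r

yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
Z = np.column_stack([X, yhat**2, yhat**3, yhat**4])   # RESET augmentation
q = 3                                                 # restrictions tested
F = ((rss(X) - rss(Z)) / q) / (rss(Z) / (n - k - q))
print("RESET F(%d, %d) = %.3f" % (q, n - k - q, F))
```

Since the simulated model is correctly specified, the statistic should usually be insignificant; with misspecified functional form it tends to be large.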

b. Stata also performs White’s (1982) Information matrix test by issuing the command estat imtest after running the regression as follows:

. estat imtest

Cameron & Trivedi’s decomposition of IM-test

                    Source |      chi2     df         p
  -------------------------+----------------------------
        Heteroskedasticity |    103.28     81    0.0482
                  Skewness |      5.37     12    0.9444
                  Kurtosis |      1.76      1    0.1844
  -------------------------+----------------------------
                     Total |    110.41     94    0.1187

This does not reject the null, even though heteroskedasticity seems to be a problem.

8.20 a. Stata performs Ramsey’s RESET test by issuing the command estat ovtest after running the regression as follows:

. reg lnc lnrp lnrdi

(We suppress the regression output since it is the same as that in solu­tion 5.13).

. estat ovtest

Ramsey RESET test using powers of the fitted values of lnc
    Ho: model has no omitted variables
    F(3, 40) = 3.11
    Prob > F = 0.0369

This rejects the null of no regression misspecification at the 5% level.

b. Stata also performs White's (1982) Information matrix test by issuing the command estat imtest after running the regression as follows:

. estat imtest

Cameron & Trivedi’s decomposition of IM-test

                    Source |      chi2     df         p
  -------------------------+----------------------------
        Heteroskedasticity |     15.66      5    0.0079
                  Skewness |      3.59      2    0.1664
                  Kurtosis |      0.06      1    0.8028
  -------------------------+----------------------------
                     Total |     19.31      8    0.0133

This rejects the null and indicates that heteroskedasticity is the problem. This was confirmed using this data set in problem 5.13.

References

Graybill, F. A. (1961), An Introduction to Linear Statistical Models, Vol. 1 (McGraw-Hill: New York).

Koning, R. H. (1992), “The Differencing Test in a Regression with Equicorrelated Disturbances,” Econometric Theory, Solution 90.4.5, 8: 155-156.

Maeshiro, A. and R. Wichers (1989), "On the Relationship Between the Estimates of Level Models and Difference Models," American Journal of Agricultural Economics, 71: 432-434.

White, H. (1982), “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, 50: 1-25.

Wu, P. (1993), “Variable Addition Test,” Econometric Theory, Problem 93.1.2, 9: 145-146.
