A Lagged Dependent Variable Model with AR(1) Disturbances
A model with a lagged dependent variable and an autoregressive error term is estimated using instrumental variables (IV). This method will be studied extensively in Chapter 11. In short, the IV method corrects for the correlation between $Y_{t-1}$ and the error term by replacing $Y_{t-1}$ with its predicted value $\hat{Y}_{t-1}$. The latter is obtained by regressing $Y_{t-1}$ on some exogenous variables, say a set of $Z$'s, which are called a set of instruments for $Y_{t-1}$. Since these variables are exogenous and uncorrelated with $u_t$, $\hat{Y}_{t-1}$ will not be correlated with $u_t$. Suppose the regression equation is
$$Y_t = \alpha + \beta Y_{t-1} + \gamma X_t + u_t, \qquad t = 2, \ldots, T \tag{6.20}$$
and that at least one exogenous variable $Z_t$ exists which will be our instrument for $Y_{t-1}$. Regressing $Y_{t-1}$ on $X_t$, $Z_t$ and a constant, we get
$$Y_{t-1} = \hat{Y}_{t-1} + \hat{\nu}_t = \hat{a}_1 + \hat{a}_2 Z_t + \hat{a}_3 X_t + \hat{\nu}_t \tag{6.21}$$
Then $\hat{Y}_{t-1} = \hat{a}_1 + \hat{a}_2 Z_t + \hat{a}_3 X_t$ and is independent of $u_t$, because it is a linear combination of exogenous variables. But $Y_{t-1}$ is correlated with $u_t$. This means that $\hat{\nu}_t$ is the part of $Y_{t-1}$ that is correlated with $u_t$. Substituting $Y_{t-1} = \hat{Y}_{t-1} + \hat{\nu}_t$ in (6.20) we get
$$Y_t = \alpha + \beta \hat{Y}_{t-1} + \gamma X_t + (u_t + \beta \hat{\nu}_t) \tag{6.22}$$
$\hat{Y}_{t-1}$ is uncorrelated with the new error term $(u_t + \beta \hat{\nu}_t)$ because $\sum \hat{Y}_{t-1} \hat{\nu}_t = 0$ from (6.21). Also, $X_t$ is uncorrelated with $u_t$ by assumption. But, from (6.21), $X_t$ also satisfies $\sum X_t \hat{\nu}_t = 0$. Hence, $X_t$ is uncorrelated with the new error term $(u_t + \beta \hat{\nu}_t)$. This means that OLS applied to (6.22) will lead to consistent estimates of $\alpha$, $\beta$ and $\gamma$. The only remaining question is where to find instruments like $Z_t$. This $Z_t$ should be (i) uncorrelated with $u_t$; (ii) a fairly good, but not perfect, predictor of $Y_{t-1}$, since a perfect fit gives $\hat{Y}_{t-1} = Y_{t-1}$ and takes us back to OLS, which we know is inconsistent; and (iii) such that $\sum z_t^2 / T$ is finite and different from zero, where $z_t = Z_t - \bar{Z}$. In this case, $X_{t-1}$ seems like a natural instrumental variable candidate. It is an exogenous variable which would very likely predict $Y_{t-1}$ fairly well, and satisfies $\sum x_{t-1}^2 / T$ being finite and different from zero. In other words, (6.21) regresses $Y_{t-1}$ on a constant, $X_{t-1}$ and $X_t$, and gets $\hat{Y}_{t-1}$. Additional lags on $X_t$ can be used as instruments to improve the small sample properties of this estimator. Substituting $\hat{Y}_{t-1}$ in equation (6.22) results in consistent estimates of the regression parameters. Wallis (1967) substituted these consistent estimates in the original equation (6.20) and obtained the residuals $\tilde{u}_t$. Then he computed
$$\hat{\rho} = \frac{\sum_{t=2}^{T} \tilde{u}_t \tilde{u}_{t-1} / (T-1)}{\sum_{t=1}^{T} \tilde{u}_t^2 / T} + \frac{3}{T}$$
where the last term corrects for the bias in $\hat{\rho}$. At this stage, one can perform a Prais-Winsten procedure on (6.20) using $\hat{\rho}$ instead of $\rho$, see Fomby and Guilkey (1983).
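The steps described so far (first-stage regression, second-stage regression, Wallis's bias-corrected estimate of $\rho$, and a Prais-Winsten regression) can be sketched in Python with NumPy. This is a minimal illustration on simulated data, not the book's own code: the function names, the simulated design, and the choice of $X_{t-1}$ as the only instrument are assumptions made for the example, and the number of available residuals is used in place of $T$ in the correction term.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(T, alpha, beta, gamma, rho, seed=0):
    """Simulate (6.20) with AR(1) disturbances (illustrative design only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)      # exogenous regressor X_t
    eps = rng.normal(size=T)    # white-noise innovations of u_t
    u = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t]
        y[t] = alpha + beta * y[t - 1] + gamma * x[t] + u[t]
    return y, x


def iv_estimates(y_t, y_lag, x_t, x_lag):
    """Two-stage estimates of (alpha, beta, gamma) with X_{t-1} as instrument."""
    ones = np.ones_like(y_t)
    Z = np.column_stack([ones, x_lag, x_t])       # first stage, as in (6.21)
    y_lag_hat = Z @ ols(y_lag, Z)
    return ols(y_t, np.column_stack([ones, y_lag_hat, x_t]))  # second stage, (6.22)


def wallis_rho(u):
    """First-order autocorrelation of residuals plus the 3/T bias correction."""
    n = u.size
    return (u[1:] @ u[:-1] / (n - 1)) / (u @ u / n) + 3.0 / n


def prais_winsten(v, rho):
    """Prais-Winsten transform of one data column (first observation kept)."""
    out = np.empty_like(v)
    out[0] = np.sqrt(1.0 - rho ** 2) * v[0]
    out[1:] = v[1:] - rho * v[:-1]
    return out


def wallis_two_stage(y, x):
    y_t, y_lag, x_t, x_lag = y[1:], y[:-1], x[1:], x[:-1]
    a, b, g = iv_estimates(y_t, y_lag, x_t, x_lag)
    # Residuals come from the ORIGINAL equation (6.20), using actual Y_{t-1}.
    u = y_t - a - b * y_lag - g * x_t
    rho = wallis_rho(u)
    cols = [np.ones_like(y_t), y_lag, x_t]
    X_star = np.column_stack([prais_winsten(c, rho) for c in cols])
    fgls = ols(prais_winsten(y_t, rho), X_star)
    return (a, b, g), rho, fgls


y, x = simulate(T=20000, alpha=1.0, beta=0.5, gamma=1.0, rho=0.6)
iv, rho_hat, fgls = wallis_two_stage(y, x)
print("IV estimates:", iv)
print("Wallis rho  :", rho_hat)
print("Prais-Winsten estimates:", fgls)
```

Because the constant column is transformed along with the other regressors, the Prais-Winsten coefficients are on the scale of the original $\alpha$, $\beta$ and $\gamma$. In small samples the $3/T$ term can push the estimate of $\rho$ above one in absolute value, in which case the square root in `prais_winsten` is not defined; the sketch does not guard against that.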
An alternative two-step procedure has been proposed by Hatanaka (1974). After estimating (6.22) and obtaining the residuals $\tilde{u}_t$ from (6.20), Hatanaka (1974) suggests running $Y_t^* = Y_t - \tilde{\rho} Y_{t-1}$ on $Y_{t-1}^* = Y_{t-1} - \tilde{\rho} Y_{t-2}$, $X_t^* = X_t - \tilde{\rho} X_{t-1}$ and $\tilde{u}_{t-1}$. Note that this is the Cochrane-Orcutt transformation which ignores the first observation. Also, $\tilde{\rho} = \sum_{t=3}^{T} \tilde{u}_t \tilde{u}_{t-1} / \sum_{t=3}^{T} \tilde{u}_t^2$ ignores the small sample bias correction factor suggested by Wallis (1967). Let $\tilde{\delta}$ be the coefficient of $\tilde{u}_{t-1}$, then the efficient estimator of $\rho$ is given by $\tilde{\tilde{\rho}} = \tilde{\rho} + \tilde{\delta}$. Hatanaka shows that the resulting estimators are asymptotically equivalent to the MLE in the presence of Normality.
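Hatanaka's two-step procedure can be sketched along the same lines. The code below is a hedged illustration on simulated data: the function names and the simulated design are assumptions, $X_{t-1}$ is again used as the single instrument, and the returned constant is the coefficient of the transformed regression, which under the Cochrane-Orcutt transformation estimates $\alpha(1-\rho)$ rather than $\alpha$.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def simulate(T, alpha, beta, gamma, rho, seed=0):
    """Simulate (6.20) with AR(1) disturbances (illustrative design only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)
    eps = rng.normal(size=T)
    u = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t]
        y[t] = alpha + beta * y[t - 1] + gamma * x[t] + u[t]
    return y, x


def hatanaka(y, x):
    y_t, y_lag, x_t, x_lag = y[1:], y[:-1], x[1:], x[:-1]
    ones = np.ones_like(y_t)

    # Step 1: IV estimates via (6.21)-(6.22), then residuals from (6.20).
    Z = np.column_stack([ones, x_lag, x_t])
    y_lag_hat = Z @ ols(y_lag, Z)
    a, b, g = ols(y_t, np.column_stack([ones, y_lag_hat, x_t]))
    u = y_t - a - b * y_lag - g * x_t

    # Initial rho without the Wallis small-sample correction.
    rho0 = (u[1:] @ u[:-1]) / (u[1:] @ u[1:])

    # Step 2: Cochrane-Orcutt transforms (first observation dropped),
    # with the lagged residual added as an extra regressor.
    y_star = y_t[1:] - rho0 * y_t[:-1]
    ylag_star = y_lag[1:] - rho0 * y_lag[:-1]
    x_star = x_t[1:] - rho0 * x_t[:-1]
    W = np.column_stack([np.ones_like(y_star), ylag_star, x_star, u[:-1]])
    const, beta, gamma, delta = ols(y_star, W)

    return {"const": const, "beta": beta, "gamma": gamma,
            "rho_initial": rho0, "rho_efficient": rho0 + delta}


y, x = simulate(T=20000, alpha=1.0, beta=0.5, gamma=1.0, rho=0.6)
print(hatanaka(y, x))
```

The efficient estimate of $\rho$ is the initial estimate plus the coefficient on the lagged residual, so a coefficient close to zero indicates that the initial estimate was already close.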
Empirical Example: Consider the Consumption-Income data from the Economic Report of the President over the period 1959-2007 given in Table 5.3. Problem 5 asks the reader to verify that Durbin's h obtained from the lagged dependent variable model described in (6.20) yields a value of 3.50. This is asymptotically distributed as $N(0,1)$ under the null hypothesis of no serial correlation of the disturbances. This null is soundly rejected. The Breusch and Godfrey test runs the regression of OLS residuals on their lagged values and the regressors in the model. This yields $TR^2 = 13.37$, which is distributed as $\chi^2_1$ under the null and has a p-value of 0.0003. Therefore, we reject the hypothesis of no first-order serial correlation. Next, we estimate (6.20) using current and lagged values of income ($Y_t$, $Y_{t-1}$ and $Y_{t-2}$) as a set of instruments for lagged consumption ($C_{t-1}$). The regression given by (6.22) yields:
$$C_t = \underset{(0.280)}{-0.831} + \underset{(0.303)}{0.104}\,\hat{C}_{t-1} + \underset{(0.326)}{1.177}\,Y_t + \text{residuals}$$
Substituting these estimates in (6.20), one gets the residuals $\tilde{u}_t$. Based on these $\tilde{u}_t$'s, the Wallis (1967) estimate of $\rho$ yields $\hat{\rho} = 0.907$ and Hatanaka's (1974) estimate of $\rho$ yields $\tilde{\rho} = 0.828$. Running the Hatanaka regression gives
$$C_t^* = \underset{(0.053)}{-0.142} + \underset{(0.083)}{0.233}\,C_{t-1}^* + \underset{(0.095)}{0.843}\,Y_t^* + \underset{(0.058)}{0.017}\,\tilde{u}_{t-1} + \text{residuals}$$
where $C_t^* = C_t - \tilde{\rho} C_{t-1}$. The efficient estimate of $\rho$ is given by $\tilde{\tilde{\rho}} = \tilde{\rho} + 0.017 = 0.846$.
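The tail probabilities behind the two test statistics quoted at the start of this example can be checked with the Python standard library alone. The sketch below uses only the reported values 13.37 and 3.50; the helper names are made up for illustration, and the upper-tail (one-sided) reading of Durbin's h is an assumption of the example.

```python
import math


def chi2_1_upper_tail(stat):
    """P(chi-square with 1 d.f. > stat), using chi2_1 = Z^2 for standard normal Z."""
    return math.erfc(math.sqrt(stat / 2.0))


def normal_upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


print("Breusch-Godfrey p-value:", round(chi2_1_upper_tail(13.37), 4))
print("Durbin's h upper tail  :", normal_upper_tail(3.50))
```

Both probabilities are far below conventional significance levels, which is what the rejection of the null of no first-order serial correlation rests on.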
6.3.1 A Lagged Dependent Variable Model with MA(1) Disturbances
Zellner and Geisel (1970) estimated the Koyck autoregressive representation of the infinite distributed lag, given in (6.10). In fact, we saw that this could also arise from the AEM, see (6.14). In particular, it is a regression with a lagged dependent variable and an MA(1) error term with the added restriction that the coefficient of $Y_{t-1}$ is the same as the MA(1) parameter. For simplicity, we write
$$Y_t = \alpha + \lambda Y_{t-1} + \beta X_t + (u_t - \lambda u_{t-1}) \tag{6.23}$$
Let $w_t = Y_t - u_t$, then (6.23) becomes
$$w_t = \alpha + \lambda w_{t-1} + \beta X_t \tag{6.24}$$
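The step from (6.23) to (6.24) can be written out explicitly. Moving $u_t$ to the left-hand side and grouping the two lagged terms gives

$$\begin{aligned}
Y_t - u_t &= \alpha + \lambda Y_{t-1} - \lambda u_{t-1} + \beta X_t \\
          &= \alpha + \lambda (Y_{t-1} - u_{t-1}) + \beta X_t \\
w_t       &= \alpha + \lambda w_{t-1} + \beta X_t
\end{aligned}$$

The grouping works only because the coefficient of $Y_{t-1}$ and the MA(1) parameter are the same $\lambda$, which is the restriction noted above.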
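By continuous substitution of lagged values of $w_t$ in (6.24) we get
$$w_t = \alpha (1 + \lambda + \lambda^2 + \ldots + \lambda^{t-1}) + \lambda^t w_0 + \beta (X_t + \lambda X_{t-1} + \ldots + \lambda^{t-1} X_1)$$
and replacing $w_t$ by $(Y_t - u_t)$, we get
$$Y_t = \alpha (1 + \lambda + \lambda^2 + \ldots + \lambda^{t-1}) + \lambda^t w_0 + \beta (X_t + \lambda X_{t-1} + \ldots + \lambda^{t-1} X_1) + u_t \tag{6.25}$$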
Knowing $\lambda$, this equation can be estimated via OLS assuming that the disturbances $u_t$ are not serially correlated. Since $\lambda$ is not known, Zellner and Geisel (1970) suggest a search procedure over $\lambda$, where $0 < \lambda < 1$. The regression with the minimum residual sum of squares gives the optimal $\lambda$, and the corresponding regression gives the estimates of $\alpha$, $\beta$ and $w_0$. The last coefficient $w_0 = Y_0 - u_0 = E(Y_0)$ can be interpreted as the expected value of the initial observation on the dependent variable. Klein (1958) considered the direct estimation of the infinite Koyck lag, given in (6.8), and arrived at (6.25). The search over $\lambda$ results in MLEs of the coefficients. Note, however, that the estimate of $w_0$ is not consistent. Intuitively, as $t$ tends to infinity, $\lambda^t$ tends to zero, implying no new information to estimate $w_0$. In fact, some applied researchers ignore the variable $\lambda^t$ in the regression given in (6.25). This practice, known as truncating the remainder, is not recommended since the Monte Carlo experiments of Maddala and Rao (1971) and Schmidt (1975) have shown that even for $T = 60$ or 100, it is not desirable to omit $\lambda^t$ from (6.25).
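The search over $\lambda$ can be sketched as a grid search in Python with NumPy. This is an illustrative sketch on simulated data, not a reference implementation: the function names, the grid, and the simulated design are assumptions. For each trial $\lambda$ the three regressors of (6.25) are built (the geometric sum multiplying $\alpha$, the variable $\lambda^t$ multiplying $w_0$, and the weighted sum of current and past $X$'s multiplying $\beta$), OLS is run without a separate intercept, and the $\lambda$ with the smallest residual sum of squares is kept.

```python
import numpy as np


def zg_regressors(x, lam):
    """Regressors of (6.25) for a given lambda; x holds X_1, ..., X_T."""
    T = x.size
    t = np.arange(1, T + 1)
    lam_t = lam ** t                      # lambda^t, multiplies w_0
    geo = (1.0 - lam_t) / (1.0 - lam)     # 1 + lambda + ... + lambda^(t-1)
    z = np.empty(T)
    acc = 0.0
    for i in range(T):
        acc = x[i] + lam * acc            # X_t + lambda X_{t-1} + ... + lambda^(t-1) X_1
        z[i] = acc
    return np.column_stack([geo, lam_t, z])


def zellner_geisel(y, x, grid):
    """Grid search over lambda, keeping the minimum-RSS regression."""
    best = None
    for lam in grid:
        R = zg_regressors(x, lam)
        coef, *_ = np.linalg.lstsq(R, y, rcond=None)
        resid = y - R @ coef
        rss = float(resid @ resid)
        if best is None or rss < best["rss"]:
            best = {"lam": float(lam), "alpha": coef[0], "w0": coef[1],
                    "beta": coef[2], "rss": rss}
    return best


def simulate_koyck(T, alpha, lam, beta, w0, seed=0):
    """Simulate Y_t = w_t + u_t with w_t following (6.24) (illustrative design)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=T)
    w = np.empty(T)
    prev = w0
    for i in range(T):
        prev = alpha + lam * prev + beta * x[i]
        w[i] = prev
    return w + rng.normal(size=T), x


y, x = simulate_koyck(T=2000, alpha=1.0, lam=0.6, beta=1.0, w0=5.0)
grid = np.linspace(0.01, 0.99, 99)        # open interval 0 < lambda < 1
print(zellner_geisel(y, x, grid))
```

The grid stops short of 0 and 1 because the geometric sum is computed in closed form, which is undefined at $\lambda = 1$. The estimate of `w0` in the output should be read with the caveat from the text: it is not consistent, since $\lambda^t$ dies out quickly.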
In summary, we have learned how to estimate a dynamic model with a lagged dependent variable and serially correlated errors. In case the error is autoregressive of order one, we have outlined the steps to implement the Wallis Two-Stage estimator and Hatanaka’s two-step procedure. In case the error is Moving Average of order one, we have outlined the steps to implement the Zellner-Geisel procedure.