Estimation and Testing of Dynamic Models with Serial Correlation
Both the AEM and the PAM give equations resembling the autoregressive form of the infinite distributed lag. In all cases, we end up with a lagged dependent variable and an error term that is either a Moving Average of order one, as in (6.10), or just classical or autoregressive, as in (6.17). In this section we study the testing and estimation of such autoregressive or dynamic models.
If there is a $Y_{t-1}$ in the regression equation and the $u_t$'s are classical disturbances, as may be the case in equation (6.17), then $Y_{t-1}$ is said to be contemporaneously uncorrelated with the disturbance term $u_t$. In fact, the disturbances satisfy assumptions 1-4 of Chapter 3 and $E(Y_{t-1}u_t) = 0$ even though $E(Y_{t-1}u_{t-1}) \neq 0$. In other words, $Y_{t-1}$ is not correlated with the current disturbance $u_t$ but it is correlated with the lagged disturbance $u_{t-1}$. In this case, as long as the disturbances are not serially correlated, OLS will be biased, but it remains consistent and asymptotically efficient. This case is unlikely with economic data given that most macro time-series variables are highly trended. More likely, the $u_t$'s are serially correlated. In this case, OLS is biased and inconsistent. Intuitively, $Y_t$ is related to $u_t$, so $Y_{t-1}$ is related to $u_{t-1}$. If $u_t$ and $u_{t-1}$ are correlated, then $Y_{t-1}$ and $u_t$ are correlated, so one of the regressors, lagged $Y$, is correlated with $u_t$ and we have the problem of endogeneity. Let us demonstrate what happens to OLS for the simple autoregressive model with no constant
$$Y_t = \beta Y_{t-1} + v_t \qquad |\beta| < 1, \quad t = 1, 2, \ldots, T \tag{6.18}$$
with $v_t = \rho v_{t-1} + \epsilon_t$, $|\rho| < 1$, and $\epsilon_t \sim \text{IIN}(0, \sigma_\epsilon^2)$. One can show, see problem 3, that
$$\hat{\beta}_{OLS} = \sum_{t=2}^{T} Y_t Y_{t-1} \Big/ \sum_{t=2}^{T} Y_{t-1}^2 = \beta + \sum_{t=2}^{T} Y_{t-1} v_t \Big/ \sum_{t=2}^{T} Y_{t-1}^2$$
with $\text{plim}(\hat{\beta}_{OLS} - \beta) = \text{asymp. bias}(\hat{\beta}_{OLS}) = \rho(1 - \beta^2)/(1 + \rho\beta)$. This asymptotic bias is positive if $\rho > 0$ and negative if $\rho < 0$. Also, this asymptotic bias can be large for small values of $\beta$ and large values of $\rho$. For example, if $\rho = 0.9$ and $\beta = 0.2$, the asymptotic bias for $\hat{\beta}_{OLS}$ is 0.73. This is more than 3 times the value of $\beta$.
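The size of this bias is easy to check by simulation. The following is a minimal Monte Carlo sketch in Python with numpy; the data generating process follows (6.18) with $\rho = 0.9$ and $\beta = 0.2$, but the variable names, sample size, and number of replications are our own illustrative choices:

```python
import numpy as np

# Sketch: simulate Y_t = beta*Y_{t-1} + v_t with AR(1) errors
# v_t = rho*v_{t-1} + eps_t, then estimate beta by OLS without a constant.
# The asymptotic bias formula gives rho*(1 - beta^2)/(1 + rho*beta)
# = 0.9*0.96/1.18, approximately 0.73.
rng = np.random.default_rng(0)
beta, rho, T, reps = 0.2, 0.9, 2000, 200

biases = []
for _ in range(reps):
    eps = rng.standard_normal(T)
    v = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        v[t] = rho * v[t - 1] + eps[t]
        y[t] = beta * y[t - 1] + v[t]
    # OLS slope of Y_t on Y_{t-1}, no constant
    b_ols = (y[1:] @ y[:-1]) / (y[:-1] @ y[:-1])
    biases.append(b_ols - beta)

print(np.mean(biases))  # should be close to 0.73 for large T
```

Averaging across replications, the simulated bias should be close to the asymptotic value of 0.73, confirming that OLS overestimates $\beta$ badly in this configuration.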
Also, $\hat{\rho} = \sum_{t=2}^{T} \hat{v}_t \hat{v}_{t-1} \big/ \sum_{t=2}^{T} \hat{v}_{t-1}^2$, where $\hat{v}_t = Y_t - \hat{\beta}_{OLS} Y_{t-1}$, has
$$\text{plim}(\hat{\rho} - \rho) = -\rho(1 - \beta^2)/(1 + \rho\beta) = -\text{asymp. bias}(\hat{\beta}_{OLS})$$
This means that if $\rho > 0$, then $\hat{\rho}$ is negatively biased, while if $\rho < 0$, then $\hat{\rho}$ is positively biased. In both cases, $\hat{\rho}$ is biased towards zero. In fact, since the D.W. statistic is approximately $2(1 - \hat{\rho})$, its asymptotic bias is minus twice the asymptotic bias of $\hat{\rho}$, i.e., twice the asymptotic bias of $\hat{\beta}_{OLS}$, see problem 3. This means that the D.W. statistic is biased towards not rejecting the null hypothesis of zero serial correlation. Therefore, if the D.W. statistic rejects the null of $\rho = 0$, it is doing so when the odds are against it, thereby confirming our rejection of the null and the presence of serial correlation. If, on the other hand, it does not reject the null, then the D.W. statistic is uninformative and has to be replaced by another conclusive test for serial correlation. Such an alternative test in the presence of a lagged dependent variable has been developed by Durbin (1970), and the statistic computed is called Durbin's h. Using (6.10) or (6.17), one computes $\hat{\beta}_{OLS}$ ignoring its possible bias, and then $\hat{\rho}$ from the OLS residuals as shown above. Durbin's h is given by
$$h = \hat{\rho}\left[n \big/ \left(1 - n \cdot \widehat{\text{var}}(\text{coeff. of } Y_{t-1})\right)\right]^{1/2} \tag{6.19}$$
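As a concrete illustration, (6.19) can be computed as in the following Python sketch with numpy; the dynamic regression $Y_t = a + bY_{t-1} + u_t$, the simulated data, and all variable names are our own assumptions for the example:

```python
import numpy as np

# Sketch of Durbin's h for Y_t = a + b*Y_{t-1} + u_t (illustrative data).
# h = rho_hat * sqrt(n / (1 - n*var_b)), where var_b is the estimated
# variance of the coefficient on Y_{t-1}.
rng = np.random.default_rng(1)
T = 500
y = np.zeros(T)
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + rng.standard_normal()

# OLS of Y_t on a constant and Y_{t-1}; keep the residuals e_t
Y = y[1:]
X = np.column_stack([np.ones(T - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
e = Y - X @ coef

# rho_hat from the OLS residuals
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])

# estimated var of the coefficient on Y_{t-1}: s^2 * [(X'X)^{-1}]_{11}
n = len(Y)
s2 = (e @ e) / (n - X.shape[1])
var_b = s2 * np.linalg.inv(X.T @ X)[1, 1]

inside = n / (1 - n * var_b)   # h cannot be computed if this is negative
h = rho_hat * np.sqrt(inside) if inside > 0 else float("nan")
print(h)  # compare with N(0,1) critical values
```

Since the data here are generated with serially uncorrelated errors, $h$ should typically fall within the usual N(0,1) acceptance region.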
Durbin's h is asymptotically distributed N(0,1) under the null hypothesis of $\rho = 0$. If $n \cdot \widehat{\text{var}}(\text{coeff. of } Y_{t-1})$ is greater than one, then $h$ cannot be computed, and Durbin suggests running the OLS residuals $e_t$ on $e_{t-1}$ and the regressors in the model (including the lagged dependent variable), and testing whether the coefficient of $e_{t-1}$ in this regression is significant. In fact, this test can be generalized to higher-order autoregressive errors. Let $u_t$ follow an AR(p) process
$$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_p u_{t-p} + \epsilon_t$$
then this test involves running $e_t$ on $e_{t-1}, e_{t-2}, \ldots, e_{t-p}$ and the regressors in the model including $Y_{t-1}$. The test statistic for $H_0: \rho_1 = \rho_2 = \cdots = \rho_p = 0$ is $TR^2$, which is asymptotically distributed $\chi^2_p$. This is the Lagrange Multiplier test developed independently by Breusch (1978) and Godfrey (1978) and discussed in Chapter 5. In fact, this test has other useful properties. For example, the test is the same whether the null imposes an AR(p) model or an MA(p) model on the disturbances, see Chapter 14. Kiviet (1986) argues that even though these are large sample tests, the Breusch-Godfrey test is preferable to Durbin's h in small samples.
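The steps of the Breusch-Godfrey test can be sketched as follows in Python with numpy; the model, the choice of $p = 2$, and the simulated data are our own illustrative assumptions:

```python
import numpy as np

# Sketch of the Breusch-Godfrey LM test for AR(p) errors in a model with
# a lagged dependent variable (illustrative data, p = 2 chosen arbitrarily).
rng = np.random.default_rng(2)
T, p = 500, 2
y = np.zeros(T)
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + rng.standard_normal()

# Step 1: OLS of Y_t on a constant and Y_{t-1}; keep the residuals e_t.
Y = y[1:]
X = np.column_stack([np.ones(T - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
e = Y - X @ coef

# Step 2: regress e_t on the original regressors plus e_{t-1},...,e_{t-p}
# (the first p observations are lost to the lags).
n = len(e) - p
Z = np.column_stack([X[p:]] + [e[p - j:len(e) - j] for j in range(1, p + 1)])
g, *_ = np.linalg.lstsq(Z, e[p:], rcond=None)
aux_resid = e[p:] - Z @ g
R2 = 1 - (aux_resid @ aux_resid) / np.sum((e[p:] - e[p:].mean()) ** 2)

# Step 3: TR^2 is asymptotically chi-squared with p degrees of freedom.
stat = n * R2
print(stat)  # compare with the chi2(p) critical value, e.g. 5.99 at 5%
```

Because the errors here are generated without serial correlation, the statistic should typically fall below the $\chi^2_2$ critical value; in applied work a canned routine (e.g. one provided by an econometrics package) would normally replace this hand-rolled auxiliary regression.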