# Time Series and Linear Regression Models

## 3.3 Error autocorrelation vs. temporal dependence

The classic paper of Yule (1926) had a lasting effect on econometric modeling: his cautionary note that using time series data in the context of linear regression can often lead to spurious results reduced the number of empirical studies using such data. As mentioned above, the results of Cochrane and Orcutt (1949) and Durbin and Watson (1950, 1951) were interpreted in the econometric literature as a way to sidestep the spurious regression problem raised by Yule, and so to legitimize the use of time series data in the context of the linear regression model. The resulting modeling strategy proceeds in two stages.

Stage 1: estimate the linear regression model:

$$y_t = \beta^\top x_t + u_t, \quad u_t \sim \mathrm{NI}(0, \sigma^2), \quad t \in \mathbb{T}, \tag{28.33}$$

and test for error autocorrelation using the Durbin-Watson test based on:

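$$DW(y) = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2}, \quad \hat{u}_t = y_t - \hat{\beta}^\top x_t, \quad \hat{\beta} = (X^\top X)^{-1} X^\top y.$$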
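The statistic above can be computed directly from the OLS residuals. The following is a minimal sketch, assuming NumPy is available; the function name `durbin_watson` and the simulated data (an AR(1) error with coefficient 0.8) are illustrative assumptions, not part of the original text.

```python
import numpy as np

def durbin_watson(y, X):
    """Return (DW statistic, OLS residuals) for the regression of y on X."""
    # OLS coefficients beta_hat = (X'X)^{-1} X'y, obtained via least squares
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    # Numerator: squared first differences (t = 2..T); denominator: squares (t = 1..T)
    dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
    return dw, resid

# Hypothetical data: regression errors follow an AR(1) process with rho = 0.8
rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
eps = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.8 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u

dw, resid = durbin_watson(y, X)
print(f"DW = {dw:.3f}")
```

The statistic lies between 0 and 4; values well below 2 point toward positive first-order autocorrelation in the residuals, which is what this simulated design is built to produce.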

The alternative model that gives rise to this test is the modified model:

$$y_t = \beta^\top x_t + u_t, \quad u_t = \rho u_{t-1} + \varepsilon_t, \quad |\rho| < 1, \quad t \in \mathbb{T}, \tag{28.34}$$

and the test is based on the hypothesis $H_0: \rho = 0$ against $H_1: \rho \neq 0$.

Stage 2: if the null hypothesis is not rejected, the modeler assumes that the original model (28.33) shows no sign of error autocorrelation; otherwise the modeler can "cure" the problem by adopting the alternative model (28.34), which can be estimated using the Cochrane-Orcutt procedure.

In the context of the traditional Gauss-Markov specification in matrix notation:

$$y = X\beta + u, \quad y: T \times 1, \quad X: T \times k,$$

1g $E(u) = 0$, 2g-3g $E(uu^\top) = \sigma^2 I_T$, 4g $x_t$ is fixed, 5g $\mathrm{Rank}(X) = k$ $(T > k)$,

the presence of error autocorrelation takes the form 3g $E(uu^\top) = V_T \neq \sigma^2 I_T$. Under $E(uu^\top) = V_T$ (i.e. $E(u_t u_s) \neq 0$, $t \neq s$, $t, s = 1, 2, \ldots, T$), the OLS estimator $\hat{\beta} = (X^\top X)^{-1} X^\top y$ is no longer the best linear unbiased estimator (BLUE); it is said to retain its unbiasedness and consistency but to forfeit its relative efficiency, since:

$$\mathrm{cov}(\hat{\beta}) = (X^\top X)^{-1}(X^\top V_T X)(X^\top X)^{-1} > (X^\top V_T^{-1} X)^{-1} = \mathrm{cov}(\tilde{\beta}),$$

where $\tilde{\beta} = (X^\top V_T^{-1} X)^{-1} X^\top V_T^{-1} y$ is the generalized least squares (GLS) estimator. This traditional discussion has often encouraged applied econometricians to argue that using time series data in the context of the linear regression model with assumptions 1g-5g forfeits only the relative efficiency of the OLS estimators. As argued in Spanos (1986, chs 22-23), this suggestion is very misleading because the traditional textbook scenario is highly unlikely in empirical modeling. To see this, we need to consider the above discussion from the probabilistic reduction (PR) perspective. Viewing (28.33) from this perspective reveals that the implicit parameterization of $\theta := (\beta, \sigma^2)$ is:

$$\beta = \mathrm{cov}(X_t)^{-1}\mathrm{cov}(X_t, y_t), \quad \sigma^2 = \mathrm{var}(y_t) - \mathrm{cov}(y_t, X_t)\,\mathrm{cov}(X_t)^{-1}\mathrm{cov}(X_t, y_t).$$
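The parameterization above defines the regression parameters purely in terms of the first two moments of the joint distribution of $(y_t, X_t)$. As a rough illustration, the sketch below replaces the population moments with sample moments from simulated independent data and checks that the result coincides with the OLS slope on demeaned data. The data-generating values (coefficients 1.5 and -0.7, error standard deviation 0.5) are hypothetical choices made only for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
T, k = 2000, 2

# Hypothetical independent sample: correlated regressors, y_t linear in X_t plus noise
A = np.array([[1.0, 0.5], [0.0, 1.0]])
X = rng.normal(size=(T, k)) @ A
y = X @ np.array([1.5, -0.7]) + rng.normal(scale=0.5, size=T)

# Sample second moments of the joint vector (y_t, X_t)
S = np.cov(np.column_stack([y, X]), rowvar=False)
var_y = S[0, 0]
cov_Xy = S[1:, 0]
cov_X = S[1:, 1:]

# beta = cov(X_t)^{-1} cov(X_t, y_t)
beta = np.linalg.solve(cov_X, cov_Xy)
# sigma^2 = var(y_t) - cov(y_t, X_t) cov(X_t)^{-1} cov(X_t, y_t)
sigma2 = var_y - cov_Xy @ beta

# OLS on demeaned data yields the same slope coefficients
Xc, yc = X - X.mean(axis=0), y - y.mean()
beta_ols, *_ = np.linalg.lstsq(Xc, yc, rcond=None)

print("moment-based beta:", beta)
print("OLS slope:        ", beta_ols)
print("sigma^2:          ", sigma2)
```

The point of the sketch is that the moment ratio is a meaningful estimand only if the moments themselves are well defined and constant over $t$, which is where the independence and identical-distribution assumptions enter.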

In the context of the PR specification, assumption 3g corresponds to $(y_1, y_2, \ldots, y_T)$ being "temporally" independent (see Spanos, 1986, p. 373). In the PR approach, respecification amounts to replacing the original reduction assumptions on the vector process $\{Z_t, t \in \mathbb{T}\}$, where $Z_t := (y_t, X_t^\top)^\top$, i.e. (i) normality; (ii) (temporal) independence; and (iii) identically distributed, with assumptions such as (i)' normality; (ii)' Markov; and (iii)' stationarity. The PR approach replaces the original conditioning information set $D_t = \{X_t = x_t\}$ with $D_t^* = \{X_t = x_t, Z_{t-1}^0\}$, $Z_{t-1}^0 = \{Z_{t-1}, \ldots, Z_0\}$, which (under (i)'-(iii)') gives rise to the dynamic linear regression:

$$y_t = \beta_0^\top x_t + \alpha_1^\top Z_{t-1} + \varepsilon_t, \quad t \in \mathbb{T}, \tag{28.35}$$

where the implicit statistical parameterization is:

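$$\begin{aligned}
\beta_0 &= \mathrm{cov}(X_t \mid Z_{t-1})^{-1}\left[\mathrm{cov}(X_t, y_t) - \mathrm{cov}(X_t, Z_{t-1})[\mathrm{cov}(Z_{t-1})]^{-1}\mathrm{cov}(Z_{t-1}, y_t)\right],\\
\alpha_1 &= \mathrm{cov}(Z_{t-1} \mid X_t)^{-1}\left[\mathrm{cov}(Z_{t-1}, y_t) - \mathrm{cov}(Z_{t-1}, X_t)[\mathrm{cov}(X_t)]^{-1}\mathrm{cov}(X_t, y_t)\right],\\
\sigma^2 &= \mathrm{var}(y_t) - \mathrm{cov}(y_t, X_t^*)\,[\mathrm{cov}(X_t^*)]^{-1}\mathrm{cov}(X_t^*, y_t), \quad X_t^* = (X_t^\top, Z_{t-1}^\top)^\top.
\end{aligned}$$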
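The partitioned formulas above can be checked numerically. The sketch below is only an illustration under stated assumptions: it simulates a hypothetical stationary VAR(1) process for $Z_t = (y_t, x_t)^\top$ with scalar $x_t$ (all coefficient values are made up for the example), treats the conditional covariances as Schur complements of the sample covariance matrix, and compares the result with the coefficients from regressing $y_t$ jointly on $(X_t, Z_{t-1})$. The static moment ratio $\mathrm{cov}(X_t)^{-1}\mathrm{cov}(X_t, y_t)$ is computed alongside for comparison. The helper name `xcov` is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000

# Hypothetical stationary VAR(1) for Z_t = (y_t, x_t)'; values are illustrative only
A = np.array([[0.5, 0.2], [0.1, 0.6]])
Omega = np.array([[1.0, 0.5], [0.5, 1.0]])          # innovation covariance
E = rng.normal(size=(T, 2)) @ np.linalg.cholesky(Omega).T
Z = np.zeros((T, 2))
for t in range(1, T):
    Z[t] = A @ Z[t - 1] + E[t]

yc = Z[1:, [0]]    # y_t as a column, shape (T-1, 1)
x = Z[1:, [1]]     # X_t, shape (T-1, 1)
Zlag = Z[:-1]      # Z_{t-1}, shape (T-1, 2)

def xcov(a, b):
    """Sample cross-covariance matrix between the columns of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    return a.T @ b / (len(a) - 1)

Sxx, Sxz, Szz = xcov(x, x), xcov(x, Zlag), xcov(Zlag, Zlag)
Sxy, Szy = xcov(x, yc), xcov(Zlag, yc)

# cov(X_t | Z_{t-1}) and cov(Z_{t-1} | X_t) computed as Schur complements
Sxx_z = Sxx - Sxz @ np.linalg.solve(Szz, Sxz.T)
Szz_x = Szz - Sxz.T @ np.linalg.solve(Sxx, Sxz)

beta0 = np.linalg.solve(Sxx_z, Sxy - Sxz @ np.linalg.solve(Szz, Szy))
alpha1 = np.linalg.solve(Szz_x, Szy - Sxz.T @ np.linalg.solve(Sxx, Sxy))

# Joint regression of y_t on X_t^* = (X_t, Z_{t-1}) reproduces (beta0, alpha1)
Xstar = np.column_stack([x, Zlag])
joint = np.linalg.solve(xcov(Xstar, Xstar), xcov(Xstar, yc))

# Static moment ratio cov(X_t)^{-1} cov(X_t, y_t), for comparison
beta_static = np.linalg.solve(Sxx, Sxy)

print("beta0 (partitioned):", beta0.ravel())
print("joint regression:   ", joint.ravel())
print("static ratio:       ", beta_static.ravel())
```

In this design $x_t$ is correlated with $Z_{t-1}$, so the coefficient on $x_t$ in the dynamic regression and the static moment ratio are estimates of different quantities.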

The important point to emphasize is that the statistical parameterization of $\beta_0$ is very different from that of $\beta$ in the context of the linear regression model since:

$$\beta_0 \neq \mathrm{cov}(X_t)^{-1}\mathrm{cov}(X_t, y_t),$$

unless $\mathrm{cov}(X_t, Z_{t-1}) = 0$, i.e. there is no temporal correlation between $X_t$ and $Z_{t-1}$. Given that the latter case is excluded by assumption, the only other case in which the two parameterizations coincide is when the appropriate model is (28.34). To understand what the latter model entails, let us consider how it can be nested within the dynamic linear regression model (28.35). Substituting out the error autocorrelation in the context of (28.34) yields:

$$y_t = \beta^\top x_t - \rho\beta^\top x_{t-1} + \rho y_{t-1} + \varepsilon_t, \quad |\rho| < 1, \quad t \in \mathbb{T},$$

which is a special case of the dynamic linear regression model (28.35):

$$y_t = \beta_0^\top x_t + \beta_1^\top x_{t-1} + \alpha_1 y_{t-1} + u_t, \quad t \in \mathbb{T},$$

when one imposes the so-called common factor restrictions (see Hendry and Mizon, 1978) $\beta_0 \alpha_1 + \beta_1 = 0$. As shown in Spanos (1987), these restrictions are highly unlikely to hold in empirical modeling because their validity presupposes that all the individual components of the vector process $\{Z_t, t \in \mathbb{T}\}$ are Granger noncausal and that their temporal structure is "almost identical". Under this PR scenario, when the vector process $\{Z_t, t \in \mathbb{T}\}$ is "temporally" dependent, the OLS estimator $\hat{\beta} = (X^\top X)^{-1} X^\top y$ is both biased and inconsistent!
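A small simulation can make the role of the common factor restrictions concrete. The sketch below is an illustration under assumptions that are not in the original text: $x_t$ is a scalar AR(1) process with coefficient 0.7 that is not influenced by past $y_t$, the sample size and parameter values are arbitrary, and the function names `simulate` and `fit` are hypothetical. Case A generates data that satisfy the restriction (an AR(1) error model with $\rho = 0.6$, $\beta = 1$); case B keeps $\beta_0$ and $\alpha_1$ but violates the restriction.

```python
import numpy as np

def simulate(beta0, beta1, alpha1, T, rng, phi=0.7):
    """Simulate y_t = beta0*x_t + beta1*x_{t-1} + alpha1*y_{t-1} + u_t, with AR(1) x_t."""
    x = np.zeros(T)
    y = np.zeros(T)
    for t in range(1, T):
        x[t] = phi * x[t - 1] + rng.normal()
        y[t] = beta0 * x[t] + beta1 * x[t - 1] + alpha1 * y[t - 1] + rng.normal()
    return y, x

def fit(y, x):
    """Return (static OLS slope on x_t, dynamic OLS coefficients [b0, b1, a1])."""
    static, *_ = np.linalg.lstsq(x[1:, None], y[1:], rcond=None)
    W = np.column_stack([x[1:], x[:-1], y[:-1]])   # x_t, x_{t-1}, y_{t-1}
    dynamic, *_ = np.linalg.lstsq(W, y[1:], rcond=None)
    return static[0], dynamic

rng = np.random.default_rng(3)
T = 20000
rho, beta = 0.6, 1.0

# Case A: common factor restriction holds (beta1 = -rho*beta, alpha1 = rho)
yA, xA = simulate(beta, -rho * beta, rho, T, rng)
static_A, (b0, b1, a1) = fit(yA, xA)
restriction_A = b0 * a1 + b1

# Case B: restriction violated (beta1 = 0.5 instead of -0.6)
yB, xB = simulate(beta, 0.5, rho, T, rng)
static_B, (c0, c1, d1) = fit(yB, xB)
restriction_B = c0 * d1 + c1

print(f"A: static slope {static_A:.3f}, dynamic b0 {b0:.3f}, b0*a1 + b1 = {restriction_A:.3f}")
print(f"B: static slope {static_B:.3f}, dynamic b0 {c0:.3f}, b0*a1 + b1 = {restriction_B:.3f}")
```

In case A the static slope and the dynamic coefficient on $x_t$ should both settle near the same value, whereas in case B the static slope drifts away from $\beta_0$ even in a large sample, which is the inconsistency described above.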