# Prediction

How is prediction affected by nonspherical disturbances? Suppose we want to predict one period ahead. What has changed with a general $\Omega$? For one thing, we now know that the period $(T+1)$ disturbance is correlated with the sample disturbances. Let us assume that this correlation is given by the $(T \times 1)$ vector $\omega = E(u_{T+1}u)$. Problem 5 shows that the BLUP for $y_{T+1}$ is

$$\hat{y}_{T+1} = x'_{T+1}\hat{\beta}_{GLS} + \omega'\Omega^{-1}(y - X\hat{\beta}_{GLS})/\sigma^2 \tag{9.18}$$

The first term is as expected; it is the second term that highlights the difference between the spherical and nonspherical model predictions. To illustrate this, consider the AR(1) case where $\text{cov}(u_t, u_{t-s}) = \rho^s\sigma_u^2$. This implies that $\omega' = \sigma_u^2(\rho^T, \rho^{T-1}, \ldots, \rho)$. Using $\Omega$, which is given in (9.9), one can show that $\omega$ is equal to $\rho\sigma_u^2$ multiplied by the last column of $\Omega$. But $\Omega^{-1}\Omega = I_T$; therefore, $\Omega^{-1}$ times the last column of $\Omega$ gives the last column of the identity matrix, i.e., $(0, 0, \ldots, 1)'$. Substituting for the last column of $\Omega$ its expression $\omega/\rho\sigma_u^2$, one gets $\Omega^{-1}(\omega/\rho\sigma_u^2) = (0, 0, \ldots, 1)'$. Transposing and rearranging this last expression, and using the symmetry of $\Omega$, we get $\omega'\Omega^{-1}/\sigma_u^2 = \rho(0, 0, \ldots, 1)$. This means that the last term in (9.18) is equal to $\rho(0, 0, \ldots, 1)(y - X\hat{\beta}_{GLS}) = \rho e_{T,GLS}$, where $e_{T,GLS}$ is the $T$-th GLS residual.

This differs from the spherical model prediction in that next period's disturbance is not independent of the sample disturbances and hence is not predicted by its mean, which is zero. Instead, one uses the fact that $u_{T+1} = \rho u_T + \epsilon_{T+1}$ and predicts $u_{T+1}$ by $\rho e_{T,GLS}$. Only $\epsilon_{T+1}$ is predicted by its zero mean, while $u_T$ is predicted by $e_{T,GLS}$.
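The AR(1) argument can be checked numerically. The following is a minimal sketch, assuming NumPy and treating $\rho$ as known; the function names (`ar1_corr_matrix`, `blup_weights`, `gls_blup_ar1`) are hypothetical and chosen only for this illustration. It builds $\Omega$ with typical element $\rho^{|t-s|}$, computes $\omega'\Omega^{-1}/\sigma_u^2$, and forms the one-step-ahead predictor using the simplified correction $\rho e_{T,GLS}$.

```python
import numpy as np

def ar1_corr_matrix(rho, T):
    """T x T matrix Omega with (t, s) element rho**|t - s|."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def blup_weights(rho, T, sigma_u2=1.0):
    """Return omega' Omega^{-1} / sigma_u^2 for the AR(1) case."""
    Omega = ar1_corr_matrix(rho, T)
    # omega = sigma_u^2 * (rho^T, rho^(T-1), ..., rho)'
    omega = sigma_u2 * rho ** np.arange(T, 0, -1)
    # Omega is symmetric, so solving Omega z = omega gives (omega' Omega^{-1})'.
    return np.linalg.solve(Omega, omega) / sigma_u2

def gls_blup_ar1(X, y, x_next, rho):
    """GLS estimate and one-step-ahead BLUP when rho is known (an assumption)."""
    Omega = ar1_corr_matrix(rho, len(y))
    Oinv_X = np.linalg.solve(Omega, X)
    Oinv_y = np.linalg.solve(Omega, y)
    beta_gls = np.linalg.solve(X.T @ Oinv_X, X.T @ Oinv_y)
    resid = y - X @ beta_gls
    # Second term of (9.18) reduces to rho times the last GLS residual.
    return x_next @ beta_gls + rho * resid[-1], beta_gls

# The weights should be numerically close to rho * (0, ..., 0, 1).
weights = blup_weights(rho=0.6, T=5)
print(np.round(weights, 10))
```

Up to floating-point error, the printed weights put all their mass on the last sample residual, which is the content of the derivation above.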

# 9.5 Unknown $\Omega$

If $\Omega$ is unknown, the practice is to get a consistent estimate of $\Omega$, say $\hat{\Omega}$, and substitute it in $\hat{\beta}_{GLS}$. The resulting estimator is

$$\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}X'\hat{\Omega}^{-1}y \tag{9.19}$$

and is called a feasible GLS estimator of $\beta$. Once $\hat{\Omega}$ replaces $\Omega$, the Gauss-Markov Theorem no longer necessarily holds. In other words, $\hat{\beta}_{FGLS}$ is not BLUE, although it is still consistent.
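As an illustration of (9.19), here is a minimal two-step sketch for AR(1) disturbances, assuming NumPy. The particular estimator of $\rho$ (the first-order autocorrelation of the OLS residuals) is one common choice and an assumption of this example, not something prescribed by the text; the function names `simulate_ar1_regression` and `fgls_ar1` are hypothetical.

```python
import numpy as np

def simulate_ar1_regression(n, beta, rho, seed=0):
    """Simulate y = X beta + u with u_t = rho * u_{t-1} + eps_t (illustrative data)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal(size=n)
    u = np.empty(n)
    u[0] = eps[0] / np.sqrt(1.0 - rho**2)  # draw u_1 from the stationary distribution
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return X, X @ beta + u

def fgls_ar1(X, y):
    """Two-step feasible GLS using all n observations."""
    n, k = X.shape
    # Step 1: OLS residuals.
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta_ols
    # Step 2: estimate rho; dividing by the full sum of squares keeps |rho_hat| <= 1.
    rho_hat = (e[1:] @ e[:-1]) / (e @ e)
    # Step 3: build Omega_hat with typical element rho_hat**|t - s|.
    idx = np.arange(n)
    Omega_hat = rho_hat ** np.abs(idx[:, None] - idx[None, :])
    # Step 4: apply (9.19), solving linear systems instead of inverting Omega_hat.
    Oinv_X = np.linalg.solve(Omega_hat, X)
    Oinv_y = np.linalg.solve(Omega_hat, y)
    beta_fgls = np.linalg.solve(X.T @ Oinv_X, X.T @ Oinv_y)
    return beta_fgls, rho_hat

X, y = simulate_ar1_regression(n=500, beta=np.array([1.0, 2.0]), rho=0.7)
beta_fgls, rho_hat = fgls_ar1(X, y)
print(beta_fgls, rho_hat)
```

Because every observation is retained, this sketch is closer in spirit to a Prais-Winsten type procedure than to Cochrane-Orcutt, which drops the first observation.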

The finite sample properties of $\hat{\beta}_{FGLS}$ are in general difficult to derive. However, we have the following asymptotic results.

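Theorem 1: $\sqrt{n}(\hat{\beta}_{GLS} - \beta)$ and $\sqrt{n}(\hat{\beta}_{FGLS} - \beta)$ have the same asymptotic distribution $N(0, \sigma^2 Q^{-1})$, where $Q = \lim (X'\Omega^{-1}X)/n$ as $n \to \infty$, if (i) $\text{plim}\, X'(\hat{\Omega}^{-1} - \Omega^{-1})X/n = 0$ and (ii) $\text{plim}\, X'(\hat{\Omega}^{-1} - \Omega^{-1})u/n = 0$. A sufficient condition for this theorem to hold is that $\hat{\Omega}$ is a consistent estimator of $\Omega$ and $X$ has a satisfactory limiting behavior.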

Lemma 1: If in addition $\text{plim}\, u'(\hat{\Omega}^{-1} - \Omega^{-1})u/n = 0$, then $s^{*2} = e'_{GLS}\Omega^{-1}e_{GLS}/(n - K)$ and $\hat{s}^{*2} = e'_{FGLS}\hat{\Omega}^{-1}e_{FGLS}/(n - K)$ are both consistent for $\sigma^2$. This means that one can perform tests of hypotheses based on asymptotic arguments using $\hat{\beta}_{FGLS}$ and $\hat{s}^{*2}$ rather than $\hat{\beta}_{GLS}$ and $s^{*2}$, respectively. For a proof of Theorem 1 and Lemma 1, see Theil (1971), Schmidt (1976) or Judge et al. (1985).
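To make the use of $\hat{\beta}_{FGLS}$ and $\hat{s}^{*2}$ in testing concrete, the sketch below computes asymptotic standard errors and t-ratios from the approximate covariance matrix $\hat{s}^{*2}(X'\hat{\Omega}^{-1}X)^{-1}$ implied by Theorem 1. It assumes NumPy and that a consistent $\hat{\Omega}$ has already been obtained and is passed in as `Omega_hat`; the function name `fgls_t_stats` and the heteroskedastic example data are hypothetical.

```python
import numpy as np

def fgls_t_stats(X, y, Omega_hat, beta_null=None):
    """FGLS estimates with asymptotic standard errors and t-ratios.

    Omega_hat is assumed to be a symmetric, consistent estimate of Omega.
    """
    n, k = X.shape
    if beta_null is None:
        beta_null = np.zeros(k)
    Oinv_X = np.linalg.solve(Omega_hat, X)        # Omega_hat^{-1} X
    XtOiX = X.T @ Oinv_X                          # X' Omega_hat^{-1} X
    beta = np.linalg.solve(XtOiX, Oinv_X.T @ y)   # equation (9.19)
    resid = y - X @ beta
    # s*^2 = e' Omega_hat^{-1} e / (n - K), as in Lemma 1.
    s2 = resid @ np.linalg.solve(Omega_hat, resid) / (n - k)
    cov = s2 * np.linalg.inv(XtOiX)               # asymptotic covariance approximation
    se = np.sqrt(np.diag(cov))
    return beta, se, (beta - beta_null) / se

# Illustrative data: variance proportional to a known regressor z (an assumption).
rng = np.random.default_rng(1)
n = 200
z = rng.uniform(1.0, 4.0, size=n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + np.sqrt(z) * rng.normal(size=n)
beta_hat, se, t_ratios = fgls_t_stats(X, y, np.diag(z))
print(beta_hat, se, t_ratios)
```

The t-ratios are compared with standard normal critical values, since the justification is asymptotic rather than exact in finite samples.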

Monte Carlo evidence under heteroskedasticity or serial correlation suggests that there is a gain in performing feasible GLS rather than OLS in finite samples. However, we have also seen in Chapter 5 that performing a two-step Cochrane-Orcutt procedure is not necessarily better than OLS if the X's are trended. This says that feasible GLS omitting the first observation (in this case Cochrane-Orcutt) may not be better in finite samples than OLS using all the observations.