# Economic Forecasting: A Theoretical Framework

## 2.1 Optimal forecasts, feasible forecasts, and forecast errors

Let $y_t$ denote the scalar time series variable that the forecaster wishes to forecast, let $h$ denote the horizon of the forecast, and let $F_t$ denote the set of data used at time $t$ for making the forecast ($F_t$ is sometimes referred to as the information set available to the forecaster). If the forecaster has squared error loss, the point forecast $y_{t+h|t}$ is the function of $F_t$ that minimizes the expected squared forecast error, that is, $E[(y_{t+h} - y_{t+h|t})^2 \mid F_t]$. This expected loss is minimized when the forecast is the conditional expectation, $E(y_{t+h} \mid F_t)$. In general, this conditional expectation might be a time varying function of $F_t$. However, in this chapter we will assume that the data are drawn from a stationary distribution, that is, the distribution of $(y_s, \ldots, y_{s+T})$ does not depend on $s$ (although some mention of structural breaks will be made later); then $E(y_{t+h} \mid F_t)$ is a time invariant function of $F_t$.
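
To see why the conditional expectation is optimal under squared error loss, it may help to write out the standard argument. The sketch below uses two pieces of shorthand that are introduced here only for illustration: $f(F_t)$ for an arbitrary candidate forecast and $m_t = E(y_{t+h} \mid F_t)$. It assumes that $y_{t+h}$ has a finite second moment.

$$
\begin{aligned}
E[(y_{t+h} - f(F_t))^2 \mid F_t]
&= E[(y_{t+h} - m_t)^2 \mid F_t] + 2\,(m_t - f(F_t))\,E[y_{t+h} - m_t \mid F_t] + (m_t - f(F_t))^2 \\
&= E[(y_{t+h} - m_t)^2 \mid F_t] + (m_t - f(F_t))^2 .
\end{aligned}
$$

The cross term vanishes because $m_t - f(F_t)$ is a function of $F_t$ and $E[y_{t+h} - m_t \mid F_t] = 0$. The first remaining term does not depend on $f$, and the second is nonnegative and equals zero only when $f(F_t) = m_t$, so the expected loss is minimized by choosing the conditional expectation.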

In practice, $E(y_{t+h} \mid F_t)$ is unknown and is in general nonlinear. Forecasts are constructed by approximating this unknown conditional expectation by a parametric function. This parametric function, or model, is denoted $\mu_h(F_t, \theta)$, where $\theta$ is a parameter vector which is assumed to lie in the parameter space $\Theta$. The "best" value of this parameter, $\theta_0$, is the value that minimizes the mean squared approximation error, $E[\mu_h(F_t, \theta) - E(y_{t+h} \mid F_t)]^2$.

Because $\theta_0$ is unknown, it is typically estimated from historical data, and the estimate is denoted $\hat{\theta}$. To be concrete, suppose that $F_t$ consists of observations on $X_s$, $1 \le s \le T$, where $X_s$ is a vector time series (which typically includes $y_s$). Further suppose that only the first $p$ lags of $X_t$ are included in the forecast. Then $\theta$ could be estimated by least squares, that is, by solving,

$$
\min_{\theta} \sum_{t=p+1}^{T-h} \left[ y_{t+h} - \mu_h(F_t, \theta) \right]^2. \tag{27.1}
$$

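As a rough illustration of the least squares problem above, the following sketch fits a direct $h$-step ahead regression in the simplest case: a scalar series, a single lag, and a linear model $a + b\,y_t$ standing in for the parametric forecasting function. The AR(1) data generating process, the function names (`simulate_ar1`, `fit_direct`, `sse`), and the parameter values are assumptions made for this example only, not part of the chapter.

```python
import random


def simulate_ar1(n, phi, seed=0, burn_in=200):
    """Simulate a zero-mean AR(1): y_t = phi * y_{t-1} + eps_t, eps_t ~ N(0, 1).

    This data generating process is an assumption made for illustration.
    """
    rng = random.Random(seed)
    y, path = 0.0, []
    for i in range(n + burn_in):
        y = phi * y + rng.gauss(0.0, 1.0)
        if i >= burn_in:
            path.append(y)
    return path


def fit_direct(y, h):
    """Least squares fit of y_{t+h} = a + b * y_t + error.

    Returns (a_hat, b_hat), the minimizer of the sum of squared
    h-step ahead errors over the available sample.
    """
    x = y[:-h]  # regressors y_t
    z = y[h:]   # targets y_{t+h}
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    b_hat = sxz / sxx
    a_hat = z_bar - b_hat * x_bar
    return a_hat, b_hat


def sse(y, h, a, b):
    """Sum of squared h-step ahead errors for the linear model a + b * y_t."""
    return sum((zi - a - b * xi) ** 2 for xi, zi in zip(y[:-h], y[h:]))


y = simulate_ar1(n=500, phi=0.8, seed=1)
h = 4
a_hat, b_hat = fit_direct(y, h)

# Point forecast of y_{T+h} made at the end of the sample.
forecast = a_hat + b_hat * y[-1]
print(f"a_hat={a_hat:.3f}, b_hat={b_hat:.3f}, forecast={forecast:.3f}")
```

With a nonlinear forecasting function the closed-form solution in `fit_direct` would be replaced by a numerical minimizer of `sse`, but the objective being minimized has the same form.
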
There are alternative methods for the estimation of $\theta$. The minimization problem (27.1) uses an $h$-step ahead (nonlinear) least squares regression. Often an available alternative is to estimate a one-step ahead model ($h = 1$) by nonlinear least squares or maximum likelihood, and to iterate that model forward. The formulation (27.1) has the advantage of computational simplicity, especially for nonlinear models. Depending on the true model and the approximate model, approximation bias can be reduced by estimating the $h$-step ahead model (27.1). On the other hand, if the estimated model is correct, then iterating one-step ahead forecasts will be more efficient in the statistical sense. In general, the decision of whether to estimate parameters by $h$-step ahead or one-step ahead methods depends on the model being estimated and the type of misspecification that might be present. See Clements and Hendry (1996) for references to this literature and for simulation results comparing the two approaches.
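
The two estimation strategies can be compared in a small sketch. It assumes a zero-mean AR(1) process, so that a correctly specified one-step model exists and the iterated $h$-step coefficient is simply the one-step coefficient raised to the power $h$. The data generating process, the helper names, and the parameter values are hypothetical choices for this example.

```python
import random


def simulate_ar1(n, phi, seed=0, burn_in=200):
    """Simulate a zero-mean AR(1) with standard normal innovations (assumed DGP)."""
    rng = random.Random(seed)
    y, path = 0.0, []
    for i in range(n + burn_in):
        y = phi * y + rng.gauss(0.0, 1.0)
        if i >= burn_in:
            path.append(y)
    return path


def slope_through_origin(x, z):
    """Least squares slope of z on x with no intercept."""
    return sum(xi * zi for xi, zi in zip(x, z)) / sum(xi * xi for xi in x)


phi, h = 0.8, 4
y = simulate_ar1(n=2000, phi=phi, seed=7)

# Direct approach: regress y_{t+h} on y_t.
b_direct = slope_through_origin(y[:-h], y[h:])

# Iterated approach: estimate the one-step coefficient, then iterate it
# forward h times, which for an AR(1) means raising it to the power h.
b_one_step = slope_through_origin(y[:-1], y[1:])
b_iterated = b_one_step ** h

forecast_direct = b_direct * y[-1]
forecast_iterated = b_iterated * y[-1]

print(f"implied h-step coefficient under the assumed DGP: {phi ** h:.4f}")
print(f"direct estimate:   {b_direct:.4f}")
print(f"iterated estimate: {b_iterated:.4f}")
```

Because the one-step model is correctly specified here, both estimates target the same quantity, $\phi^h$. Repeating the exercise over many seeds would be one way to examine the efficiency comparison described above. If the one-step model were misspecified, the two approaches would generally target different values.
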

It is useful to consider a decomposition of the forecast error, based on the various sources of that error. Let $e_{t+h,t}$ denote the forecast error from the $h$-step ahead forecast of $y_{t+h}$ using $y_{t+h|t}$. Then,

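$$
\begin{aligned}
e_{t+h,t} &= y_{t+h} - y_{t+h|t} \\
&= [y_{t+h} - E(y_{t+h} \mid F_t)] + [E(y_{t+h} \mid F_t) - \mu_h(F_t, \theta_0)] + [\mu_h(F_t, \theta_0) - \mu_h(F_t, \hat{\theta})].
\end{aligned}
\tag{27.2}
$$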

The first term in brackets is the deviation of $y_{t+h}$ from its conditional expectation, a source of forecast error that cannot be eliminated. The second term in brackets is the contribution of model misspecification, and is the error arising from using the best parameter value for the approximate conditional expectations function. The final term arises because this best parameter value is unknown, and instead $\theta$ is estimated from the data.
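
The three terms can be computed explicitly in a simulation where the true conditional expectation is known. The sketch below assumes a hypothetical nonlinear process with one-step conditional mean $1.5\sin(y_t)$ and a deliberately misspecified linear forecasting model $\theta y_t$. The best parameter value is not available in closed form here, so it is approximated by least squares on a very long simulated sample. All names and values are illustrative assumptions.

```python
import math
import random


def cond_mean(y_prev):
    """True one-step conditional expectation E(y_{t+1} | y_t) (assumed DGP)."""
    return 1.5 * math.sin(y_prev)


def simulate(n, rng, burn_in=200):
    """Simulate y_{t+1} = cond_mean(y_t) + eps_{t+1}, eps ~ N(0, 1)."""
    y, path = 0.0, []
    for i in range(n + burn_in):
        y = cond_mean(y) + rng.gauss(0.0, 1.0)
        if i >= burn_in:
            path.append(y)
    return path


def slope_through_origin(x, z):
    """Least squares slope of z on x with no intercept."""
    return sum(xi * zi for xi, zi in zip(x, z)) / sum(xi * xi for xi in x)


rng = random.Random(3)

# Approximate the best parameter value of the linear model theta * y_t
# using a very long sample, so that estimation error is negligible.
long_sample = simulate(50_000, rng)
theta_0 = slope_through_origin(long_sample[:-1], long_sample[1:])

# Estimate theta from a short "historical" sample, as a forecaster would.
short_sample = simulate(100, rng)
theta_hat = slope_through_origin(short_sample[:-1], short_sample[1:])

# One-step ahead forecast and realized value (h = 1).
y_t = short_sample[-1]
y_next = cond_mean(y_t) + rng.gauss(0.0, 1.0)
forecast = theta_hat * y_t
error = y_next - forecast

# The three bracketed terms of the decomposition.
irreducible = y_next - cond_mean(y_t)
misspecification = cond_mean(y_t) - theta_0 * y_t
estimation = theta_0 * y_t - theta_hat * y_t

print(f"forecast error:   {error:.4f}")
print(f"irreducible:      {irreducible:.4f}")
print(f"misspecification: {misspecification:.4f}")
print(f"estimation:       {estimation:.4f}")
```

The three components sum to the forecast error by construction, since the decomposition only adds and subtracts the same intermediate quantities. Only the last two components depend on the forecaster's choice of model and estimation sample.
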

The decomposition (27.2) illustrates two facts. First, all forecasts, no matter how good, will have forecast error because of future, unknowable random events. Second, the quality of a forecasting method is therefore determined by its model approximation error and by its estimation error. These two sources of error generally entail a tradeoff. Using a flexible model with many parameters for $\mu_h$ can reduce model approximation error, but because there are many parameters, estimation error increases.
